explorica.reports.presets

explorica.reports.presets.data_overview

Short data overview presets for Explorica reports.

This module provides utilities to quickly build a short overview of a dataset, including basic statistics, data shape, and a brief data quality summary. It is intended for use with the Explorica reports framework.

Functions

get_data_overview_blocks(data, round_digits=4, nan_policy='drop')

Build blocks for a short data overview. Returns a list of Block instances containing basic statistics, data shape, and data quality overview.

get_data_overview_report(data, round_digits=4, nan_policy='drop')

Generate a short data overview report. Returns a Report instance composed of the blocks returned by get_data_overview_blocks.

Notes

  • These functions are designed for quick, high-level EDA and should not be used as a replacement for full data quality or interactions analysis.

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_data_overview_report
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': ['x', 'y', 'z']})
>>> report = get_data_overview_report(df)
>>> report.title
'Data overview'
>>> len(report.blocks)
3
explorica.reports.presets.data_overview.get_data_overview_blocks(data: Sequence[Any] | Mapping[str, Sequence[Any]], round_digits: int = 4, nan_policy: str | Literal['drop', 'raise', 'include'] = 'drop') -> list[Block]

Build blocks for a short data overview.

This function constructs a sequence of Explorica Block objects that together provide a concise exploratory overview of a dataset. The resulting blocks can be combined with other blocks before rendering.

The overview includes:

  • basic descriptive statistics,

  • dataset shape and data types,

  • a brief data quality summary.

Parameters:
data : Sequence[Any] or Mapping[str, Sequence[Any]]

Input dataset. Can be any structure convertible to a pandas DataFrame, such as a dictionary of sequences or a sequence of records.

round_digits : int, default=4

Number of decimal digits to use when rounding numerical statistics.

nan_policy : {'drop', 'raise', 'include'}, default='drop'

Policy for handling missing values in the input data.

  • ‘drop’ : rows containing missing values are removed before analysis.

  • ‘raise’ : an error is raised if missing values are present.

  • ‘include’ : missing values are preserved where supported.

Note that not all child blocks support nan_policy='include'. In such cases, the policy is internally downgraded to 'drop' for those blocks.
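The three policies described above can be illustrated with a minimal sketch on a plain dictionary of columns. This is not Explorica's implementation: the helper names `is_missing` and `apply_nan_policy` are hypothetical, and treating both None and float NaN as missing is an assumption.

```python
import math


def is_missing(value):
    """Assumption: None and float NaN both count as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def apply_nan_policy(columns, nan_policy="drop"):
    """Apply a row-wise missing-value policy to a dict of equal-length lists."""
    n_rows = len(next(iter(columns.values()), []))
    # Indices of rows that contain at least one missing value.
    bad_rows = {
        i for i in range(n_rows)
        if any(is_missing(col[i]) for col in columns.values())
    }
    if nan_policy == "raise":
        if bad_rows:
            raise ValueError(f"missing values found in rows {sorted(bad_rows)}")
        return {name: list(col) for name, col in columns.items()}
    if nan_policy == "include":
        # Keep everything; downstream code must cope with missing values.
        return {name: list(col) for name, col in columns.items()}
    if nan_policy == "drop":
        # Remove every row that has a missing value in any column.
        return {
            name: [v for i, v in enumerate(col) if i not in bad_rows]
            for name, col in columns.items()
        }
    raise ValueError(f"unknown nan_policy: {nan_policy!r}")


data = {"a": [1.0, float("nan"), 3.0], "b": ["x", "y", None]}
dropped = apply_nan_policy(data, "drop")     # only the first row survives
kept = apply_nan_policy(data, "include")     # all three rows are preserved
```

With 'drop', a single missing cell removes the whole row, which is why the policy can shrink the dataset noticeably when missing values are spread across many columns.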

Returns:
list of Block

A list of Explorica blocks representing a short data overview.

Notes

  • This function does not return a Report instance and does not perform any rendering. It is intended for compositional use, allowing users to merge the returned blocks with other presets before building a final report.

  • The data shape block is always computed with missing values included (nan_policy='include'), regardless of the nan_policy argument, so that structural metrics reflect the full dataset. The other blocks respect the provided nan_policy.

  • During the construction of EDA or interaction reports, many matplotlib figures may be opened (one per plot or table visualization). This is expected behavior when the dataset contains many features.

  • To prevent runtime warnings about too many open figures, these warnings are ignored internally.

  • To free memory after rendering, it is recommended to explicitly close figures:

    report = get_eda_report(df)
    report.render()
    report.close_figures()
    

    Or for individual blocks:

    block.close_figures()
    

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_data_overview_blocks
>>> # Simple usage
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1],
... })
>>> blocks = get_data_overview_blocks(df)
>>> len(blocks)
3
>>> blocks[0].block_config.title
'Basic statistics for the dataset'
explorica.reports.presets.data_overview.get_data_overview_report(data: Sequence[Any] | Mapping[str, Sequence[Any]], round_digits: int = 4, nan_policy: str | Literal['drop', 'raise', 'include'] = 'drop') -> list[Block] | Report

Generate a short data overview report.

This function creates a complete Explorica Report that provides a concise exploratory overview of a dataset. Internally, it builds the same blocks as get_data_overview_blocks and wraps them into a report ready for rendering (e.g., to HTML or PDF).

Parameters:
data : Sequence[Any] or Mapping[str, Sequence[Any]]

Input dataset. Can be any structure convertible to a pandas DataFrame, such as a dictionary of sequences or a sequence of records.

round_digits : int, default=4

Number of decimal digits to use when rounding numerical statistics.

nan_policy : {'drop', 'raise', 'include'}, default='drop'

Policy for handling missing values in the input data.

  • ‘drop’ : rows containing missing values are removed before analysis.

  • ‘raise’ : an error is raised if missing values are present.

  • ‘include’ : missing values are preserved where supported.

Note that not all child blocks support nan_policy='include'. In such cases, the policy is internally downgraded to 'drop' for those blocks.

Returns:
Report

An Explorica report containing:

  • basic descriptive statistics,

  • dataset shape and data types,

  • a brief data quality summary.

Notes

  • This is a convenience wrapper around get_data_overview_blocks. Use get_data_overview_blocks instead if you need fine-grained control over block composition before report creation.

  • During the construction of EDA or interaction reports, many matplotlib figures may be opened (one per plot or table visualization). This is expected behavior when the dataset contains many features.

  • To prevent runtime warnings about too many open figures, these warnings are ignored internally.

  • To free memory after rendering, it is recommended to explicitly close figures:

    report = get_eda_report(df)
    report.render()
    report.close_figures()
    

    Or for individual blocks:

    block.close_figures()
    

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_data_overview_report
>>> # Simple usage
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1],
... })
>>> report = get_data_overview_report(df)
>>> report.title
'Data overview'
>>> len(report.blocks)
3
>>> report.close_figures()

explorica.reports.presets.data_quality

Data quality presets for Explorica reports.

Provides high-level preset builders for composing comprehensive data quality reports in Explorica. These presets orchestrate multiple specialized blocks into a single, ordered analysis pipeline suitable for exploratory data analysis (EDA).

Functions

get_data_quality_blocks(data, round_digits=4, nan_policy='drop_with_split')

Build blocks for a detailed data quality analysis.

get_data_quality_report(data, round_digits=4, nan_policy='drop_with_split')

Generate a detailed data quality analysis report.

Notes

  • This module acts as an orchestration layer and does not perform computations directly.

  • Each block included in the report is responsible for a distinct data quality dimension (e.g. cardinality, distributions, outliers).

  • The order of blocks is intentional and reflects a typical EDA flow, from coarse feature screening to more detailed statistical inspection.

  • All blocks share common formatting and rounding conventions via the round_digits parameter.

Examples

>>> import pandas as pd
>>> from explorica.reports.presets.data_quality import (
...     get_data_quality_blocks,
...     get_data_quality_report
... )
>>> df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 2, 1]})
>>> blocks = get_data_quality_blocks(df)
>>> len(blocks) > 0
True
>>> report = get_data_quality_report(df)
>>> report.title
'Data quality'
explorica.reports.presets.data_quality.get_data_quality_blocks(data: Sequence[Any] | Mapping[str, Sequence[Any]], round_digits: int = 4, nan_policy: str | Literal['drop_with_split', 'raise', 'include'] = 'drop_with_split') -> list[Block]

Build a set of blocks providing a detailed data quality analysis.

This function orchestrates multiple data quality blocks, including cardinality, distributions, and outliers, into a coherent sequence that can be used to assemble a full data quality report.

Parameters:
data : Sequence[Any] or Mapping[str, Sequence[Any]]

Input dataset. Can be any structure convertible to a pandas DataFrame, such as a dictionary of sequences or a sequence of records.

round_digits : int, default=4

Number of decimal digits to use when rounding numerical statistics.

nan_policy : {'drop_with_split', 'raise', 'include'}, default='drop_with_split'

Policy for handling missing values in the input data.

  • ‘drop_with_split’ : Missing values are handled independently for each feature. For every column, NaNs are dropped column-wise before computing statistics. As a result, different features may be evaluated on different numbers of observations. This behavior is semantically correct in an EDA context, where preserving per-feature statistics is preferred over strict row-wise alignment.

  • ‘raise’ : an error is raised if missing values are present.

  • ‘include’ : missing values are preserved where supported.

Note that not all child blocks support nan_policy='include'. In such cases, the policy is internally downgraded to 'drop' for those blocks.
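The column-wise behavior of 'drop_with_split' can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper `column_means_drop_with_split` is hypothetical, and the mean stands in for whatever statistic a block computes.

```python
import math
import statistics


def is_missing(value):
    """Assumption: None and float NaN both count as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def column_means_drop_with_split(columns):
    """Compute a per-column mean after dropping missing values column by column."""
    result = {}
    for name, values in columns.items():
        present = [v for v in values if not is_missing(v)]
        result[name] = {
            "n": len(present),  # observations actually used for this column
            "mean": statistics.fmean(present) if present else None,
        }
    return result


data = {
    "x1": [1.0, 2.0, float("nan"), 4.0],
    "x2": [10.0, float("nan"), float("nan"), 40.0],
}
summary = column_means_drop_with_split(data)
# x1 is evaluated on 3 observations and x2 on 2. A row-wise drop would
# instead keep only the 2 fully observed rows for both columns.
```

The per-column observation counts differ, which is the trade-off the policy description names: more data per feature at the cost of row-wise alignment.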

Returns:
list[Block]

A list of Block instances for data quality analysis.

Notes

  • During the construction of EDA or interaction reports, many matplotlib figures may be opened (one per plot or table visualization). This is expected behavior when the dataset contains many features.

  • To prevent runtime warnings about too many open figures, these warnings are ignored internally.

  • To free memory after rendering, it is recommended to explicitly close figures:

    report = get_eda_report(df)
    report.render()
    report.close_figures()
    

    Or for individual blocks:

    block.close_figures()
    

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_data_quality_blocks
>>> # Simple usage
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1],
... })
>>> blocks = get_data_quality_blocks(df)
>>> len(blocks)
3
explorica.reports.presets.data_quality.get_data_quality_report(data: Sequence[Any] | Mapping[str, Sequence[Any]], round_digits: int = 4, nan_policy: str | Literal['drop_with_split', 'raise', 'include'] = 'drop_with_split') -> Report

Generate a full data quality report from multiple quality blocks.

This function orchestrates the assembly of a complete Report containing cardinality, distributions, and outlier analysis blocks. It provides an overview of the dataset’s quality characteristics in a structured format.
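For orientation, the outlier block is described in this entry as reporting IQR and Z-score counts. A minimal sketch of both rules using only the standard library is shown here. The fence multiplier `k=1.5`, the Z-score threshold, the quantile method, and the use of the population standard deviation are assumptions, and the function names are hypothetical.

```python
import statistics


def iqr_outlier_count(values, k=1.5):
    """Count values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < low or v > high)


def zscore_outlier_count(values, threshold=3.0):
    """Count values whose absolute Z-score exceeds the threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return 0  # constant column: no value deviates from the mean
    return sum(1 for v in values if abs(v - mean) / std > threshold)


values = [10, 11, 12, 13, 14, 15, 16, 17, 18, 100]
n_iqr = iqr_outlier_count(values)
n_z3 = zscore_outlier_count(values)                 # default threshold of 3
n_z2 = zscore_outlier_count(values, threshold=2.0)  # looser threshold
```

On this small sample the IQR rule flags the value 100, while the Z-score rule at threshold 3 does not, because a single extreme value inflates the standard deviation. The two counts can therefore legitimately disagree.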

Parameters:
data : Sequence[Any] or Mapping[str, Sequence[Any]]

Input dataset. Can be any structure convertible to a pandas DataFrame, such as a dictionary of sequences or a sequence of records.

round_digits : int, default=4

Number of decimal digits to use when rounding numerical statistics.

nan_policy : {'drop_with_split', 'raise', 'include'}, default='drop_with_split'

Policy for handling missing values in the input data.

  • ‘drop_with_split’ : Missing values are handled independently for each feature. For every column, NaNs are dropped column-wise before computing statistics. As a result, different features may be evaluated on different numbers of observations. This behavior is semantically correct in an EDA context, where preserving per-feature statistics is preferred over strict row-wise alignment.

  • ‘raise’ : an error is raised if missing values are present.

  • ‘include’ : missing values are preserved where supported.

Note that not all child blocks support nan_policy='include'. In such cases, the policy is internally downgraded to 'drop' for those blocks.

Returns:
Report

An Explorica report containing:

  • outlier summary (IQR and Z-score counts, zero/near-zero variance features),

  • distribution characteristics (skewness, kurtosis, normality flag, boxplots, histograms),

  • cardinality metrics (unique values, top value ratio, entropy, is_constant/is_unique flags).
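The cardinality metrics listed above can be sketched for a single column as follows. This is a hedged illustration: the function name `cardinality_metrics` and the dictionary keys are hypothetical, and the base-2 logarithm for entropy is an assumption (the library may use a different base).

```python
import math
from collections import Counter


def cardinality_metrics(values):
    """Compute simple cardinality metrics for one column of hashable values."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    n = len(values)
    # Shannon entropy of the empirical value distribution, in bits.
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {
        "n_unique": len(counts),
        "top_value_ratio": max(counts.values()) / n,
        "entropy": entropy,
        "is_constant": len(counts) == 1,  # a single distinct value
        "is_unique": len(counts) == n,    # every value occurs exactly once
    }


metrics = cardinality_metrics(["a", "b", "a", "b"])
constant = cardinality_metrics(["a", "a", "a"])
```

A constant column has zero entropy and a top value ratio of 1, while an identifier-like column has is_unique set and the maximum possible entropy for its length.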

Notes

  • This is a convenience wrapper around get_data_quality_blocks. Use get_data_quality_blocks instead if you need fine-grained control over block composition before report creation.

  • During the construction of EDA or interaction reports, many matplotlib figures may be opened (one per plot or table visualization). This is expected behavior when the dataset contains many features.

  • To prevent runtime warnings about too many open figures, these warnings are ignored internally.

  • To free memory after rendering, it is recommended to explicitly close figures:

    report = get_eda_report(df)
    report.render()
    report.close_figures()
    

    Or for individual blocks:

    block.close_figures()
    

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_data_quality_report
>>> # Simple usage
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1],
... })
>>> report = get_data_quality_report(df)
>>> report.title
'Data quality'
>>> len(report.blocks)
3
>>> report.close_figures()

explorica.reports.presets.eda

Exploratory Data Analysis (EDA) presets.

This module provides high-level orchestration functions for building Explorica exploratory data analysis (EDA) reports. It defines convenient entry points that assemble multiple lower-level analysis blocks into coherent EDA workflows.

The public API is designed for casual usage: users may specify feature groups and targets explicitly, or rely on heuristic inference based on data types and cardinality.

Functions

get_eda_blocks(data, numerical_names=None, categorical_names=None, target_name=None, **kwargs)

Build a full exploratory data analysis (EDA) report as a list of blocks.

get_eda_report(data, numerical_names=None, categorical_names=None, target_name=None, **kwargs)

Build a full exploratory data analysis (EDA) report.

Notes

  • Feature and target assignments can be provided explicitly or inferred automatically using heuristics.

  • Missing value handling is controlled via nan_policy and may be adapted internally for blocks that do not support all policies.

  • These functions do not perform statistical analysis directly; they orchestrate and compose lower-level preset blocks.

  • The API is intended as a stable, high-level entry point for EDA in Explorica.

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_eda_report
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1]
... })
>>> report = get_eda_report(df)
>>> report.title
'Exploratory Data Analysis Report'
explorica.reports.presets.eda.get_eda_blocks(data: DataFrame | Mapping[Hashable, Sequence], numerical_names: list[Hashable] = None, categorical_names: list[Hashable] = None, target_name: Hashable = None, **kwargs) -> list[Block]

Build a full exploratory data analysis (EDA) report as a list of blocks.

This function orchestrates the creation of multiple EDA-related blocks, including data overview, data quality, and feature interactions. It handles feature assignment, target detection, missing value policy, and automatic block composition.

Parameters:
data : pandas.DataFrame or Mapping[Hashable, Sequence]

Input dataset. Can be a DataFrame or a mapping (e.g., dict of lists) convertible to a DataFrame.

numerical_names : list[Hashable], optional

Explicit names of numerical feature columns. If not provided, numerical features are inferred heuristically.

categorical_names : list[Hashable], optional

Explicit names of categorical feature columns. If not provided, categorical features are inferred heuristically using categorical_threshold.

target_name : Hashable, optional

Name of the target column. If provided, heuristics determine whether it is a numerical or categorical target based on its type and number of unique values.

Returns:
list[Block]

A list of Block objects representing the EDA report. Blocks included:

  • Data overview blocks: basic statistics, dataset shape and data types, a brief data quality summary.

  • Data quality blocks: cardinality, distributions, outliers.

  • Feature interactions blocks: linear and non-linear interactions (correlations, η², Cramer’s V). Only non-empty interaction blocks are included.

Other Parameters:
target_numerical_name : Hashable, optional

Explicit numerical target name. Has priority over target_name.

target_categorical_name : Hashable, optional

Explicit categorical target name. Has priority over target_name.

round_digits : int, default=4

Number of decimal places for rounding statistics in tables and plots.

categorical_threshold : int, default=30

Maximum number of unique values for a column to be considered categorical when inferred automatically.

nan_policy : {'drop', 'raise', 'include'}, default='drop'

Policy for handling missing values:

  • ‘drop’ : remove rows containing NaNs.

  • ‘raise’: raise an error if missing values are present.

  • ‘include’: keep NaNs where supported; for blocks that do not support ‘include’, this defaults to ‘drop’.
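The role of categorical_threshold in heuristic inference can be sketched as below. The exact rules Explorica applies are not stated in this reference, so this is an assumption-laden illustration: the function `infer_feature_groups` is hypothetical, numeric columns are detected by value type, and a column may land in both groups.

```python
from numbers import Number


def infer_feature_groups(columns, categorical_threshold=30):
    """Split column names into numerical and categorical groups heuristically."""
    numerical, categorical = [], []
    for name, values in columns.items():
        # Assumption: a column is numerical if every value is a non-bool number.
        is_numeric = all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        if is_numeric:
            numerical.append(name)
        # Low-cardinality columns are treated as categorical, numeric or not.
        if len(set(values)) <= categorical_threshold:
            categorical.append(name)
    return numerical, categorical


data = {
    "x1": [1, 2, 3, 4],
    "x2": [10, 20, 30, 40],
    "c1": ["a", "b", "a", "b"],
    "y": [0, 1, 0, 1],
}
numerical, categorical = infer_feature_groups(data, categorical_threshold=2)
```

With a threshold of 2, the binary column y qualifies as both numerical and categorical, while x1 and x2 (four unique values each) stay numerical only. On a four-row dataset the default of 30 would mark every column as categorical, which is one reason to pass explicit names for small samples.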

Notes

  • User-specified feature and target assignments take precedence over heuristic inference.

  • If numerical_names or categorical_names are not provided, they will be inferred automatically from the data.

  • If target_name is provided, the function may assign it as both numerical_target and categorical_target based on type and cardinality.

  • nan_policy='include' is only supported in blocks that allow missing values; for other blocks, it is automatically converted to 'drop'.

  • This function is intended as a high-level entry point for casual API users. It does not perform analysis itself, but assembles lower-level blocks.

  • During the construction of EDA or interaction reports, many matplotlib figures may be opened (one per plot or table visualization). This is expected behavior when the dataset contains many features.

  • To prevent runtime warnings about too many open figures, these warnings are ignored internally.

  • To free memory after rendering, it is recommended to explicitly close figures:

    report = get_eda_report(df)
    report.render()
    report.close_figures()
    

    Or for individual blocks:

    block.close_figures()
    

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_eda_blocks
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1]
... })
>>> blocks = get_eda_blocks(df)
>>> len(blocks)
11
>>> blocks[0].block_config.title
'Data Overview'
>>> blocks[5].block_config.title
'Outliers'
>>> blocks[-1].block_config.title  # last block may vary depending on data
'Non-linear relations'
explorica.reports.presets.eda.get_eda_report(data: DataFrame | Mapping[Hashable, Sequence], numerical_names: list[Hashable] = None, categorical_names: list[Hashable] = None, target_name: Hashable = None, **kwargs) -> Report

Build a full exploratory data analysis (EDA) report.

This function is a thin wrapper around get_eda_blocks(). It constructs a high-level Explorica Report by assembling EDA-related blocks (data overview, data quality, and feature interactions) and assigning a predefined report title.

All feature selection, target inference, missing value handling, and block composition logic is delegated to get_eda_blocks().

Parameters:
data : pandas.DataFrame or Mapping[Hashable, Sequence]

Input dataset. Can be a DataFrame or any mapping convertible to a DataFrame (e.g., a dictionary of columns).

numerical_names : list[Hashable], optional

Explicit names of numerical feature columns. If not provided, numerical features are inferred heuristically.

categorical_names : list[Hashable], optional

Explicit names of categorical feature columns. If not provided, categorical features are inferred heuristically using categorical_threshold.

target_name : Hashable, optional

Name of the target column. If provided, heuristics are used to determine whether it should be treated as a numerical or categorical target.

Returns:
Report

An Explorica Report titled "Exploratory Data Analysis Report", containing zero or more EDA blocks.

Only non-empty blocks are included. If no blocks can be constructed from the provided data and assignments, the report may be empty.

Other Parameters:
target_numerical_name : Hashable, optional

Explicit numerical target name. Has priority over target_name.

target_categorical_name : Hashable, optional

Explicit categorical target name. Has priority over target_name.

round_digits : int, default=4

Number of decimal places for rounding numerical statistics in all blocks.

categorical_threshold : int, default=30

Maximum number of unique values for a column to be considered categorical when inferred automatically.

nan_policy : {'drop', 'raise', 'include'}, default='drop'

Policy for handling missing values:

  • ‘drop’ : remove rows containing NaNs.

  • ‘raise’: raise an error if missing values are present.

  • ‘include’: keep NaNs where supported; for blocks that do not support ‘include’, this policy is internally converted to ‘drop’.
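The per-block conversion of 'include' to 'drop' can be pictured as a small resolution step run once per child block. The sketch below is hypothetical: `resolve_nan_policy` and the block names in `block_support` are illustrative and do not come from Explorica.

```python
def resolve_nan_policy(requested, supported, fallback="drop"):
    """Return the policy a block should actually use."""
    if requested in supported:
        return requested
    # Only 'include' is downgraded; any other unsupported value is an error.
    if requested == "include" and fallback in supported:
        return fallback
    raise ValueError(f"unsupported nan_policy: {requested!r}")


# Hypothetical capability table: which policies each block understands.
block_support = {
    "data_shape": {"drop", "raise", "include"},
    "linear_relations": {"drop", "raise"},
}
resolved = {
    name: resolve_nan_policy("include", supported)
    for name, supported in block_support.items()
}
```

Under this scheme a single call with nan_policy='include' can yield blocks computed on different subsets of rows, so figures in different blocks are not guaranteed to share the same sample size.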

See also

get_eda_blocks

Constructs the individual EDA blocks used in the report.

Report

Container object used to assemble and render blocks.

Notes

  • This function does not perform analysis itself; it only wraps get_eda_blocks() into a report object.

  • User-specified feature and target assignments always take precedence over heuristic inference.

  • The behavior and contents of the report are entirely determined by get_eda_blocks().

  • During the construction of EDA or interaction reports, many matplotlib figures may be opened (one per plot or table visualization). This is expected behavior when the dataset contains many features.

  • To prevent runtime warnings about too many open figures, these warnings are ignored internally.

  • To free memory after rendering, it is recommended to explicitly close figures:

    report = get_eda_report(df)
    report.render()
    report.close_figures()
    

    Or for individual blocks:

    block.close_figures()
    

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_eda_report
>>> from explorica.reports.core.block import Block
>>> # Simple usage
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1],
... })
>>> # Automatic feature and target inference
>>> report = get_eda_report(df, target_name="y")
>>> len(report.blocks) > 0
True
>>> # Explicit feature assignment
>>> report = get_eda_report(
...     df,
...     numerical_names=["x1", "x2"],
...     categorical_names=["c1"],
...     target_name="y",
... )
>>> report.title
'Exploratory Data Analysis Report'
>>> # Explicit target specification via kwargs
>>> report = get_eda_report(
...     df,
...     numerical_names=["x1", "x2"],
...     categorical_names=["c1"],
...     target_numerical_name="y",
... )
>>> len(report.blocks)
11
>>> all([isinstance(eda_block, Block) for eda_block in report.blocks])
True
>>> report.close_figures()

explorica.reports.presets.interactions

Interactions presets for Explorica reports.

This module provides high-level orchestration utilities for building interaction-focused Explorica reports. It does not implement statistical methods itself; instead, it coordinates feature assignment, heuristic inference, and composition of lower-level analytical blocks.

Functions

get_interactions_blocks(data, numerical_names=None, categorical_names=None, target_name=None, **kwargs)

Build linear and non-linear interaction blocks for Explorica reports.

get_interactions_report(data, numerical_names=None, categorical_names=None, target_name=None, **kwargs)

Generate an interaction analysis report.

Notes

  • This module is pandas-based and expects tabular, column-addressable input data (DataFrame or mapping of column names to sequences).

  • User-provided feature and target names always take precedence over heuristic feature and target inference.

  • Only non-empty blocks are included in the final report.

  • An empty result indicates insufficient information for interaction analysis rather than an execution error.

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_interactions_report
>>> # Simple usage
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1]
... })
>>> report = get_interactions_report(df)
>>> report.title
'Interaction analysis'
>>> report.close_figures()
explorica.reports.presets.interactions.get_interactions_blocks(data: DataFrame | Mapping[str, Sequence[Any]], numerical_names: list[Hashable] = None, categorical_names: list[Hashable] = None, target_name: Hashable = None, **kwargs) -> list[Block]

Generate linear and non-linear interaction blocks for Explorica reports.

This function orchestrates the creation of two main blocks:

  1. Linear relations block, summarizing correlations and multicollinearity diagnostics.

  2. Non-linear relations block, summarizing eta² (numerical-categorical) and Cramer’s V (categorical-categorical) dependencies.
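As background for the numerical-categorical measure named above, eta squared is the share of a numerical variable's variance explained by group membership (between-group sum of squares divided by total sum of squares). The sketch below uses only the standard library; the function name is hypothetical and the result for a constant variable is defined as 0.0 by assumption.

```python
import statistics
from collections import defaultdict


def eta_squared(categories, values):
    """Return SS_between / SS_total for values grouped by category."""
    if len(categories) != len(values):
        raise ValueError("categories and values must have the same length")
    groups = defaultdict(list)
    for category, value in zip(categories, values):
        groups[category].append(value)
    grand_mean = statistics.fmean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # assumption: a constant variable has no variance to explain
    ss_between = sum(
        len(group) * (statistics.fmean(group) - grand_mean) ** 2
        for group in groups.values()
    )
    return ss_between / ss_total


weak = eta_squared(["a", "b", "a", "b"], [1, 2, 3, 4])
strong = eta_squared(["a", "a", "b", "b"], [1, 1, 5, 5])
```

Values near 0 mean the categories barely separate the numerical variable, and values near 1 mean group membership determines it almost completely.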

Parameters:
data : pandas.DataFrame or Mapping[str, Sequence[Any]]

Input dataset containing features and optionally target columns.

numerical_names : list[Hashable], optional

Names of numerical feature columns. If not provided, numerical features are inferred from column dtypes.

categorical_names : list[Hashable], optional

Names of categorical feature columns. If not provided, categorical features are inferred using cardinality-based heuristics.

target_name : Hashable, optional

Name of the target column in data. If provided and explicit target names are not specified, its type and cardinality are used to infer whether it should be treated as numerical, categorical, or both.
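The precedence between target_name and the explicit target keywords can be sketched as follows. This is one possible reading, not the library's logic: `resolve_targets` is a hypothetical helper, and skipping inference entirely as soon as either explicit name is given is an assumption.

```python
from numbers import Number


def resolve_targets(columns, target_name=None, target_numerical_name=None,
                    target_categorical_name=None, categorical_threshold=30):
    """Return (numerical_target, categorical_target) column names or None."""
    numerical_target = target_numerical_name
    categorical_target = target_categorical_name
    explicit = numerical_target is not None or categorical_target is not None
    if target_name is not None and not explicit:
        values = columns[target_name]
        # Numeric type makes the column usable as a numerical target.
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in values):
            numerical_target = target_name
        # Low cardinality makes it usable as a categorical target as well.
        if len(set(values)) <= categorical_threshold:
            categorical_target = target_name
    return numerical_target, categorical_target


data = {"x1": [1, 2, 3, 4], "c1": ["a", "b", "a", "b"], "y": [0, 1, 0, 1]}
both = resolve_targets(data, target_name="y")
explicit_only = resolve_targets(data, target_name="y", target_numerical_name="x1")
categorical_only = resolve_targets(data, target_name="c1")
```

A binary integer column such as y ends up in both roles, which matches the "numerical, categorical, or both" wording of the parameter description.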

Returns:
list[Block]

List of generated Explorica Block instances:

  • The linear relations block is always included.

  • The non-linear relations block is included only if it contains metrics, visualizations, or tables (otherwise it is omitted).
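The inclusion rule above amounts to a simple emptiness check before appending the second block. The sketch below uses a hypothetical stand-in class, `SketchBlock`, rather than Explorica's Block, whose attributes are not documented in this reference.

```python
from dataclasses import dataclass, field


@dataclass
class SketchBlock:
    """Hypothetical stand-in for a report block; not Explorica's Block class."""
    title: str
    metrics: list = field(default_factory=list)
    visualizations: list = field(default_factory=list)
    tables: list = field(default_factory=list)

    def is_empty(self):
        # A block is empty when it carries no metrics, plots, or tables.
        return not (self.metrics or self.visualizations or self.tables)


def assemble_interaction_blocks(linear, nonlinear):
    """Keep the linear block unconditionally and the non-linear one if non-empty."""
    blocks = [linear]
    if not nonlinear.is_empty():
        blocks.append(nonlinear)
    return blocks


linear = SketchBlock("Linear relations", tables=["correlation matrix"])
empty_nonlinear = SketchBlock("Non-linear relations")
full_nonlinear = SketchBlock("Non-linear relations", tables=["eta squared"])

titles_without = [b.title for b in assemble_interaction_blocks(linear, empty_nonlinear)]
titles_with = [b.title for b in assemble_interaction_blocks(linear, full_nonlinear)]
```

Callers that index into the returned list by position should therefore check its length first, since the second entry is not guaranteed to exist.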

Other Parameters:
target_numerical_name : Hashable, optional

Explicit name of the numerical target column. Takes precedence over heuristic inference.

target_categorical_name : Hashable, optional

Explicit name of the categorical target column. Takes precedence over heuristic inference.

categorical_threshold : int, default=30

Maximum number of unique values for a column to be considered categorical during heuristic inference.

round_digits : int, default=4

Number of decimal places to round coefficients in tables.

nan_policy : {'drop', 'raise'}, default='drop'

Policy for handling missing values:

  • ‘drop’ : remove rows containing NaNs.

  • ‘raise’: raise an error if missing values are present.

Notes

  • Explicitly provided feature and target names always take precedence over heuristic inference.

  • Features may appear in both numerical and categorical sets if applicable.

  • This function is intended for EDA and interaction analysis purposes.

  • During the construction of EDA or interaction reports, many matplotlib figures may be opened (one per plot or table visualization). This is expected behavior when the dataset contains many features.

  • To prevent runtime warnings about too many open figures, these warnings are ignored internally.

  • To free memory after rendering, it is recommended to explicitly close figures:

    report = get_eda_report(df)
    report.render()
    report.close_figures()
    

    Or for individual blocks:

    block.close_figures()
    

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_interactions_blocks
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1]
... })
>>> blocks = get_interactions_blocks(
...     df,
...     numerical_names=["x1", "x2"],
...     categorical_names=["c1"],
...     target_name="y"
... )
>>> len(blocks)
2
>>> [i.block_config.title for i in blocks]
['Linear relations', 'Non-linear relations']
explorica.reports.presets.interactions.get_interactions_report(data: DataFrame | Mapping[str, Sequence[Any]], numerical_names: list[Hashable] = None, categorical_names: list[Hashable] = None, target_name: Hashable = None, **kwargs) -> Report

Generate an interaction analysis report.

This function is a high-level orchestrator that constructs an Explorica Report focused on feature interactions. It delegates feature selection, target assignment, and block composition to get_interactions_blocks, and wraps the resulting blocks into a single report.

Parameters:
data : pandas.DataFrame or Mapping[str, Sequence[Any]]

Input dataset containing features and optionally target columns.

numerical_names : list[Hashable], optional

Names of numerical feature columns. If not provided, numerical features are inferred from column dtypes.

categorical_names : list[Hashable], optional

Names of categorical feature columns. If not provided, categorical features are inferred using cardinality-based heuristics.

target_name : Hashable, optional

Name of the target column in data. If provided and explicit target names are not specified, its type and cardinality are used to infer whether it should be treated as numerical, categorical, or both.

Returns:
Report

An Explorica Report titled "Interaction analysis" containing zero or more blocks describing linear and non-linear feature interactions.

The report may include:

  • A linear relations block (correlations, multicollinearity diagnostics, and feature–target visualizations).

  • A non-linear relations block (η² and Cramer’s V dependency analysis).
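As background for the categorical-categorical measure named above, Cramer's V rescales the chi-squared statistic of a contingency table to the range 0 to 1. The sketch below is the plain, uncorrected form written with the standard library only; whether Explorica applies a bias correction is not stated here, and the function name is hypothetical.

```python
import math
from collections import Counter


def cramers_v(x, y):
    """Uncorrected Cramer's V between two sequences of category labels."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    joint = Counter(zip(x, y))
    x_counts, y_counts = Counter(x), Counter(y)
    min_dim = min(len(x_counts), len(y_counts))
    if min_dim < 2:
        return 0.0  # assumption: a constant variable shows no association
    chi2 = 0.0
    for x_value, x_count in x_counts.items():
        for y_value, y_count in y_counts.items():
            expected = x_count * y_count / n
            observed = joint.get((x_value, y_value), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min_dim - 1)))


dependent = cramers_v(["a", "b", "a", "b"], [0, 1, 0, 1])
independent = cramers_v(["a", "a", "b", "b"], [0, 1, 0, 1])
```

A value of 1 means each category of one variable maps to exactly one category of the other, and a value of 0 means the observed table equals the table expected under independence.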

Only non-empty blocks are included in the report. If no interaction blocks can be constructed from the provided data and assignments, the report may be empty.

Other Parameters:
target_numerical_name : Hashable, optional

Explicit name of the numerical target column. Takes precedence over heuristic inference.

target_categorical_name : Hashable, optional

Explicit name of the categorical target column. Takes precedence over heuristic inference.

categorical_threshold : int, default=30

Maximum number of unique values for a column to be considered categorical during heuristic inference.

round_digits : int, default=4

Number of decimal places to round coefficients in all included blocks.

nan_policy : {'drop', 'raise'}, default='drop'

Policy for handling missing values across all blocks:

  • ‘drop’ : remove rows containing NaNs.

  • ‘raise’: raise an error if missing values are present.

See also

get_interactions_blocks

Constructs the individual interaction blocks used in the report.

Notes

  • This function does not perform any analysis itself; it only orchestrates block construction and report assembly.

  • Explicitly provided feature and target names always take precedence over heuristic inference.

  • The presence and contents of each block depend on the availability of numerical and categorical features and on whether target variables are provided.

  • An empty report indicates insufficient information to compute interaction metrics, not an execution error.

  • During the construction of EDA or interaction reports, many matplotlib figures may be opened (one per plot or table visualization). This is expected behavior when the dataset contains many features.

  • To prevent runtime warnings about too many open figures, these warnings are ignored internally.

  • To free memory after rendering, it is recommended to explicitly close figures:

    report = get_eda_report(df)
    report.render()
    report.close_figures()
    

    Or for individual blocks:

    block.close_figures()
    

Examples

>>> import pandas as pd
>>> from explorica.reports.presets import get_interactions_report
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1],
... })
>>> # Automatic feature and target inference
>>> report = get_interactions_report(df, target_name="y")
>>> len(report.blocks) > 0
True
>>> # Explicit feature assignment
>>> report = get_interactions_report(
...     df,
...     numerical_names=["x1", "x2"],
...     categorical_names=["c1"],
...     target_name="y",
... )
>>> report.title
'Interaction analysis'
>>> # Explicit target specification via kwargs
>>> report = get_interactions_report(
...     df,
...     numerical_names=["x1", "x2"],
...     categorical_names=["c1"],
...     target_numerical_name="y",
... )
>>> report.blocks
[...]
>>> report.close_figures()