explorica.reports

explorica.reports.utils

Low-level utilities for Explorica’s reports module

Low-level utility functions for standardizing visualization objects, tabular data, and feature assignments into the VisualizationResult, TableResult, and FeatureAssignment formats. The public functions in this module can be used in user code to normalize Matplotlib and Plotly figures, tables, and feature/target name lists for downstream report generation or further processing.

Functions

normalize_visualization(figure): Convert a Matplotlib or Plotly figure into a standardized VisualizationResult dataclass with extracted metadata.
normalize_table(data): Normalize tabular data into a standardized TableResult object.
normalize_assignment(data, numerical_names=None, categorical_names=None, target_name=None, **kwargs): Normalize feature and target assignment into a FeatureAssignment object.

Examples

>>> from explorica.reports.utils import normalize_visualization
>>> # Usage with matplotlib figure
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> result = normalize_visualization(fig)
>>> result.engine
'matplotlib'
>>> result.width, result.height
(np.float64(6.4), np.float64(4.8))

>>> # Usage with plotly figure
>>> import plotly.graph_objects as go
>>> fig = go.Figure(data=go.Bar(y=[2, 3, 1]))
>>> result = normalize_visualization(fig)
>>> result.engine
'plotly'
>>> result.width, result.height
(None, None)

explorica.reports.utils.normalize_assignment(data: DataFrame, numerical_names: list[Hashable] = None, categorical_names: list[Hashable] = None, target_name: str = None, **kwargs) → FeatureAssignment

Normalize feature and target assignment into a FeatureAssignment object.

This function converts casual, user-friendly input (feature name lists and an optional target column name) into a fully-populated FeatureAssignment instance suitable for Explorica report and block presets.

If feature names are not explicitly provided, they are inferred from the input DataFrame using heuristic rules.

Parameters:

data (pandas.DataFrame): Input dataset containing feature and optional target columns.
numerical_names (list[Hashable], optional): Names of numerical feature columns. If not provided, numerical features are inferred from column dtypes.
categorical_names (list[Hashable], optional): Names of categorical feature columns. If not provided, categorical features are inferred using cardinality-based heuristics.
target_name (Hashable, optional): Name of the target column in data. If provided and neither target_numerical_name nor target_categorical_name is specified, the column's type and cardinality are used to infer whether it should be treated as numerical, categorical, or both.

Returns:

FeatureAssignment

A populated FeatureAssignment instance containing:

numerical_features
categorical_features
optional numerical_target
optional categorical_target

Other Parameters:

target_numerical_name (Hashable, optional): Explicit name of the numerical target column. Takes priority over heuristic inference.
target_categorical_name (Hashable, optional): Explicit name of the categorical target column. Takes priority over heuristic inference.
categorical_threshold (int, default=30): Maximum number of unique values for a column to be considered categorical during heuristic inference.

Notes

Explicitly provided feature and target names always take precedence over heuristic inference.
If feature names are not provided, numerical features are inferred from numeric dtypes, and categorical features are inferred with the cardinality-based heuristic controlled by categorical_threshold.
A single target column may be assigned as both numerical and categorical if it satisfies the criteria for both types.
Absence of a target assignment indicates insufficient information, not an error.
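
The precedence and inference rules in these notes can be illustrated with a minimal, standard-library sketch. It is not Explorica's implementation: AssignmentSketch, is_numeric, is_categorical, and infer_assignment are hypothetical names, a plain dict of lists stands in for the DataFrame, the threshold is assumed to be an inclusive bound, and the target column is assumed to be excluded from the inferred features.

```python
from dataclasses import dataclass, field
from numbers import Number
from typing import Any, Hashable, List, Mapping, Optional, Sequence


@dataclass
class AssignmentSketch:
    """Hypothetical stand-in with the four documented FeatureAssignment fields."""

    numerical_features: List[Hashable] = field(default_factory=list)
    categorical_features: List[Hashable] = field(default_factory=list)
    numerical_target: Optional[Hashable] = None
    categorical_target: Optional[Hashable] = None


def is_numeric(values: Sequence[Any]) -> bool:
    # Stand-in for a numeric-dtype check on a DataFrame column.
    return all(isinstance(v, Number) and not isinstance(v, bool) for v in values)


def is_categorical(values: Sequence[Any], threshold: int) -> bool:
    # Assumption: the threshold is an inclusive upper bound on unique values.
    return len(set(values)) <= threshold


def infer_assignment(
    columns: Mapping[Hashable, Sequence[Any]],
    numerical_names: Optional[List[Hashable]] = None,
    categorical_names: Optional[List[Hashable]] = None,
    target_name: Optional[Hashable] = None,
    categorical_threshold: int = 30,
) -> AssignmentSketch:
    # Assumption: the target column is not counted as a feature.
    candidates = [name for name in columns if name != target_name]
    result = AssignmentSketch()

    # Explicit names take precedence; heuristics only fill the gaps.
    if numerical_names is not None:
        result.numerical_features = list(numerical_names)
    else:
        result.numerical_features = [
            n for n in candidates if is_numeric(columns[n])
        ]
    if categorical_names is not None:
        result.categorical_features = list(categorical_names)
    else:
        result.categorical_features = [
            n for n in candidates
            if is_categorical(columns[n], categorical_threshold)
        ]

    # A single target may satisfy both criteria and be assigned twice.
    if target_name is not None:
        target = columns[target_name]
        if is_numeric(target):
            result.numerical_target = target_name
        if is_categorical(target, categorical_threshold):
            result.categorical_target = target_name
    return result


columns = {
    "x1": [1, 2, 3, 4],
    "x2": [10, 20, 30, 40],
    "c1": ["a", "b", "a", "b"],
    "y": [0, 1, 0, 1],
}
fa = infer_assignment(columns, target_name="y", categorical_threshold=3)
print(fa.numerical_features)    # ['x1', 'x2']
print(fa.categorical_features)  # ['c1']
print(fa.numerical_target, fa.categorical_target)  # y y
```

In this sketch a low-cardinality numeric column can appear in both feature lists; whether the library resolves such overlap differently is not stated here.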

Examples

>>> import pandas as pd
>>> from explorica.reports.utils import normalize_assignment
>>> df = pd.DataFrame({
...     "x1": [1, 2, 3, 4],
...     "x2": [10, 20, 30, 40],
...     "c1": ["a", "b", "a", "b"],
...     "y": [0, 1, 0, 1]
... })

>>> # Feature-only assignment
>>> fa = normalize_assignment(
...     data=df,
...     numerical_names=["x1", "x2"],
...     categorical_names=["c1"]
... )
>>> fa.numerical_features
['x1', 'x2']
>>> fa.categorical_features
['c1']
>>> fa.numerical_target is None
True
>>> fa.categorical_target is None
True

>>> # Target inference using heuristics
>>> fa = normalize_assignment(
...     data=df,
...     numerical_names=["x1", "x2"],
...     categorical_names=["c1"],
...     target_name="y",
...     categorical_threshold=3
... )
>>> fa.numerical_target
'y'
>>> fa.categorical_target
'y'

explorica.reports.utils.normalize_table(data: Sequence[float] | Sequence[Sequence[float]] | Mapping[str, Sequence[Any]] | TableResult) → TableResult

Normalize tabular data into a standardized TableResult object.

This function converts input data of various formats (1D/2D sequences, mappings or TableResult) into a TableResult instance containing a Pandas DataFrame. This ensures consistent handling of tabular results across Explorica reports.

Parameters:

data (Sequence, Mapping or TableResult)

Tabular data to normalize. Supported types include:

1D or 2D sequences (e.g., list, tuple of lists)
Mapping[str, Sequence] (e.g., dict of column_name -> values)
An existing TableResult

MultiIndex rows or columns are not supported.

Returns:

TableResult: A standardized container wrapping a Pandas DataFrame.

Raises:

ValueError: If the data has a MultiIndex in rows or columns.
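
How the supported input shapes could collapse into one columnar layout can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: to_columns is a hypothetical helper, a flat sequence is assumed to become a single column and a nested sequence a list of rows (mirroring how the pandas.DataFrame constructor treats such inputs), and ragged rows are rejected here for simplicity.

```python
from collections.abc import Mapping, Sequence
from typing import Any, Dict, Hashable, List


def _is_sequence(obj: Any) -> bool:
    # Strings are sequences too, but they are values here, not containers.
    return isinstance(obj, Sequence) and not isinstance(obj, (str, bytes))


def to_columns(data: Any) -> Dict[Hashable, List[Any]]:
    """Hypothetical helper: return a column-name -> values mapping."""
    if isinstance(data, Mapping):
        columns = {name: list(values) for name, values in data.items()}
    elif _is_sequence(data):
        rows = list(data)
        if rows and all(_is_sequence(row) for row in rows):
            # 2D input: each inner sequence is treated as one row.
            widths = {len(row) for row in rows}
            if len(widths) != 1:
                # Simplification of this sketch: ragged rows are rejected.
                raise ValueError("all rows must have the same length")
            columns = {i: [row[i] for row in rows] for i in range(widths.pop())}
        else:
            # 1D input: assumed to become a single column labeled 0.
            columns = {0: rows}
    else:
        raise TypeError(f"unsupported table input: {type(data).__name__}")

    if len({len(values) for values in columns.values()}) > 1:
        raise ValueError("all columns must have the same length")
    return columns


print(to_columns({"col1": [1, 2, 3], "col2": [4, 5, 6]}))
print(to_columns([1, 2, 3]))
print(to_columns(([1, 2], [3, 4])))
```

The resulting mapping is the kind of input a DataFrame constructor accepts directly, which is one plausible reason to normalize to columns first.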

Examples

>>> from explorica.types import TableResult
>>> from explorica.reports.utils import normalize_table
>>> # Simple usage
>>> data = {"col1": [1, 2, 3], "col2": [4, 5, 6]}
>>> table_result = normalize_table(data)
>>> isinstance(table_result, TableResult)
True
>>> table_result.table.shape
(3, 2)

explorica.reports.utils.normalize_visualization(figure: matplotlib.figure.Figure | plotly.graph_objects.Figure | VisualizationResult) → VisualizationResult

Normalize a visualization object into a standardized VisualizationResult.

This function converts a Matplotlib or Plotly figure into a VisualizationResult dataclass, extracting common metadata such as engine, axes, width, height, and title. This allows downstream rendering or report composition functions to work with a uniform interface.

Parameters:

figure (matplotlib.figure.Figure, plotly.graph_objects.Figure, or VisualizationResult)

The input figure to normalize. Can be:

A Matplotlib figure
A Plotly figure
A pre-normalized VisualizationResult (in which case it is returned as-is)

Returns:

VisualizationResult

A dataclass containing:

figure : The original figure object.
engine : str, either 'matplotlib' or 'plotly'.
axes : List of Matplotlib axes (for Matplotlib) or None (for Plotly).
width : Figure width in inches (Matplotlib) or pixels (Plotly).
height : Figure height in inches (Matplotlib) or pixels (Plotly).
title : Optional figure title.

Raises:

TypeError: If figure is not a Matplotlib Figure, a Plotly Figure, or a VisualizationResult.

Notes

For Matplotlib figures, width and height are measured in inches.
For Plotly figures, width and height are measured in pixels; they are None when the figure does not set an explicit size, as in the Plotly example below.
The original figure is preserved in the figure attribute of the returned VisualizationResult.
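
The dispatch described above (pass-through for pre-normalized input, engine-specific metadata extraction, TypeError otherwise) might look roughly like the following sketch. VisualizationSketch and normalize_sketch are hypothetical names, and detecting the engine from the figure's module name is an assumption made here so the sketch does not import either plotting library; the library itself may well use isinstance checks. Title extraction is omitted.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class VisualizationSketch:
    """Hypothetical stand-in mirroring the documented VisualizationResult fields."""

    figure: Any
    engine: str
    axes: Optional[list] = None
    width: Optional[float] = None
    height: Optional[float] = None
    title: Optional[str] = None


def normalize_sketch(figure: Any) -> VisualizationSketch:
    # Pre-normalized input is returned as-is.
    if isinstance(figure, VisualizationSketch):
        return figure

    module = type(figure).__module__
    if module.startswith("matplotlib."):
        # Matplotlib reports the figure size in inches.
        width, height = figure.get_size_inches()
        return VisualizationSketch(
            figure=figure,
            engine="matplotlib",
            axes=list(figure.axes),
            width=width,
            height=height,
        )
    if module.startswith("plotly."):
        # Plotly layout sizes are in pixels and are None unless set explicitly.
        layout = figure.layout
        return VisualizationSketch(
            figure=figure,
            engine="plotly",
            axes=None,
            width=layout.width,
            height=layout.height,
        )
    raise TypeError(
        f"expected a Matplotlib or Plotly figure, got {type(figure).__name__}"
    )
```

Keeping the original object in the figure field means downstream renderers can still call engine-specific methods after normalization.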

Examples

>>> from explorica.reports.utils import normalize_visualization

>>> # Usage with matplotlib figure
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> result = normalize_visualization(fig)
>>> result.engine
'matplotlib'
>>> result.width, result.height
(np.float64(6.4), np.float64(4.8))

>>> # Usage with plotly figure
>>> import plotly.graph_objects as go
>>> fig = go.Figure(data=go.Bar(y=[2, 3, 1]))
>>> result = normalize_visualization(fig)
>>> result.engine
'plotly'
>>> result.width, result.height
(None, None)