explorica.visualizations

explorica.visualizations.plots

High-level plotting utilities for Explorica visualizations.

This module provides a set of functions to generate common plots using Matplotlib, Seaborn, and Plotly. It standardizes plot outputs through the VisualizationResult dataclass and supports flexible styling, color palettes, and interactivity.

Functions

barchart(data, category, ascending=None, horizontal=False, **kwargs): Plots a bar chart from categorical and numerical data. Supports vertical or horizontal orientation, automatic sorting, and styling through Seaborn.
piechart(data, category, autopct_method='value', **kwargs): Draws a pie chart based on categorical and numerical data. Supports value, percent, or combined display on each segment.
mapbox(lat, lon, category=None, **kwargs): Generates an interactive geographic scatter plot using Plotly Mapbox. Supports categorical coloring, point scaling, hover labels, and Mapbox styling.

Notes

All plotting functions return an explorica.types.VisualizationResult, which provides a consistent interface for accessing the figure, axes (if applicable), plotting engine, and additional metadata.
plot_kws allows passing keyword arguments directly to the underlying plotting function used by the engine (Matplotlib, Seaborn, or Plotly). This provides fine-grained control over styling and behavior specific to that function.
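As a rough illustration of the result container described above, the sketch below models a dataclass with the attributes this page refers to (figure, axes, engine, title, extra_info). The class name ResultSketch, the field types, and the defaults are assumptions made for illustration only; see explorica.types.VisualizationResult for the actual definition.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ResultSketch:
    """Hypothetical stand-in for explorica.types.VisualizationResult."""

    figure: Any                     # e.g. a Matplotlib or Plotly figure object
    axes: Optional[Any] = None      # None for engines without axes (e.g. Plotly)
    engine: str = "matplotlib"      # name of the plotting engine (assumed field)
    title: Optional[str] = None
    extra_info: dict = field(default_factory=dict)


# A placeholder object stands in for a real figure here.
result = ResultSketch(
    figure=object(),
    engine="plotly",
    extra_info={"autopct_method": "percent"},
)
print(result.engine, result.axes, result.extra_info)
```

Returning one container type from every plotting function lets calling code access the figure and metadata the same way regardless of the engine.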

Examples

>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.plots import barchart
>>> # Basic vertical bar chart (Matplotlib)
>>> data = [3, 7, 5]
>>> categories = ['A', 'B', 'C']
>>> result = barchart(data, categories,
...                       plot_kws={'color':'skyblue', 'edgecolor':'black'})
>>> result.figure.show()

>>> # Pie chart with percentages displayed
>>> from explorica.visualizations.plots import piechart
>>> result = piechart(data, categories, autopct_method='percent')
>>> result.figure.show()
>>> result.extra_info
{'autopct_method': 'percent'}

>>> # Mapbox scatter plot with categorical coloring
>>> from explorica.visualizations.plots import mapbox
>>> lat = [34.05, 40.71, 37.77]
>>> lon = [-118.24, -74.00, -122.42]
>>> categories = ['City1', 'City2', 'City3']
>>> result = mapbox(lat, lon, category=categories)
>>> # Show interactive map with hover labels
>>> result.figure.show()

>>> # Close all mpl figures after usage
>>> plt.close('all')

explorica.visualizations.plots.barchart(data: Sequence[float] | Mapping[Any, Sequence[float]], category: Sequence[Any] | Mapping[Any, Sequence[Any]], ascending: bool = None, horizontal: bool = False, **kwargs) → VisualizationResult

Plot a Bar Chart using categorical and numerical data series.

This function creates a bar chart to visualize the relationship between categorical labels and numerical values. It supports both vertical and horizontal orientations, automatic sorting, and comprehensive styling options through integration with Seaborn’s visualization system.

Under the hood, the function uses Matplotlib’s matplotlib.axes.Axes.bar and matplotlib.axes.Axes.barh methods and applies Seaborn styles for aesthetic defaults. Additional keyword arguments can be passed directly to the underlying Matplotlib calls via plot_kws. For complete parameter documentation and advanced customization options, see the references in the Notes below.

Parameters:

data : Sequence[float] | Mapping[Any, Sequence[float]]

A sequence containing numerical values (bar heights).

category : Sequence[Any] | Mapping[Any, Sequence[Any]]

A sequence containing categorical labels (bar names).

ascending : bool, optional

If True or False, sorts the bars by value in ascending or descending order, respectively. If None (default), the original order is preserved.

horizontal : bool, optional

If True, plots a horizontal bar chart (barh) instead of a vertical one. Defaults to False.

opacity : float, default=0.5

Transparency of the bars (alpha value). Must be between 0 and 1.

title : str, optional

The title of the chart. Defaults to an empty string.

xlabel : str, optional

The label for the X-axis. Overrides the automatic label.

ylabel : str, optional

The label for the Y-axis. Overrides the automatic label.

figsize : tuple[float, float], optional

The Matplotlib figure size (width, height) in inches. Defaults to (10, 6).

palette : str or dict, optional

The Seaborn/Matplotlib color palette to use for the plot.

style : str, optional

The Matplotlib/Seaborn style to apply to the figure (e.g., 'whitegrid', 'darkgrid').

plot_kws : dict, optional

Dictionary of keyword arguments passed directly to the underlying Matplotlib methods (matplotlib.axes.Axes.bar and matplotlib.axes.Axes.barh). This allows overriding any default plotting behavior. If not provided, the function internally constructs a dictionary from its own relevant parameters. Keys provided in plot_kws take precedence over internally generated defaults. For complete parameter documentation and advanced customization options, see the references in the Notes below.

nan_policy : {'drop', 'raise'}, default='drop'

Policy for handling NaN values in input data:

'raise' : raise ValueError if any NaNs are present in data.
'drop' : drop rows (axis=0) containing NaNs before computation. This does not drop entire columns.

directory : str, optional

The path to the directory for saving the plot. If None (default), the plot is not saved.

verbose : bool, optional

If True, prints messages about the saving process. Defaults to False.
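The three-state behavior of ascending (True, False, or None) can be pictured with a small sketch. The helper name sort_bars is hypothetical and the code only illustrates the documented semantics; it is not the library's implementation.

```python
def sort_bars(values, labels, ascending=None):
    """Reorder paired values and labels; None keeps the original order."""
    pairs = list(zip(values, labels))
    if ascending is not None:
        # sorted() is stable, so bars with equal values keep their input order.
        pairs = sorted(pairs, key=lambda pair: pair[0], reverse=not ascending)
    return [value for value, _ in pairs], [label for _, label in pairs]


vals, labs = sort_bars(
    [150, 80, 220], ["Group A", "Group B", "Group C"], ascending=False
)
print(vals, labs)  # [220, 150, 80] ['Group C', 'Group A', 'Group B']
```

Sorting the (value, label) pairs together keeps each bar attached to its label.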

Returns:

VisualizationResult: A dataclass encapsulating the result of a visualization. See also explorica.types.VisualizationResult for full attribute details.

Raises:

ValueError: If the lengths of the 'data' and 'category' input series do not match; if the 'data' or 'category' input contains more than one column/dimension; or if nan_policy='raise' and missing values (NaN/null) are found in the data.
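The length check and the two nan_policy modes can be sketched in plain Python as follows. The helper name clean_pairs and its error messages are assumptions for illustration; the library's own validation may differ in detail.

```python
import math


def clean_pairs(data, category, nan_policy="drop"):
    """Validate two parallel sequences and apply a NaN policy (illustrative only)."""
    if len(data) != len(category):
        raise ValueError("'data' and 'category' must have the same length")
    if nan_policy not in ("drop", "raise"):
        raise ValueError(f"unsupported nan_policy: {nan_policy!r}")

    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    rows = list(zip(data, category))
    has_missing = any(is_missing(v) or is_missing(c) for v, c in rows)
    if has_missing and nan_policy == "raise":
        raise ValueError("input contains missing values")
    # 'drop' removes whole rows, so values and labels stay aligned.
    kept = [(v, c) for v, c in rows if not (is_missing(v) or is_missing(c))]
    return [v for v, _ in kept], [c for _, c in kept]


values, labels = clean_pairs([3.0, float("nan"), 5.0], ["A", "B", "C"])
print(values, labels)  # [3.0, 5.0] ['A', 'C']
```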

Warns:

UserWarning: Raised if the input data is empty. An empty plot with a warning message will be returned in this case.

Notes

This function uses matplotlib.axes.Axes.bar and matplotlib.axes.Axes.barh under the hood. For complete parameter documentation and advanced customization options, see: matplotlib bar, matplotlib barh.
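The precedence rule for plot_kws (user-supplied keys override internally generated defaults) amounts to a dictionary merge. The sketch below assumes a hypothetical helper build_plot_kwargs and assumes that opacity maps to Matplotlib's alpha keyword; it is not taken from the library source.

```python
def build_plot_kwargs(opacity=0.5, color=None, plot_kws=None):
    """Merge function-level defaults with user-supplied plot_kws (illustrative)."""
    defaults = {"alpha": opacity}
    if color is not None:
        defaults["color"] = color
    # Later keys win, so entries in plot_kws override the generated defaults.
    return {**defaults, **(plot_kws or {})}


merged = build_plot_kwargs(
    opacity=0.5, plot_kws={"alpha": 0.9, "edgecolor": "black"}
)
print(merged)  # {'alpha': 0.9, 'edgecolor': 'black'}
```

The merged dictionary would then be unpacked into the underlying call, for example ax.bar(labels, values, **merged).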

Examples

>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.plots import barchart
>>>
>>>
>>> # Simple vertical Bar Chart
>>> values = [25, 40, 15, 60, 35]
>>> labels = ['Apple', 'Banana', 'Cherry', 'Date', 'Elderberry']
>>> plot = barchart(values, labels, title="Fruit sales")
>>> plot.figure.show()

>>> # Horizontal Bar Chart with descending sort:
>>> values_h = [150, 80, 220]
>>> labels_h = ['Group A', 'Group B', 'Group C']
>>> plot = barchart(values_h, labels_h,
...                 horizontal=True,
...                 ascending=False,
...                 palette='viridis')
>>> plot.figure.show()
>>> # Close all mpl figures after usage
>>> plt.close('all')

explorica.visualizations.plots.mapbox(lat: Sequence[float], lon: Sequence[float], category: Sequence | None = None, **kwargs) → VisualizationResult

Display an interactive geographic scatter plot (Mapbox).

This method provides a high-level interface for visualizing spatial data using latitude and longitude coordinates. It supports categorical coloring, dynamic point sizing, custom hover labels, and Plotly Mapbox styling.

Under the hood, the function uses Plotly’s plotly.express.scatter_map function and applies Plotly styles for aesthetic defaults. Additional keyword arguments can be passed directly to the underlying Plotly call via plot_kws. For complete parameter documentation and advanced customization options, see the references in the Notes below.

Parameters:

lat (Sequence[float]): Latitude values for each point. Cannot contain null values.
lon (Sequence[float]): Longitude values for each point. Must match the length of lat and cannot contain null values.
category (Sequence[Any], optional): Categorical labels used to color the points. Must match the length of lat and lon, cannot contain nulls, and determines the number of discrete colors in the plot.

Returns:

VisualizationResult: A dataclass encapsulating the result of a visualization. See also explorica.types.VisualizationResult for full attribute details.

Other Parameters:

hover_name (Sequence[Any], optional): Labels to show on hover. Must match the length of lat and lon.
size (Sequence[float], optional): Numerical values used to scale point sizes. Must match the length of lat and lon.
title (str, optional): Plot title.
show_legend (bool, default=True): Whether to display the legend. Relevant only if category is provided.
palette (Sequence[str], optional): List of colors (hex or named) for categories. If not provided, the default Plotly color sequence (px.colors.qualitative.Plotly) is used.
opacity (float, default=0.7): Marker opacity.
height (int, default=600): Figure height in pixels.
width (int, default=800): Figure width in pixels.
template (str, default="plotly_white"): Plotly template used for styling. E.g., "plotly_dark", "ggplot2", "seaborn".
map_style (str, default="open-street-map"): Mapbox style used for rendering the map. E.g., "carto-positron", "carto-darkmatter", "stamen-terrain", "open-street-map".
plot_kws (dict, optional): Dictionary of keyword arguments passed directly to the underlying Plotly function (plotly.express.scatter_map). This allows overriding any default plotting behavior. If not provided, the function internally constructs a dictionary from its own relevant parameters. Keys provided in plot_kws take precedence over internally generated defaults. For complete parameter documentation and advanced customization options, see the references in the Notes below.
nan_policy (str, default="drop"): Policy for handling NaN values in input data. Supports 'drop' (removes rows with NaNs) or 'raise' (raises an error).
directory (str or Path, optional): Path to save the figure as HTML.
verbose (bool, default=False): Enable logging.

Raises:

ValueError: If lat, lon, or any optional input contain nulls or mismatched lengths.

Warns:

UserWarning: Raised if the input data is empty. An empty plot with a warning message will be returned in this case.

Notes

This function uses plotly.express.scatter_map under the hood. For complete parameter documentation and advanced customization options, see: plotly scatter_map.
The plot is saved as an interactive HTML file when directory is set.
Color resolution is handled internally using resolve_plotly_palette.
This function is intended for rapid map-based EDA rather than full cartographic customization.
Supported Plotly templates.
Supported Mapbox styles.
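How resolve_plotly_palette works is not documented on this page. As a loose illustration of what assigning discrete colors to categories can look like, the sketch below uses a hypothetical helper assign_colors that gives each distinct category one color in first-seen order and cycles when the palette is shorter than the number of categories. The hex codes are the first five entries of Plotly's default qualitative sequence.

```python
from itertools import cycle

# First five colors of px.colors.qualitative.Plotly.
DEFAULT_PALETTE = ["#636EFA", "#EF553B", "#00CC96", "#AB63FA", "#FFA15A"]


def assign_colors(category, palette=None):
    """Map each distinct category to one color, reusing colors if needed."""
    colors = cycle(palette or DEFAULT_PALETTE)
    mapping = {}
    for label in category:
        if label not in mapping:  # first-seen order decides the color
            mapping[label] = next(colors)
    return mapping


color_map = assign_colors(["US", "US", "US", "UK"])
print(color_map)  # {'US': '#636EFA', 'UK': '#EF553B'}
```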

Examples

>>> from pathlib import Path
>>> from explorica.visualizations.plots import mapbox
>>>
>>>
>>> # Basic Mapbox scatter plot usage
>>> lat = [34.05, 40.71, 37.77]
>>> lon = [-118.24, -74.00, -122.42]
>>> result = mapbox(lat, lon, title="Major US Cities")
>>> result.figure.show()

>>> # Mapbox scatter plot with categorical coloring usage
>>> lat = [34.05, 40.71, 37.77, 51.50]
>>> lon = [-118.24, -74.00, -122.42, -0.12]
>>> category = ["US", "US", "US", "UK"]
>>> result = mapbox(lat, lon, category=category, title="USA vs UK Cities")
>>> result.figure.show()

>>> # HTML saving example
>>> lat = [34.05, 40.71]
>>> lon = [-118.24, -74.00]
>>> result = mapbox(
...     lat, lon,
...     plot_kws={"zoom": 4},
...     directory="./plots",
...     title="Saved Map"
... )
>>> result.figure.show()
>>> # Check that the file was actually created
>>> Path("./plots/Saved Map.html").exists()
True

explorica.visualizations.plots.piechart(data: Sequence[float], category: Sequence[Any], autopct_method: str = 'value', **kwargs) → VisualizationResult

Draw a pie chart based on categorical and corresponding numerical data.

This function generates a pie chart where each segment represents a category from the input data. The size of each segment is proportional to the corresponding numerical value in data. The chart supports automatic display of percentages, raw values, or both on each segment.

Under the hood, the function uses Matplotlib’s matplotlib.axes.Axes.pie method and applies Seaborn styles for aesthetic defaults. Additional keyword arguments can be passed directly to the underlying Matplotlib call via plot_kws. For complete parameter documentation and advanced customization options, see the references in the Notes below.

Parameters:

data (Sequence[float]): A numerical sequence representing the sizes of the segments.
category (Sequence[Any]): A categorical sequence representing the pie chart segments.
autopct_method (str, default="value"): Determines how the values are displayed on the pie chart. Supported options: "percent", "value", "both".

Returns:

VisualizationResult: A dataclass encapsulating the result of a visualization. See also explorica.types.VisualizationResult for full attribute details.

Other Parameters:

title (str, optional): Title of the pie chart.
xlabel (str, optional): The label for the X-axis. Overrides the automatic label.
ylabel (str, optional): The label for the Y-axis. Overrides the automatic label.
show_legend (bool, default=True): Whether to display a legend.
show_labels (bool, default=True): Whether to display category labels directly on the chart.
palette (str or list, optional): Color palette to use for the plot.
figsize (tuple, default=(10, 6)): Figure size (width, height) in inches.
plot_kws (dict, optional): Dictionary of keyword arguments passed directly to the underlying Matplotlib method (matplotlib.axes.Axes.pie). This allows overriding any default plotting behavior. If not provided, the function internally constructs a dictionary from its own relevant parameters. Keys provided in plot_kws take precedence over internally generated defaults. For complete parameter documentation and advanced customization options, see the references in the Notes below.
directory (str, optional): If provided, the plot will be saved to this directory.
nan_policy ({"drop", "raise"}, default="drop"): How to handle NaN values.
verbose (bool, default=False): If True, print additional information.

Raises:

ValueError: If the input sizes do not match, or if an invalid autopct_method is provided.

Warns:

UserWarning: Raised if the input data is empty. An empty plot with a warning message will be returned in this case.

Notes

This function uses matplotlib.axes.Axes.pie under the hood. For complete parameter documentation and advanced customization options, see: matplotlib pie.
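Matplotlib's pie accepts a callable for autopct that receives each wedge's share as a percentage. One plausible way to support the three autopct_method options is a small formatter factory, sketched below. The name make_autopct and the exact label formats are assumptions, not the library's actual output.

```python
def make_autopct(values, method="value"):
    """Return a callable suitable for Matplotlib's ``autopct`` argument."""
    if method not in ("percent", "value", "both"):
        raise ValueError(f"unsupported autopct method: {method!r}")
    total = sum(values)

    def formatter(pct):
        # Matplotlib passes the wedge's share as a percentage (0-100).
        value = pct * total / 100.0
        if method == "percent":
            return f"{pct:.1f}%"
        if method == "value":
            return f"{value:g}"
        return f"{value:g} ({pct:.1f}%)"

    return formatter


fmt = make_autopct([50, 25, 25], method="both")
print(fmt(50.0))  # 50 (50.0%)
```

With Matplotlib installed, the formatter could be passed as ax.pie(values, labels=labels, autopct=fmt).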

Examples

>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.plots import piechart
>>>
>>>
>>> # Simple pie chart displaying raw values
>>> data = [15, 30, 45, 10]
>>> categories = ["A", "B", "C", "D"]
>>> result = piechart(data, categories, autopct_method="value",
...                       title="Simple Pie")
>>> # Display the chart
>>> result.figure.show()
>>> result.title
'Simple Pie'

>>> # Pie chart showing percentages on each segment
>>> data = [50, 25, 25]
>>> categories = ["Apples", "Bananas", "Cherries"]
>>> result = piechart(data, categories,
...     autopct_method="percent", show_legend=True)
>>> result.figure.show()
>>> result.extra_info["autopct_method"]
'percent'
>>> # Close all mpl figures after usage
>>> plt.close('all')

explorica.visualizations.scatterplot

High-level plotting utilities for Explorica visualizations.

This module provides a high-level interface for generating scatter plots with optional categorization and trendline fitting. It is designed to offer a consistent and expressive API built on top of Matplotlib, with seamless support for themes, palettes, NaN handling, and plot saving.

Functions

scatterplot(data, target, category=None, **kwargs): Generates a scatter plot with optional categorization and a trendline.

Notes

All plotting functions return an explorica.types.VisualizationResult, which provides a consistent interface for accessing the figure, axes (if applicable), plotting engine, and additional metadata.
plot_kws allows passing keyword arguments directly to the underlying plotting function used by the engine (Matplotlib, Seaborn, or Plotly). This provides fine-grained control over styling and behavior specific to that function.

Examples

>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.scatterplot import scatterplot
>>>
>>>
>>> data = [1, 2, 3, 4, 5, 6, 7]
>>> target = [2.5, 3.2, 4.8, 5.1, 7.5, 9.0, 10.5]
>>> category = ['A', 'A', 'B', 'B', 'A', 'B', 'A']
>>> # Basic scatterplot with linear trendline
>>> plot = scatterplot(data, target, trendline='linear', show_legend=False,
...                       title='Basic Scatterplot')
>>> plot.figure.show()

>>> # Scatterplot with categories and saving
>>> plot = scatterplot(
...     data, target, category=category, show_legend=True,
...     title='Scatterplot with Categories', directory='plots',
...     figsize=(8, 5))
>>> # The figure is saved to the 'plots' directory with filename 'scatterplot.png'

>>> # Scatterplot with polynomial trendline and custom plot_kws
>>> plot = scatterplot(data, target, trendline='polynomial',
...                       plot_kws={'s': 100, 'marker': 'o', 'c': 'orange',
...                                 'edgecolor': 'black'},
...                       title='Scatterplot with Polynomial Trendline')
>>> plot.figure.show()
>>> plt.close(plot.figure)

explorica.visualizations.scatterplot.scatterplot(data: Sequence[Number], target: Sequence[Number], category: Sequence[Any] = None, **kwargs) → VisualizationResult

Generate a scatter plot with optional categorization and a trendline.

The function supports coloring points by category and displaying a fitted trendline. The trendline can be automatically calculated using Ordinary Least Squares (OLS) for linear or polynomial regression, or a custom pre-calculated user function can be provided.

Under the hood, the function uses Matplotlib’s matplotlib.axes.Axes.scatter method and applies Seaborn styles for aesthetic defaults. Additional keyword arguments can be passed directly to the underlying Matplotlib call via plot_kws. For complete parameter documentation and advanced customization options, see the references in the Notes below.

Parameters:

data (Sequence[Number]): Data for the X-axis.
target (Sequence[Number]): Data for the Y-axis.
category (Sequence[Any], optional): Categorical data used for coloring points and generating a legend. Defaults to None (no categorization).

Returns:

VisualizationResult: A dataclass encapsulating the result of a visualization. See also explorica.types.VisualizationResult for full attribute details.

Other Parameters:

title : str, optional

Plot title.

xlabel : str, optional

X-axis label.

ylabel : str, optional

Y-axis label.

title_legend : str, default="Category"

Title for the category legend.

show_legend : bool, default=False

If True, displays the category legend (if categories exist).

opacity : float, optional

Transparency level for scatter points (0.0 to 1.0).

palette : str or list or None, optional

Color palette name (e.g., 'viridis') or a list of specific colors.

cmap : str or None, optional

Colormap name for continuous data (rarely used in scatter plots but included for theme compatibility).

style : str or None, optional

Matplotlib style context (e.g., 'seaborn-v0_8').

figsize : tuple[int, int], default=(10, 6)

Figure size (width, height) in inches.

trendline : str or Callable, optional

Method to draw a trendline. Supports 'linear', 'polynomial', or a custom callable. If a string ('linear' or 'polynomial') is provided, the trendline is fitted automatically using Ordinary Least Squares (OLS). If a callable is provided, it must map a single numeric input to a single numeric output (y = f(x)); the function itself is not modified and is automatically vectorized over the X-domain for plotting.

trendline_kws : dict, optional

Additional arguments for the trendline function. Keys include:

'color' : str, optional
Color of the trendline.
'linestyle' : str, default='--'
Linestyle of the trendline (e.g., '-', '--', ':', '-.').
'linewidth' : int, default=2
Thickness of the trendline.
'x_range' : tuple[float, float], optional
The domain (min, max) for which the trendline should be calculated and plotted. If None, it uses the min and max of the input data (x).
'degree' : int, default=2
The degree of the polynomial to fit. Only used when trendline='polynomial'.
'dots' : int, default=1000
The number of data points used to draw the smooth trendline curve.

plot_kws : dict, optional

Dictionary of keyword arguments passed directly to the underlying Matplotlib method (matplotlib.axes.Axes.scatter). This allows overriding any default plotting behavior. If not provided, the function internally constructs a dictionary from its own relevant parameters. Keys provided in plot_kws take precedence over internally generated defaults. For complete parameter documentation and advanced customization options, see the references in the Notes below.

nan_policy : str, default="drop"

Policy for handling NaN values in input data. Supports 'drop' (removes rows with NaNs) or 'raise' (raises an error).

directory : str or None, optional

Directory path to save the plot. If None, the plot is not saved.

verbose : bool, default=False

If True, prints save messages.

Raises:

ValueError

If trendline is a string and is not one of the supported methods (‘linear’, ‘polynomial’).
If trendline is a callable and its output is not a 1D sequence of numbers with the same length as x_domain.
If data, target, or category (if provided) are not 1D sequences or their lengths do not match.
If any of data, target, or category contains NaNs and nan_policy=’raise’.
If trendline_kws[‘degree’] or trendline_kws[‘dots’] are not natural numbers.
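For a callable trendline, the documented behavior is that a scalar function y = f(x) is evaluated across the X-domain using 'dots' sample points. The sketch below shows one way to do that with a natural-number check on dots. The helper name sample_trendline is hypothetical and the code is only an approximation of the described behavior.

```python
def sample_trendline(func, x_range, dots=1000):
    """Evaluate a scalar function y = f(x) on an evenly spaced grid (illustrative)."""
    if not isinstance(dots, int) or isinstance(dots, bool) or dots < 1:
        raise ValueError("'dots' must be a natural number")
    lo, hi = x_range
    if dots == 1:
        xs = [float(lo)]
    else:
        step = (hi - lo) / (dots - 1)
        xs = [lo + i * step for i in range(dots)]
    # The callable is applied point by point; it is never modified.
    ys = [func(x) for x in xs]
    return xs, ys


xs, ys = sample_trendline(lambda x: 2 * x + 1, x_range=(0, 4), dots=5)
print(xs)  # [0.0, 1.0, 2.0, 3.0, 4.0]
print(ys)  # [1.0, 3.0, 5.0, 7.0, 9.0]
```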

Warns:

UserWarning: If the input data is empty; an empty plot with a warning message is returned in this case. Also raised if the number of unique objects to visualize (categories + trendline) exceeds the number of available colors in the chosen palette.

Notes

This function uses matplotlib.axes.Axes.scatter under the hood. For complete parameter documentation and advanced customization options, see: matplotlib scatter.
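For reference, the closed-form OLS fit behind a 'linear' trendline can be written in a few lines of plain Python. This is a textbook sketch with a hypothetical helper name (fit_linear); the library may well use a numerical routine instead.

```python
def fit_linear(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


data = [1, 2, 3, 4, 5, 6, 7]
target = [2.5, 3.2, 4.8, 5.1, 7.5, 9.0, 10.5]
slope, intercept = fit_linear(data, target)
print(round(slope, 3), round(intercept, 3))
```

The fitted line can then be drawn by evaluating slope * x + intercept over the chosen x_range.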

Examples

>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.scatterplot import scatterplot
>>>
>>>
>>> data = [1, 2, 3, 4, 5, 6, 7]
>>> target = [2.5, 3.2, 4.8, 5.1, 7.5, 9.0, 10.5]
>>> category = ['A', 'A', 'B', 'B', 'A', 'B', 'A']
>>>
>>> # Basic scatterplot with linear trendline
>>> plot = scatterplot(
...     data,
...     target,
...     trendline='linear',
...     show_legend=False,
...     title='Basic Scatterplot'
... )
>>> plot.figure.show()

>>> # Scatterplot with categories and saving the plot
>>> plot = scatterplot(
...     data,
...     target,
...     category=category,
...     show_legend=True,
...     title='Scatterplot with Categories',
...     directory='plots',
...     figsize=(8, 5)
... )
>>> # The figure is saved to the 'plots' directory with filename 'scatterplot.png'

>>> # Passing additional Matplotlib options via plot_kws and a polynomial trendline
>>> plot = scatterplot(
...     data,
...     target,
...     trendline="polynomial",
...     plot_kws={'s': 100, 'marker': 'o', 'c': 'orange', 'edgecolor': 'black'},
...     title='Scatterplot with Polynomial Trendline'
... )
>>> plot.figure.show()
>>> plt.close(plot.figure)

explorica.visualizations.statistical_plots

High-level plotting utilities for Explorica visualizations.

This module defines high-level functions for exploring distributions and relationships between numeric variables. Each function is independent and returns a VisualizationResult dataclass encapsulating the figure, axes, engine, and metadata.

Functions

distplot(data, bins=30, kde=True, **kwargs): Plots a histogram of numeric data with optional Kernel Density Estimation (KDE).
boxplot(data, **kwargs): Draws a boxplot for a numeric variable to visualize distribution, median, and potential outliers.
hexbin(data, target, **kwargs): Creates a hexbin plot for two numeric variables. Useful for visualizing dense scatter data.
heatmap(data, **kwargs): Generates a heatmap to visualize a 2D array of numeric values.

Notes

All plotting functions return an explorica.types.VisualizationResult, which provides a consistent interface for accessing the figure, axes (if applicable), plotting engine, and additional metadata.
plot_kws allows passing keyword arguments directly to the underlying plotting function used by the engine (Matplotlib, Seaborn, or Plotly). This provides fine-grained control over styling and behavior specific to that function.

Examples

>>> from pathlib import Path
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.statistical_plots import distplot
>>>
>>>
>>> # Distribution plot with custom bins, KDE, and figure size
>>> data = np.random.normal(loc=0, scale=1, size=100)
>>> result = distplot(
...     data,
...     bins=30,
...     kde=True,
...     title="Normal Distribution Example",
...     figsize = (8, 5),
... )
>>> result.figure.show()

>>> # Boxplot with figure saving
>>> from explorica.visualizations.statistical_plots import boxplot
>>> result = boxplot(
...     data,
...     title="Boxplot Example",
...     directory="./plots",
...     figsize = (6, 4)
... )
>>> Path("./plots/boxplot.png").exists()
True

>>> # Hexbin plot with a colormap and a minimum bin count via plot_kws
>>> from explorica.visualizations.statistical_plots import hexbin
>>> x = np.random.randn(500)
>>> y = x*0.5 + np.random.randn(500)*0.5
>>> result = hexbin(
...     x, y,
...     gridsize=25,
...     colormap="plasma",
...     title="Hexbin Example",
...     figsize = (7, 6),
...     plot_kws={"mincnt": 1}
... )
>>> result.figure.show()

>>> # Heatmap with annotations, custom colormap, and figure size
>>> from explorica.visualizations.statistical_plots import heatmap
>>> matrix = np.random.rand(5, 5)
>>> result = heatmap(
...     matrix,
...     annot=True,
...     cmap="coolwarm",
...     title="Annotated Heatmap",
...     figsize=(6, 5)
... )
>>> result.figure.show()
>>> plt.close(result.figure)

explorica.visualizations.statistical_plots.boxplot(data: Sequence[float] | Mapping[str, Sequence[float]], **kwargs) → VisualizationResult

Draw a boxplot for a numeric variable.

This function generates a standard boxplot to visualize the distribution, median, quartiles, and potential outliers of numeric data.

Under the hood, the function uses Matplotlib’s matplotlib.axes.Axes.boxplot method and applies Seaborn styles for aesthetic defaults. Additional keyword arguments can be passed directly to the underlying Matplotlib call via plot_kws. For complete parameter documentation and advanced customization options, see the references in the Notes below.

Parameters:

data : Sequence[float] | Mapping[str, Sequence[float]]

Numeric input data. Can be:

1D sequence of numbers
Dictionary with single key-value pair (value is numeric sequence)
pandas Series or single-column DataFrame

Returns:

VisualizationResult: A dataclass encapsulating the result of a visualization. See also explorica.types.VisualizationResult for full attribute details.

Other Parameters:

title (str, optional): Title of the chart.
xlabel (str, optional): Label for the X-axis.
ylabel (str, optional): Label for the Y-axis.
palette (str or list, optional): Color palette to use for the plot.
figsize (tuple, default=(10, 6)): Figure size (width, height) in inches.
plot_kws (dict, optional): Dictionary of keyword arguments passed directly to the underlying Matplotlib method (matplotlib.axes.Axes.boxplot). This allows overriding any default plotting behavior. If not provided, the function internally constructs a dictionary from its own relevant parameters. Keys provided in plot_kws take precedence over internally generated defaults. For complete parameter documentation and advanced customization options, see the references in the Notes below.
directory (str, optional): If provided, the plot will be saved to this directory.
nan_policy ({"drop", "raise"}, default="drop"): How to handle NaN values.
verbose (bool, default=False): If True, enables informational logging during plot generation.

Warns:

UserWarning: Raised if the input data is empty. An empty plot with a warning message will be returned in this case.

Notes

This function uses matplotlib.axes.Axes.boxplot under the hood. For complete parameter documentation and advanced customization options, see: matplotlib boxplot.
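To make the "potential outliers" in a boxplot concrete, the sketch below computes quartiles and flags points beyond 1.5 times the interquartile range, which matches Matplotlib's default whis=1.5. The helper name box_stats and the choice of the "inclusive" quantile method are assumptions for illustration, so quartile values may differ slightly from what the plot shows for other data.

```python
import statistics


def box_stats(values, whisker=1.5):
    """Quartiles and fence-based outliers for a 1D sample (illustrative)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - whisker * iqr, q3 + whisker * iqr
    # Points outside the fences are the ones a boxplot draws individually.
    outliers = [v for v in values if v < lower or v > upper]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


stats = box_stats([10, 12, 11, 14, 100, 13, 12, 9, 105])
print(stats)  # {'q1': 11.0, 'median': 12.0, 'q3': 14.0, 'outliers': [100, 105]}
```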

Examples

>>> from pathlib import Path
>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.statistical_plots import boxplot
>>> # Basic boxplot
>>> plot = boxplot([5, 7, 8, 5, 6, 9, 12], title="Simple Boxplot")
>>> plot.figure.show()

>>> # Saving the plot to disk
>>> plot = boxplot(
...     [4, 6, 5, 7, 8],
...     directory="./plots",
...     title="Saved Boxplot"
... )
>>> Path("./plots/boxplot.png").exists()
True

>>> # Detecting potential outliers
>>> data_with_outliers = [10, 12, 11, 14, 100, 13, 12, 9, 105]
>>> plot = boxplot(data_with_outliers, title="Boxplot with Outliers")
>>> plot.figure.show()
>>> plt.close(plot.figure)

explorica.visualizations.statistical_plots.distplot(data: Sequence[float] | Mapping[str, Sequence[float]], bins: int = 30, kde: bool = True, **kwargs) → VisualizationResult

Plot a histogram with optional kernel density estimate.

This function generates a histogram for univariate data with an optional kernel density estimate (KDE) curve overlay. It automatically handles data conversion, NaN values, and provides flexible styling options through integration with Seaborn’s plotting system.

Under the hood, the function uses Seaborn’s seaborn.histplot function and applies Seaborn styles for aesthetic defaults. Additional keyword arguments can be passed directly to the underlying Seaborn call via plot_kws. For complete parameter documentation and advanced customization options, see the references in the Notes below.

Parameters:

data : Sequence[float] | Mapping[str, Sequence[float]]

Numeric input data. Can be:

1D sequence of numbers
Dictionary with single key-value pair (value is numeric sequence)
pandas Series or single-column DataFrame

Must be one-dimensional.

bins : int, default=30

Number of bins in the histogram. Must be a positive integer.

kde : bool, default=True

If True, adds a kernel density estimate curve.

opacity : float, default=0.5

Transparency of the histogram bars (alpha value). Must be between 0 and 1.

title : str, optional

Plot title.

xlabel : str, optional

Label for the X-axis.

ylabel : str, optional

Label for the Y-axis.

figsize : tuple[float, float], default=(10, 6)

Figure size (width, height) in inches.

style : str, optional

Seaborn style context (e.g., "whitegrid", "darkgrid").

palette : str, optional

Seaborn color palette (e.g., "viridis", "husl").

plot_kws : dict, optional

Dictionary of keyword arguments passed directly to the underlying Seaborn function (seaborn.histplot). This allows overriding any default plotting behavior. If not provided, the function internally constructs a dictionary from its own relevant parameters. Keys provided in plot_kws take precedence over internally generated defaults. For complete parameter documentation and advanced customization options, see the references in the Notes below.

directory : str, optional

Directory in which to save the figure (e.g., "./plots"). If not provided, the plot is not saved.

nan_policy : str | Literal['drop', 'raise'], default='drop'

Policy for handling NaN values in input data:

'raise' : raise ValueError if any NaNs are present in data.
'drop' : drop rows (axis=0) containing NaNs before computation. This does not drop entire columns.

verbose : bool, default=False

If True, enables informational logging during plot generation.

Returns:

VisualizationResult: A dataclass encapsulating the result of a visualization. See also explorica.types.VisualizationResult for full attribute details.

Raises:

ValueError: If input data contains multiple columns (not 1-dimensional). If bins is not a positive integer.

Warns:

UserWarning: Raised if the input data is empty. An empty plot with a warning message will be returned in this case.

Notes

This function uses seaborn.histplot under the hood. For complete parameter documentation and advanced customization options, see: seaborn histplot.
Vectorization support is planned.
Empty data returns a placeholder plot with an informative message.
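As a simplified picture of what an integer bins value means, the sketch below counts values in equal-width bins spanning the data range and rejects non-positive bin counts, as the Raises section describes. The helper name histogram_counts is hypothetical, and this is not how seaborn.histplot is implemented internally.

```python
def histogram_counts(values, bins=30):
    """Count values in equal-width bins spanning [min, max] (illustrative)."""
    if not isinstance(bins, int) or isinstance(bins, bool) or bins < 1:
        raise ValueError("bins must be a positive integer")
    if not values:
        return []
    lo, hi = min(values), max(values)
    # Fall back to a unit width when all values are identical.
    width = (hi - lo) / bins if hi > lo else 1.0
    counts = [0] * bins
    for v in values:
        # The maximum lies on the last edge, so clamp it into the final bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


print(histogram_counts([10, 20, 20, 30, 40, 40, 50], bins=5))  # [1, 2, 1, 2, 1]
```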

Examples

>>> from pathlib import Path
>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.statistical_plots import distplot
>>> # Simple distribution plot with KDE
>>> data = [1, 2, 2, 3, 3, 3, 4]
>>> result = distplot(data, kde=True, title="Small Dataset Example")
>>> result.figure.show()

>>> # Distribution plot with figure saving
>>> data = [10, 20, 20, 30, 40, 40, 50]
>>> result = distplot(
...     data,
...     bins=5,
...     title="Saved Plot Example",
...     directory="./plots"
... )
>>> Path("./plots/distplot.png").exists()
True

>>> # Normal distribution with custom bins and figure size
>>> import numpy as np
>>> data = np.random.normal(loc=50, scale=15, size=200)
>>> result = distplot(
...     data,
...     bins=25,
...     kde=True,
...     title="Normal Distribution Example",
...     figsize=(8, 5),
...     plot_kws={"color": "skyblue"}
... )
>>> result.figure.show()
>>> plt.close(result.figure)

explorica.visualizations.statistical_plots.heatmap(data: Sequence[float] | Sequence[Sequence[float]] | Mapping[Any, Sequence[float]], **kwargs) → VisualizationResult

Draw a heatmap from the provided data using Matplotlib and Seaborn.

Create a heatmap visualization from numerical data, supporting 1D sequences, 2D sequences, and mappings of keys to sequences. The function handles NaN values according to nan_policy and provides options for figure size, annotations, and saving the plot to a directory.

Under the hood, the function uses seaborn.heatmap and applies Seaborn styles for aesthetic defaults. Additional keyword arguments can be passed directly to the underlying Seaborn call via plot_kws. For complete parameter documentation and advanced customization options, see the link in the Notes below.

Parameters:

data : sequence of floats, sequence of sequences of floats, or mapping

Input data for the heatmap. Can be:

1D sequence of numerical values (converted to a 1-row heatmap),
2D sequence (list of lists, NumPy array, etc.),
Mapping of keys to sequences (converted to a DataFrame).
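
The three accepted input shapes can be thought of as being normalized to a single 2D layout before plotting. The following standard-library sketch shows one way to do that. to_rows is a hypothetical helper, not Explorica's code. It handles only built-in lists and tuples (NumPy arrays are not covered) and it treats mapping keys as column labels, as a DataFrame built from a dict would.

```python
from collections.abc import Mapping, Sequence


def to_rows(data):
    """Hypothetical helper: return (column_labels, rows) for heatmap input."""
    if isinstance(data, Mapping):
        # Each key is a column, mirroring a DataFrame built from a dict.
        # Note: zip() silently truncates unequal columns; a real
        # implementation would more likely reject them.
        columns = list(data.keys())
        rows = [list(row) for row in zip(*data.values())]
        return columns, rows

    items = list(data)
    nested = any(
        isinstance(item, Sequence) and not isinstance(item, str) for item in items
    )
    if not nested:
        # 1D input becomes a heatmap with a single row.
        return None, [items]
    return None, [list(row) for row in items]


print(to_rows([1, 2, 3]))                     # (None, [[1, 2, 3]])
print(to_rows([[1, 2], [3, 4]]))              # (None, [[1, 2], [3, 4]])
print(to_rows({"a": [1, 2], "b": [3, 4]}))    # (['a', 'b'], [[1, 3], [2, 4]])
```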

Returns:

VisualizationResult: A dataclass encapsulating the result of a visualization. See also explorica.types.VisualizationResult for full attribute details.

Other Parameters:

annot (bool, default=True): Whether to annotate the heatmap cells with their numeric values.
cmap (str, optional): Color map for the heatmap.
title (str, optional): Plot title.
xlabel (str, optional): X-axis label.
ylabel (str, optional): Y-axis label.
figsize (tuple[float, float], default=(10, 6)): Figure size (width, height) in inches.
plot_kws (dict, optional): Dictionary of keyword arguments passed directly to the underlying Seaborn function (seaborn.heatmap). This allows overriding any default plotting behavior. If not provided, the function builds the dictionary from its own relevant parameters. Keys provided in plot_kws take precedence over internally generated defaults. For complete parameter documentation and advanced customization options, see the link in the Notes below.
directory (str or None, optional): Directory path to save the figure. If None, the figure is not saved.
nan_policy (str, default=“drop”): Policy for handling NaN values in input data. Supports ‘drop’ (removes rows with NaNs) or ‘raise’ (raises an error).
verbose (bool, default=False): Whether to print additional messages during plotting.

Raises:

ValueError: If NaN values are present and nan_policy=“raise”.

Warns:

UserWarning: Issued if the input data is empty. In this case, an empty plot with a warning message is returned.

Notes

This function uses seaborn.heatmap under the hood. For complete parameter documentation and advanced customization options, see seaborn heatmap.

Examples

>>> from pathlib import Path
>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.statistical_plots import heatmap
>>> # Simple usage
>>> plot = heatmap(
...     [[1, 2, 3, 4, 5],
...     [5, 4, 3, 2, 1],
...     [2, 4, 6, 8, 10]],
...     xlabel="X Variable",
...     ylabel="Y Variable",
...     cmap="viridis"
... )
>>> plot.figure.show()

>>> # Saving the plot to a directory
>>> plot = heatmap(
...     [[1, 2, 3, 4, 5],
...     [5, 4, 3, 2, 1],
...     [2, 4, 6, 8, 10]],
...     title="Heatmap Example",
...     directory="plots",
...     figsize=(8, 5)
... )
>>> # The figure is saved to the 'plots' directory with filename 'heatmap.png'

>>> # Passing additional Seaborn options via plot_kws
>>> plot = heatmap(
...     [[1, 2, 3, 4, 5],
...     [5, 4, 3, 2, 1],
...     [2, 4, 6, 8, 10]],
...     plot_kws={"cmap": "viridis", "cbar": True},
...     annot=False
... )
>>> plot.figure.show()
>>> plt.close(plot.figure)

explorica.visualizations.statistical_plots.hexbin(data: Sequence[Number], target: Sequence[Number], **kwargs) → VisualizationResult

Create a hexbin plot for two numeric variables to visualize point density.

A hexbin plot is a bivariate histogram that uses hexagonal bins to display the density of points in a 2D space. It is particularly useful for large datasets where scatter plots become overcrowded.

Under the hood, the function uses matplotlib.axes.Axes.hexbin and applies Seaborn styles for aesthetic defaults. Additional keyword arguments can be passed directly to the underlying Matplotlib call via plot_kws. For complete parameter documentation and advanced customization options, see the link in the Notes below.

Parameters:

data (Sequence[Number]): First numeric variable (plotted on the x-axis).
target (Sequence[Number]): Second numeric variable (plotted on the y-axis).

Returns:

VisualizationResult: A dataclass encapsulating the result of a visualization. See also explorica.types.VisualizationResult for full attribute details.

Other Parameters:

cmap (str or matplotlib.colors.Colormap, optional): Colormap used to color hexagons by count density. Defaults to None.
opacity (float, default=1): Opacity of the hexagons (0 = fully transparent, 1 = fully opaque).
gridsize (int, default=30): The number of hexagons in the x-direction. Larger values produce smaller hexagons and higher resolution.
title (str, optional): Title of the chart.
xlabel (str, optional): Label for the x-axis.
ylabel (str, optional): Label for the y-axis.
style (str, default=None): Seaborn style context (e.g., “whitegrid”, “darkgrid”).
figsize (tuple, default=(10, 6)): Figure size (width, height) in inches.
plot_kws (dict, optional): Dictionary of keyword arguments passed directly to the underlying Matplotlib function (matplotlib.axes.Axes.hexbin). This allows overriding any default plotting behavior. If not provided, the function builds the dictionary from its own relevant parameters. Keys provided in plot_kws take precedence over internally generated defaults. For complete parameter documentation and advanced customization options, see the link in the Notes below.
directory (str, optional): If provided, the plot will be saved to this directory.
nan_policy ({“drop”, “raise”}, default=“drop”): How to handle NaN values in the data.
verbose (bool, default=False): If True, enables informational logging during plot generation.

Raises:

ValueError: If data and target have different lengths. If NaN values are present and nan_policy=“raise”. If gridsize is not a positive integer.
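
The length and gridsize checks listed above could look roughly like the sketch below. validate_hexbin_inputs is a hypothetical helper, the error messages are invented for illustration, and the NaN check is omitted for brevity. The actual validation in Explorica may differ.

```python
def validate_hexbin_inputs(data, target, gridsize=30):
    """Hypothetical pre-flight checks mirroring the documented ValueError cases."""
    if len(data) != len(target):
        raise ValueError(
            f"data and target must have the same length "
            f"(got {len(data)} and {len(target)})."
        )
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(gridsize, bool) or not isinstance(gridsize, int) or gridsize <= 0:
        raise ValueError(f"gridsize must be a positive integer, got {gridsize!r}.")


validate_hexbin_inputs([1, 2, 3], [2, 4, 6], gridsize=20)  # passes silently

try:
    validate_hexbin_inputs([1, 2, 3], [2, 4], gridsize=20)
except ValueError as exc:
    print(exc)
```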

Warns:

UserWarning: Issued if the input data is empty. In this case, an empty plot with a warning message is returned.

Notes

This function uses matplotlib.axes.Axes.hexbin under the hood.
For complete parameter documentation and advanced customization options, see: matplotlib hexbin.
Hexbin plots are most effective with large datasets (>1000 points).
The color intensity represents the count of points in each hexagon.
For very sparse data, consider using a scatter plot instead.

Examples

>>> from pathlib import Path
>>> import matplotlib.pyplot as plt
>>> from explorica.visualizations.statistical_plots import hexbin
>>> # Simple usage
>>> plot = hexbin(
...     [1, 2, 3, 4, 5],
...     [2, 4, 6, 8, 10],
...     xlabel="X Variable",
...     ylabel="Y Variable",
...     gridsize=20
... )
>>> plot.figure.show()

>>> # Saving the plot to a directory
>>> plot = hexbin(
...     [1, 2, 3, 4, 5],
...     [2, 4, 6, 8, 10],
...     title="Hexbin Example",
...     directory="plots",
...     figsize=(8, 5)
... )
>>> # The figure is saved to the 'plots' directory with filename 'hexbin.png'

>>> # Passing additional Matplotlib options via plot_kws
>>> plot = hexbin(
...     [1, 2, 3, 4, 5],
...     [2, 4, 6, 8, 10],
...     plot_kws={"cmap": "viridis", "mincnt": 1},
...     gridsize=25
... )
>>> plot.figure.show()
>>> plt.close(plot.figure)