edaflow.visualize_heatmap

edaflow.visualize_heatmap(df: DataFrame, heatmap_type: str = 'correlation', columns: str | List[str] | None = None, title: str | None = None, figsize: tuple | None = None, cmap: str = 'RdYlBu_r', annot: bool = True, fmt: str = '.2f', square: bool = True, linewidths: float = 0.5, cbar_kws: dict | None = None, method: str = 'pearson', missing_threshold: float = 5.0, verbose: bool = True) → None

Create comprehensive heatmap visualizations for exploratory data analysis.

This function provides multiple types of heatmaps for different EDA purposes:

- Correlation heatmaps for numerical relationships
- Missing data pattern heatmaps
- Numerical data value heatmaps
- Cross-tabulation heatmaps for categorical relationships

Parameters:
  • df (pd.DataFrame) – The input DataFrame

  • heatmap_type (str, optional) – Type of heatmap to create. Defaults to “correlation”. Options:

      - “correlation”: Correlation matrix heatmap (default)
      - “missing”: Missing data pattern heatmap
      - “values”: Raw data values heatmap (for small datasets)
      - “crosstab”: Cross-tabulation heatmap for categorical data

  • columns (Optional[Union[str, List[str]]], optional) – Column name(s) to include. If None, uses appropriate columns based on heatmap_type. Defaults to None.

  • title (Optional[str], optional) – Custom title for the heatmap. If None, auto-generated. Defaults to None.

  • figsize (Optional[tuple], optional) – Figure size (width, height). If None, auto-calculated. Defaults to None.

  • cmap (str, optional) – Colormap for the heatmap. Defaults to “RdYlBu_r”.

  • annot (bool, optional) – Whether to annotate cells with values. Defaults to True.

  • fmt (str, optional) – String formatting code for annotations. Defaults to “.2f”.

  • square (bool, optional) – Whether to make cells square-shaped. Defaults to True.

  • linewidths (float, optional) – Width of lines separating cells. Defaults to 0.5.

  • cbar_kws (Optional[dict], optional) – Keyword arguments for colorbar. Defaults to None.

  • method (str, optional) – Correlation method for correlation heatmaps. Options: “pearson”, “kendall”, “spearman”. Defaults to “pearson”.

  • missing_threshold (float, optional) – Threshold for missing data highlighting (%). Only used for missing data heatmaps. Defaults to 5.0.

  • verbose (bool, optional) – If True, displays detailed information about the heatmap creation process. Defaults to True.
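As a rough illustration of what the heatmap_type, method, and missing_threshold parameters refer to, the sketch below builds the kind of table each heatmap type would plausibly be drawn from, using plain pandas. It is not edaflow's implementation: the `region` column, the variable names, and the "strictly greater than the threshold" comparison are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Sample data with one missing value in "age" (illustrative only).
df = pd.DataFrame({
    "age": [25, 30, np.nan, 35, 32, 29, 31, 33],
    "income": [50000, 55000, 48000, 62000, 51000, 45000, 53000, 49000],
    "score": [85, 90, 78, 92, 88, 95, 81, 87],
    "category": ["A", "B", "A", "C", "B", "A", "C", "B"],
    "region": ["N", "S", "N", "S", "N", "S", "N", "S"],  # hypothetical column
})

# "correlation": pairwise correlations between the numeric columns.
# The method argument mirrors the documented options (pearson, kendall, spearman).
corr = df.select_dtypes(include="number").corr(method="pearson")

# "missing": percentage of missing values per column, compared with a
# threshold in percent (assumed here to mean "strictly greater than").
missing_threshold = 5.0
missing_pct = df.isna().mean() * 100
above_threshold = missing_pct[missing_pct > missing_threshold]

# "crosstab": frequency table of two categorical columns.
table = pd.crosstab(df["category"], df["region"])

print(corr.round(2))
print(above_threshold)
print(table)
```

Inspecting these tables before plotting can help when choosing columns, since a correlation heatmap needs numeric columns and a cross-tabulation needs categorical ones.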

Returns:

Nothing. The function displays the heatmap visualization as a side effect.

Return type:

None

Raises:
  • ValueError – If heatmap_type is not supported or no suitable data is found.

  • KeyError – If specified column(s) don’t exist in the DataFrame.
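The two documented exceptions can be anticipated with a small pre-check before plotting. The helper below is a hypothetical sketch, not part of edaflow: the name `check_heatmap_args` and its messages are invented for illustration, and it only mirrors the conditions listed above (unsupported heatmap_type, unknown column names).

```python
import pandas as pd

# Heatmap types listed in the parameter documentation.
SUPPORTED_TYPES = {"correlation", "missing", "values", "crosstab"}


def check_heatmap_args(df, heatmap_type="correlation", columns=None):
    """Hypothetical pre-check mirroring the documented error conditions.

    Returns the list of column names that would be used.
    """
    if heatmap_type not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported heatmap_type: {heatmap_type!r}")
    if columns is None:
        return list(df.columns)
    # Accept a single column name or a list of names, as the signature does.
    names = [columns] if isinstance(columns, str) else list(columns)
    unknown = [name for name in names if name not in df.columns]
    if unknown:
        raise KeyError(f"Column(s) not found in DataFrame: {unknown}")
    return names


sample = pd.DataFrame({"age": [25, 30], "income": [50000, 55000]})

for kwargs in ({"heatmap_type": "bogus"}, {"columns": ["age", "height"]}):
    try:
        check_heatmap_args(sample, **kwargs)
    except (ValueError, KeyError) as exc:
        print(type(exc).__name__, exc)
```

In real use, the same `try`/`except (ValueError, KeyError)` pattern could wrap the call to `edaflow.visualize_heatmap` itself.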

Example

>>> import pandas as pd
>>> import edaflow
>>>
>>> # Create sample data
>>> df = pd.DataFrame({
...     'age': [25, 30, 28, 35, 32, 29, 31, 33],
...     'income': [50000, 55000, 48000, 62000, 51000, 45000, 53000, 49000],
...     'score': [85, 90, 78, 92, 88, 95, 81, 87],
...     'category': ['A', 'B', 'A', 'C', 'B', 'A', 'C', 'B']
... })
>>>
>>> # Correlation heatmap (default)
>>> edaflow.visualize_heatmap(df)
>>>
>>> # Missing data pattern heatmap
>>> edaflow.visualize_heatmap(df, heatmap_type="missing")
>>>
>>> # Custom styling
>>> edaflow.visualize_heatmap(
...     df,
...     heatmap_type="correlation",
...     method="spearman",
...     cmap="viridis",
...     title="Spearman Correlation Analysis"
... )