edaflow.visualize_numerical_boxplots
- edaflow.visualize_numerical_boxplots(df: DataFrame, columns: List[str] | None = None, figsize: tuple | None = None, rows: int | None = None, cols: int | None = None, title: str = 'Boxplots for Numerical Columns', show_skewness: bool = True, orientation: str = 'horizontal', color_palette: str = 'Set2') -> None
Create boxplots for numerical columns to visualize distributions and outliers.
This function automatically detects numerical columns and creates a grid of boxplots to help identify outliers, skewness, and distribution characteristics. Each boxplot can optionally display the skewness value in the title.
- Parameters:
df (pd.DataFrame) – The input DataFrame to analyze
columns (Optional[List[str]], optional) – Specific columns to plot. If None, all numerical columns are used. Defaults to None.
figsize (Optional[tuple], optional) – Figure size (width, height). If None, automatically calculated based on subplot grid. Defaults to None.
rows (Optional[int], optional) – Number of rows in subplot grid. If None, automatically calculated. Defaults to None.
cols (Optional[int], optional) – Number of columns in subplot grid. If None, automatically calculated. Defaults to None.
title (str, optional) – Main title for the entire plot. Defaults to “Boxplots for Numerical Columns”.
show_skewness (bool, optional) – Whether to show skewness values in subplot titles. Defaults to True.
orientation (str, optional) – Boxplot orientation. Either ‘horizontal’ or ‘vertical’. Defaults to ‘horizontal’.
color_palette (str, optional) – Seaborn color palette to use. Defaults to ‘Set2’.
- Returns:
Nothing is returned; the boxplot figure is displayed as a side effect.
- Return type:
None
- Raises:
ValueError – If orientation is not ‘horizontal’ or ‘vertical’
ValueError – If no numerical columns are found
Example
>>> import pandas as pd
>>> import edaflow
>>> df = pd.DataFrame({
...     'age': [25, 30, 35, 40, 100, 28, 32],  # 100 is outlier
...     'salary': [50000, 60000, 75000, 80000, 200000, 55000, 65000],  # 200000 is outlier
...     'experience': [2, 5, 8, 12, 25, 3, 6],
...     'category': ['A', 'B', 'A', 'C', 'B', 'A', 'C']
... })
>>>
>>> # Basic boxplot visualization
>>> edaflow.visualize_numerical_boxplots(df)
>>>
>>> # Custom layout and styling
>>> edaflow.visualize_numerical_boxplots(df,
...                                      rows=2, cols=2,
...                                      title="Custom Boxplots",
...                                      orientation='vertical',
...                                      color_palette='viridis')
>>>
>>> # Specific columns only
>>> edaflow.visualize_numerical_boxplots(df, columns=['age', 'salary'])
>>>
>>> # Alternative import style:
>>> from edaflow.analysis import visualize_numerical_boxplots
>>> visualize_numerical_boxplots(df, show_skewness=False)
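The outlier comments in the example can be checked numerically. The sketch below does not call edaflow; it applies the conventional 1.5 × IQR whisker rule, which is the matplotlib and seaborn default. Whether edaflow keeps that default is an assumption, and `iqr_outliers` is a hypothetical helper written for illustration only.

```python
import pandas as pd

# Same numerical data as in the example above.
df = pd.DataFrame({
    "age": [25, 30, 35, 40, 100, 28, 32],
    "salary": [50000, 60000, 75000, 80000, 200000, 55000, 65000],
    "experience": [2, 5, 8, 12, 25, 3, 6],
})


def iqr_outliers(series: pd.Series, whis: float = 1.5) -> pd.Series:
    """Return the values lying more than whis * IQR outside the quartiles.

    Hypothetical helper: assumes the default whisker length of 1.5 * IQR.
    """
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    return series[(series < lower) | (series > upper)]


# Collect the flagged values per column.
outliers = {col: iqr_outliers(df[col]).tolist() for col in df.columns}
print(outliers)
```

Under this rule the flagged values are 100 for age and 200000 for salary, matching the comments in the example, and additionally 25 for experience.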
Notes
Automatically identifies numerical columns (int64, float64, etc.)
Skips columns with all missing values
Outliers are clearly visible as points beyond the whiskers
Skewness interpretation:
* |skewness| < 0.5: Approximately symmetric
* 0.5 ≤ |skewness| < 1: Moderately skewed
* |skewness| ≥ 1: Highly skewed
Uses seaborn styling for better visual appearance
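The column selection and skewness thresholds described in these notes can be sketched with plain pandas. This is an illustration under stated assumptions, not edaflow's implementation: it assumes numerical columns are picked by dtype, that all-missing columns are dropped before plotting, and that skewness is computed with `Series.skew` (the library may use a different estimator). `classify_skewness` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def classify_skewness(value: float) -> str:
    """Map a skewness value to the interpretation bands listed in the notes."""
    magnitude = abs(value)
    if magnitude < 0.5:
        return "approximately symmetric"
    if magnitude < 1:
        return "moderately skewed"
    return "highly skewed"


df = pd.DataFrame({
    "age": [25, 30, 35, 40, 100, 28, 32],
    "empty": [np.nan] * 7,          # numerical dtype, but every value is missing
    "category": list("ABACBAC"),    # non-numerical, ignored
})

# Assumption: numerical columns are detected by dtype (int64, float64, etc.).
numeric = df.select_dtypes(include="number")

# Assumption: columns whose values are all missing are skipped.
numeric = numeric.dropna(axis=1, how="all")

# Skewness per remaining column, with its interpretation band.
report = {
    col: (numeric[col].skew(), classify_skewness(numeric[col].skew()))
    for col in numeric.columns
}
for col, (skew, label) in report.items():
    print(f"{col}: skewness={skew:.2f} ({label})")
```

Only `age` survives the selection here, and the single large value of 100 places it in the highly skewed band.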