edaflow.visualize_numerical_boxplots

edaflow.visualize_numerical_boxplots(df: DataFrame, columns: List[str] | None = None, figsize: tuple | None = None, rows: int | None = None, cols: int | None = None, title: str = 'Boxplots for Numerical Columns', show_skewness: bool = True, orientation: str = 'horizontal', color_palette: str = 'Set2') → None

Create boxplots for numerical columns to visualize distributions and outliers.

Unless specific columns are given, this function detects the numerical columns automatically and draws one boxplot per column in a subplot grid, which helps identify outliers, skewness, and other distribution characteristics. When show_skewness is True, each subplot title also shows the column's skewness value.
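
The automatic detection described above can be approximated with plain pandas. This is only a sketch of the idea, not edaflow's actual implementation: the helper name select_plottable_numeric_columns is hypothetical, and it assumes that "numerical" means any NumPy numeric dtype and that columns with no non-missing values are skipped, as stated in the Notes.

```python
from typing import List

import numpy as np
import pandas as pd


def select_plottable_numeric_columns(df: pd.DataFrame) -> List[str]:
    """Hypothetical helper: numeric columns with at least one non-missing value."""
    # Keep only columns with a numeric dtype (int64, float64, ...)
    numeric = df.select_dtypes(include="number")
    # Skip columns in which every value is missing
    return [col for col in numeric.columns if numeric[col].notna().any()]


df = pd.DataFrame({
    "age": [25, 30, 35],
    "score": [np.nan, np.nan, np.nan],  # numeric dtype, but all values missing
    "category": ["A", "B", "A"],        # non-numeric, never selected
})
print(select_plottable_numeric_columns(df))
```

Here only age qualifies: score is numeric but entirely missing, and category is not numeric.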

Parameters:
  • df (pd.DataFrame) – The input DataFrame to analyze

  • columns (Optional[List[str]], optional) – Specific columns to plot. If None, all numerical columns are used. Defaults to None.

  • figsize (Optional[tuple], optional) – Figure size (width, height). If None, automatically calculated based on subplot grid. Defaults to None.

  • rows (Optional[int], optional) – Number of rows in subplot grid. If None, automatically calculated. Defaults to None.

  • cols (Optional[int], optional) – Number of columns in subplot grid. If None, automatically calculated. Defaults to None.

  • title (str, optional) – Main title for the entire plot. Defaults to “Boxplots for Numerical Columns”.

  • show_skewness (bool, optional) – Whether to show skewness values in subplot titles. Defaults to True.

  • orientation (str, optional) – Boxplot orientation. Either ‘horizontal’ or ‘vertical’. Defaults to ‘horizontal’.

  • color_palette (str, optional) – Seaborn color palette to use. Defaults to ‘Set2’.

Returns:

Nothing is returned; the function displays the boxplot visualization.

Return type:

None

Raises:
  • ValueError – If orientation is not ‘horizontal’ or ‘vertical’

  • ValueError – If no numerical columns are found

Example

>>> import pandas as pd
>>> import edaflow
>>> df = pd.DataFrame({
...     'age': [25, 30, 35, 40, 100, 28, 32],  # 100 is an outlier
...     'salary': [50000, 60000, 75000, 80000, 200000, 55000, 65000],  # 200000 is an outlier
...     'experience': [2, 5, 8, 12, 25, 3, 6],
...     'category': ['A', 'B', 'A', 'C', 'B', 'A', 'C']
... })
>>>
>>> # Basic boxplot visualization
>>> edaflow.visualize_numerical_boxplots(df)
>>>
>>> # Custom layout and styling
>>> edaflow.visualize_numerical_boxplots(df,
...                                     rows=2, cols=2,
...                                     title="Custom Boxplots",
...                                     orientation='vertical',
...                                     color_palette='viridis')
>>>
>>> # Specific columns only
>>> edaflow.visualize_numerical_boxplots(df, columns=['age', 'salary'])
>>>
>>> # Alternative import style:
>>> from edaflow.analysis import visualize_numerical_boxplots
>>> visualize_numerical_boxplots(df, show_skewness=False)
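
The outlier comments in the example data can be checked numerically. The sketch below applies the common 1.5 × IQR rule, which is the default whisker length in matplotlib and seaborn boxplots. Whether edaflow overrides that default is not stated here, so treat the rule as an assumption; the helper name iqr_outliers is hypothetical.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, whis: float = 1.5) -> pd.Series:
    """Hypothetical helper: values lying beyond whis * IQR from the quartiles."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - whis * iqr, q3 + whis * iqr
    # Points outside [lower, upper] are the ones a boxplot draws individually
    return values[(values < lower) | (values > upper)]


df = pd.DataFrame({
    "age": [25, 30, 35, 40, 100, 28, 32],
    "salary": [50000, 60000, 75000, 80000, 200000, 55000, 65000],
    "experience": [2, 5, 8, 12, 25, 3, 6],
})

for col in df.columns:
    print(col, iqr_outliers(df[col]).tolist())
```

Under this rule, 100 in age and 200000 in salary are flagged, and 25 in experience also falls beyond its upper fence.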

Notes

  • Automatically identifies numerical columns (int64, float64, etc.)

  • Skips columns with all missing values

  • Outliers appear as individual points beyond the whiskers

  • Skewness interpretation:
      • |skewness| < 0.5: Approximately symmetric
      • 0.5 ≤ |skewness| < 1: Moderately skewed
      • |skewness| ≥ 1: Highly skewed

  • Uses seaborn styling for better visual appearance
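
The skewness thresholds listed above translate directly into a small lookup. This is a sketch for reading the values shown in the subplot titles: the function name interpret_skewness is hypothetical, and computing skewness with pandas' Series.skew() is an assumption about how the displayed value is obtained.

```python
import pandas as pd


def interpret_skewness(skewness: float) -> str:
    """Hypothetical helper: map a skewness value to the categories in the Notes."""
    magnitude = abs(skewness)
    if magnitude < 0.5:
        return "approximately symmetric"
    if magnitude < 1:
        return "moderately skewed"
    return "highly skewed"


ages = pd.Series([25, 30, 35, 40, 100, 28, 32])
skew = ages.skew()  # sample skewness as computed by pandas
print(f"age: skewness={skew:.2f} ({interpret_skewness(skew)})")
```

The sign of the value is ignored for classification; a positive value indicates a longer right tail and a negative value a longer left tail.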