edaflow.visualize_categorical_values
- edaflow.visualize_categorical_values(df: DataFrame, max_unique_values: int | None = 20, show_counts: bool = True, show_percentages: bool = True) → None
Visualize unique values in categorical (object-type) columns with counts and percentages.
This function provides a comprehensive overview of categorical columns by displaying:
- Unique values in each categorical column
- Value counts (frequency of each unique value)
- Percentages (relative frequency)
- Summary statistics for each column
- Parameters:
df (pd.DataFrame) – The input DataFrame to analyze
max_unique_values (Optional[int], optional) – Maximum number of unique values to display per column. If a column has more unique values than this, only the max_unique_values most frequent ones are shown. Defaults to 20.
show_counts (bool, optional) – Whether to show the count of each unique value. Defaults to True.
show_percentages (bool, optional) – Whether to show the percentage of each unique value. Defaults to True.
- Returns:
Nothing is returned; the formatted results are printed directly to the console.
- Return type:
None
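The function prints its results and returns nothing, so the sketch below shows how comparable numbers could be computed directly with pandas when they are needed as data. It is not edaflow's implementation: the helper name `summarize_categorical` and its return shape are hypothetical, and the treatment of missing values (dropped before percentages are computed, which is the `value_counts` default) is an assumption.

```python
import pandas as pd


def summarize_categorical(df, max_unique_values=20):
    """Hypothetical helper: counts and percentages for each object-dtype column."""
    summaries = {}
    for col in df.select_dtypes(include="object").columns:
        # value_counts() sorts most frequent first and drops NaN by default.
        counts = df[col].value_counts()
        table = pd.DataFrame({
            "count": counts,
            "percent": (counts / counts.sum() * 100).round(1),
        })
        # Keep only the most frequent values when a limit is given.
        if max_unique_values is not None:
            table = table.head(max_unique_values)
        summaries[col] = table
    return summaries


df = pd.DataFrame({
    # dtype=object is set explicitly so the example does not depend on
    # how the installed pandas version infers string columns.
    "category": pd.Series(["A", "B", "A", "C", "B", "A"], dtype=object),
    "score": [85, 92, 78, 88, 95, 82],
})

result = summarize_categorical(df, max_unique_values=2)
print(result["category"])
```

Here `score` is skipped because it is numeric, and `category` is cut to its two most frequent values, mirroring what `max_unique_values=2` is described as doing.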
Example
>>> import pandas as pd
>>> import edaflow
>>> df = pd.DataFrame({
...     'category': ['A', 'B', 'A', 'C', 'B', 'A'],
...     'status': ['active', 'inactive', 'active', 'pending', 'active', 'active'],
...     'region': ['North', 'South', 'North', 'East', 'West', 'North'],
...     'score': [85, 92, 78, 88, 95, 82]
... })
>>>
>>> # Basic visualization
>>> edaflow.visualize_categorical_values(df)
>>>
>>> # Show only top 10 values per column, without percentages
>>> edaflow.visualize_categorical_values(df, max_unique_values=10, show_percentages=False)
>>>
>>> # Alternative import style:
>>> from edaflow.analysis import visualize_categorical_values
>>> visualize_categorical_values(df, max_unique_values=15)
Notes
- Only analyzes columns with object dtype (categorical/string columns).
- Columns with many unique values are truncated to show the most frequent ones.
- Provides summary statistics, including total unique values and the most common value.
- Uses color coding to highlight column names and important information.
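Because only object-dtype columns are analyzed, it can help to check a DataFrame's dtypes before calling the function. The sketch below uses plain pandas and assumes that the selection behaves like `select_dtypes(include="object")`. That is an assumption about edaflow's internals, not something this page confirms. The column names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "region": pd.Series(["North", "South", "North"], dtype=object),
    "grade": pd.Categorical(["a", "b", "a"]),  # pandas 'category' dtype, not object
    "score": [85, 92, 78],
})

# Columns an object-dtype filter would pick up.
object_cols = df.select_dtypes(include="object").columns.tolist()
print(object_cols)

# Cast the categorical column to object so the same filter would include it.
df["grade"] = df["grade"].astype(object)
converted_cols = df.select_dtypes(include="object").columns.tolist()
print(converted_cols)
```

Under that assumption, a column stored with the pandas `category` dtype would be skipped until it is cast to `object`, as in the second step.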