edaflow.analyze_categorical_columns
- edaflow.analyze_categorical_columns(df: DataFrame, threshold: float | None = 35) → None
Analyze categorical columns of object type to identify potential data issues.
This function examines object-type columns to detect:
1. Columns that might be numeric but stored as strings
2. Categorical columns with their unique values
3. Data type consistency issues
- Parameters:
df (pd.DataFrame) – The input DataFrame to analyze
threshold (Optional[float], optional) – The threshold percentage for non-numeric values. If a column has less than this percentage of non-numeric values, it’s flagged as potentially numeric. Defaults to 35.
- Returns:
Prints analysis results directly to console with rich color coding
- Return type:
None
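The threshold rule described under Parameters can be approximated with plain pandas. The sketch below is not edaflow's implementation: the helper names `non_numeric_percentage` and `flag_potentially_numeric` are hypothetical, and it assumes that "non-numeric" means a non-null value that `pd.to_numeric` cannot parse. It returns a dictionary instead of printing, so the result is easy to inspect.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def non_numeric_percentage(series: pd.Series) -> float:
    """Percentage of non-null values that cannot be parsed as numbers."""
    values = series.dropna()
    if values.empty:
        return 0.0
    # Values that fail to parse become NaN when errors="coerce" is used.
    parsed = pd.to_numeric(values, errors="coerce")
    return float(parsed.isna().mean() * 100)


def flag_potentially_numeric(df: pd.DataFrame, threshold: float = 35) -> dict[str, bool]:
    """Map each text-like column to True if it looks numeric under the threshold."""
    flags = {}
    for col in df.columns:
        # Assumption: only object/string columns are candidates; numeric columns are skipped.
        if is_object_dtype(df[col]) or is_string_dtype(df[col]):
            flags[col] = non_numeric_percentage(df[col]) < threshold
    return flags


df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age_str": ["25", "30", "35"],
    "mixed": ["1", "2", "three"],
    "numbers": [1, 2, 3],
})
flags = flag_potentially_numeric(df, threshold=35)
print(flags)
```

Under these assumptions, `age_str` (0% non-numeric) and `mixed` (one of three values, about 33%) fall below the default threshold of 35 and are flagged, while `name` (100% non-numeric) is not. The `numbers` column is already numeric and is skipped.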
Example
>>> import pandas as pd
>>> import edaflow
>>> df = pd.DataFrame({
...     'name': ['Alice', 'Bob', 'Charlie'],
...     'age_str': ['25', '30', '35'],
...     'mixed': ['1', '2', 'three'],
...     'numbers': [1, 2, 3]
... })
>>> edaflow.analyze_categorical_columns(df, threshold=35)
# Output with rich color coding and tables

# Alternative import style:
>>> from edaflow.analysis import analyze_categorical_columns