edaflow.analyze_categorical_columns

edaflow.analyze_categorical_columns(df: DataFrame, threshold: float | None = 35) → None

Analyze categorical columns of object type to identify potential data issues.

This function examines object-type columns to detect:

  1. Columns that might be numeric but stored as strings
  2. Categorical columns with their unique values
  3. Data type consistency issues

Parameters:
  • df (pd.DataFrame) – The input DataFrame to analyze

  • threshold (Optional[float], optional) – The threshold percentage of non-numeric values. If fewer than this percentage of a column's values are non-numeric, the column is flagged as potentially numeric. Defaults to 35.
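
To make the threshold rule concrete, the following is a minimal sketch of how such a check could be computed with plain pandas. It is an illustration under stated assumptions, not edaflow's actual implementation: the helper names `non_numeric_percentage` and `flag_potentially_numeric` are hypothetical, and the sketch assumes that "non-numeric" means a value that `pd.to_numeric` cannot parse and that null values are ignored.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def non_numeric_percentage(series: pd.Series) -> float:
    """Percentage of non-null values that cannot be parsed as numbers (hypothetical helper)."""
    values = series.dropna()
    if values.empty:
        return 0.0
    # Unparseable values become NaN when errors="coerce" is used.
    converted = pd.to_numeric(values, errors="coerce")
    return float(converted.isna().mean() * 100)


def flag_potentially_numeric(df: pd.DataFrame, threshold: float = 35.0) -> list[str]:
    """Return string-like columns whose non-numeric share is below the threshold (hypothetical helper)."""
    flagged = []
    for col in df.columns:
        column = df[col]
        # Only string-like columns are considered; numeric columns are skipped.
        if not (is_object_dtype(column) or is_string_dtype(column)):
            continue
        if non_numeric_percentage(column) < threshold:
            flagged.append(col)
    return flagged


df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age_str': ['25', '30', '35'],
    'mixed': ['1', '2', 'three'],
    'numbers': [1, 2, 3],
})
flagged = flag_potentially_numeric(df, threshold=35)
```

Under these assumptions, `age_str` (0% non-numeric) and `mixed` (one of three values, about 33.3%) fall below the default threshold of 35, while `name` (100% non-numeric) does not and `numbers` is skipped because it is already numeric.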

Returns:

None. Analysis results are printed directly to the console with rich color coding.

Return type:

None

Example

>>> import pandas as pd
>>> import edaflow
>>> df = pd.DataFrame({
...     'name': ['Alice', 'Bob', 'Charlie'],
...     'age_str': ['25', '30', '35'],
...     'mixed': ['1', '2', 'three'],
...     'numbers': [1, 2, 3]
... })
>>> edaflow.analyze_categorical_columns(df, threshold=35)
# Output with rich color coding and tables

# Alternative import style:
>>> from edaflow.analysis import analyze_categorical_columns