edaflow.impute_categorical_mode
- edaflow.impute_categorical_mode(df, columns=None, inplace=False)
Impute missing values in categorical columns using mode (most frequent value).
This function identifies categorical columns and fills missing values (NaN) with the mode (most frequent value) of each column. It provides detailed reporting of the imputation process and handles edge cases safely.
- Parameters:
df (pandas.DataFrame) – The DataFrame containing data to impute
columns (list, optional) – Specific columns to impute. If None, all categorical columns will be processed
inplace (bool, default False) – If True, modify the original DataFrame. If False, return a new DataFrame
- Returns:
If inplace=False, returns the DataFrame with imputed values. If inplace=True, returns None and modifies the original DataFrame.
- Return type:
pandas.DataFrame or None
Examples
>>> import pandas as pd
>>> import edaflow
>>>
>>> # Create sample data with missing values
>>> df = pd.DataFrame({
...     'category': ['A', 'B', 'A', None, 'A'],
...     'status': ['Active', None, 'Active', 'Inactive', None],
...     'age': [25, 30, 35, 40, 45]
... })
>>>
>>> # Impute all categorical columns
>>> df_imputed = edaflow.impute_categorical_mode(df)
>>>
>>> # Impute specific columns only
>>> df_imputed = edaflow.impute_categorical_mode(df, columns=['category'])
>>>
>>> # Impute in place
>>> edaflow.impute_categorical_mode(df, inplace=True)
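For readers who want to see what mode imputation amounts to, the following is a minimal sketch in plain pandas. It is not edaflow's implementation. The helper name impute_mode_sketch is hypothetical, and the sketch makes several assumptions: that "categorical" means object, string, or category dtype; that an entirely missing column is left untouched because it has no mode; and that ties are broken by taking the first value pandas returns from Series.mode. How edaflow itself detects categorical columns, breaks ties, or reports progress is not shown here.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def impute_mode_sketch(df, columns=None):
    """Hypothetical stand-in for illustration; not edaflow's code."""
    result = df.copy()  # work on a copy so the caller's DataFrame is untouched

    if columns is None:
        # Assumption: treat object, string, and category dtypes as categorical.
        columns = [
            col for col in result.columns
            if is_object_dtype(result[col])
            or is_string_dtype(result[col])
            or isinstance(result[col].dtype, pd.CategoricalDtype)
        ]

    for col in columns:
        modes = result[col].mode(dropna=True)
        if modes.empty:
            # Every value is missing, so there is no mode to fill with.
            continue
        # Series.mode returns its values sorted; take the first on ties.
        result[col] = result[col].fillna(modes.iloc[0])

    return result


df = pd.DataFrame({
    "category": ["A", "B", "A", None, "A"],
    "status": ["Active", None, "Active", "Inactive", None],
    "age": [25, 30, 35, 40, 45],
})
imputed = impute_mode_sketch(df)
```

On the sample data, the missing entry in category is filled with 'A' and the two missing entries in status are filled with 'Active'; the numeric age column is not touched.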