edaflow.impute_categorical_mode

edaflow.impute_categorical_mode(df, columns=None, inplace=False)

Impute missing values in categorical columns using mode (most frequent value).

This function identifies categorical columns and fills the missing values (NaN) in each one with that column's mode. It reports the details of the imputation process and handles edge cases safely.
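The idea can be sketched in plain pandas. This is a minimal illustration, not edaflow's implementation: it assumes "categorical" means any non-numeric column, and it makes its own choices for two edge cases (ties take the first value returned by Series.mode(), and a column with no observed values is left unchanged). How edaflow itself treats those cases is not stated here.

```python
import pandas as pd

df = pd.DataFrame({
    "category": ["A", "B", "A", None, "A"],
    "status": ["Active", None, "Active", "Inactive", None],
    "empty": pd.Series([None] * 5, dtype="object"),  # no observed values at all
    "age": [25, 30, 35, 40, 45],
})

# Assumption for this sketch: every non-numeric column counts as categorical.
cat_cols = [c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c])]

result = df.copy()  # leave the original DataFrame untouched
for col in cat_cols:
    modes = result[col].mode(dropna=True)  # most frequent value(s), NaN ignored
    if modes.empty:
        continue  # column is entirely missing, so there is no mode to fill with
    # Series.mode() returns tied values in sorted order; take the first one.
    result[col] = result[col].fillna(modes.iloc[0])

print(result)
```

In this sketch the numeric "age" column is never touched, and the all-missing "empty" column stays all-missing because it has no mode.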

Parameters:
  • df (pandas.DataFrame) – The DataFrame containing the data to impute.

  • columns (list, optional) – Specific columns to impute. If None, all categorical columns are processed.

  • inplace (bool, default False) – If True, modify the original DataFrame. If False, return a new DataFrame.

Returns:

If inplace=False, returns the DataFrame with imputed values. If inplace=True, returns None and modifies the original DataFrame.

Return type:

pandas.DataFrame or None
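The return convention can be demonstrated with a small stand-in. The function impute_mode_sketch below is hypothetical and written only for this illustration. It mirrors the documented signature but is not edaflow's code, and its default column selection (all non-numeric columns) is an assumption.

```python
import pandas as pd

def impute_mode_sketch(df, columns=None, inplace=False):
    """Hypothetical stand-in that follows the documented return convention."""
    target = df if inplace else df.copy()
    if columns is None:
        # Assumption: treat every non-numeric column as categorical.
        columns = [c for c in target.columns
                   if not pd.api.types.is_numeric_dtype(target[c])]
    for col in columns:
        modes = target[col].mode()
        if not modes.empty:
            target[col] = target[col].fillna(modes.iloc[0])
    return None if inplace else target

df = pd.DataFrame({"category": ["A", "B", "A", None, "A"]})

# inplace=False: a new DataFrame comes back and the original keeps its NaN.
copy_result = impute_mode_sketch(df)
missing_after_copy = int(df["category"].isna().sum())

# inplace=True: the call returns None and the original is modified.
inplace_result = impute_mode_sketch(df, inplace=True)
missing_after_inplace = int(df["category"].isna().sum())
```

A practical consequence of this convention is that assigning the result of an inplace=True call back to a variable replaces it with None.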

Examples

>>> import pandas as pd
>>> import edaflow
>>>
>>> # Create sample data with missing values
>>> df = pd.DataFrame({
...     'category': ['A', 'B', 'A', None, 'A'],
...     'status': ['Active', None, 'Active', 'Inactive', None],
...     'age': [25, 30, 35, 40, 45]
... })
>>>
>>> # Impute all categorical columns
>>> df_imputed = edaflow.impute_categorical_mode(df)
>>>
>>> # Impute specific columns only
>>> df_imputed = edaflow.impute_categorical_mode(df, columns=['category'])
>>>
>>> # Impute in place (returns None, so do not assign the result)
>>> edaflow.impute_categorical_mode(df, inplace=True)