edaflow.convert_to_numeric

edaflow.convert_to_numeric(df: DataFrame, threshold: float | None = 35, inplace: bool = False) → DataFrame

Convert object columns to numeric where the data supports it, and report the results with rich-formatted output.

This function examines object-type columns and converts them to numeric if the percentage of non-numeric values is below the specified threshold. This helps clean datasets where numeric data is stored as strings.
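The rule described above can be sketched with plain pandas. This is a minimal illustration, not edaflow's actual implementation: the helper names `non_numeric_pct` and `sketch_convert` are hypothetical, and the sketch assumes that the percentage is computed over non-null values and that "below" means strictly less than the threshold.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def non_numeric_pct(series: pd.Series) -> float:
    """Percentage of non-null values that pd.to_numeric cannot parse."""
    values = series.dropna()
    if values.empty:
        return 0.0
    failed = pd.to_numeric(values, errors="coerce").isna().sum()
    return float(100.0 * failed / len(values))


def sketch_convert(df: pd.DataFrame, threshold: float = 35) -> pd.DataFrame:
    """Hypothetical stand-in for the documented rule (assumption, not edaflow code)."""
    out = df.copy()
    for col in out.columns:
        if is_numeric_dtype(out[col]):
            continue  # already numeric, nothing to do
        if non_numeric_pct(out[col]) < threshold:
            # Unparseable values are coerced to NaN
            out[col] = pd.to_numeric(out[col], errors="coerce")
    return out


df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age_str": ["25", "30", "35"],
    "mixed": ["1", "2", "three"],
})

# Share of non-numeric values per column, rounded to one decimal place
pcts = {col: round(non_numeric_pct(df[col]), 1) for col in df.columns}
converted = sketch_convert(df, threshold=35)
```

Under these assumptions, `name` is 100% non-numeric and is left alone, `age_str` is 0% non-numeric and is converted, and `mixed` is about 33.3% non-numeric, which falls under a threshold of 35, so it is converted and `'three'` becomes NaN.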

Parameters:
  • df (pd.DataFrame) – The input DataFrame to process

  • threshold (Optional[float], optional) – The maximum share of non-numeric values tolerated in a column, expressed as a percentage (for example, 35 means 35%). Columns whose percentage of non-numeric values is below this threshold are converted to numeric. Defaults to 35.

  • inplace (bool, optional) – If True, modify the DataFrame in place and return None. If False, return a new DataFrame with conversions applied. Defaults to False.

Returns:

If inplace=False, returns a new DataFrame with numeric conversions applied. If inplace=True, modifies the original DataFrame and returns None.

Return type:

pd.DataFrame or None

Example

>>> import pandas as pd
>>> import edaflow
>>> df = pd.DataFrame({
...     'name': ['Alice', 'Bob', 'Charlie'],
...     'age_str': ['25', '30', '35'],
...     'mixed': ['1', '2', 'three'],
...     'numbers': [1, 2, 3]
... })
>>>
>>> # Create a copy with conversions
>>> df_cleaned = edaflow.convert_to_numeric(df, threshold=35)
>>>
>>> # Or modify the original DataFrame
>>> edaflow.convert_to_numeric(df, threshold=35, inplace=True)
>>>
>>> # Alternative import style:
>>> from edaflow.analysis import convert_to_numeric
>>> df_cleaned = convert_to_numeric(df, threshold=50)

Notes

  • In converted columns, values that cannot be parsed as numbers become NaN

  • The function provides colored output showing which columns were converted

  • Use a lower threshold to be more strict about conversions

  • Use a higher threshold to be more lenient about mixed data
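Because unparseable values are replaced with NaN, it can be worth checking what a conversion would discard before raising the threshold. The following sketch uses only `pd.to_numeric` from pandas. It does not call edaflow, and it assumes the conversion behaves like coercion with `errors="coerce"`.

```python
import pandas as pd

mixed = pd.Series(["1", "2", "three"])

# Coerce to numeric: values that cannot be parsed become NaN
coerced = pd.to_numeric(mixed, errors="coerce")

# Values that were present in the original but are NaN after coercion
lost = mixed[coerced.isna() & mixed.notna()]

print(coerced.tolist())
print(lost.tolist())
```

Here `lost` holds the original strings that would be dropped (just `'three'` in this case), which makes it easier to judge whether a lenient threshold is acceptable for a given column.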