edaflow.convert_to_numeric
- edaflow.convert_to_numeric(df: DataFrame, threshold: float | None = 35, inplace: bool = False) → DataFrame
Convert object columns to numeric where the data supports it, and report the results with rich formatting.
This function examines object-type columns and converts them to numeric if the percentage of non-numeric values is below the specified threshold. This helps clean datasets where numeric data is stored as strings.
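The threshold rule described above can be sketched with plain pandas. This is a minimal illustration and not edaflow's actual implementation: the helper names `non_numeric_pct` and `sketch_convert` are hypothetical, and it assumes the percentage is computed over non-null values only, which the documentation does not specify.

```python
import pandas as pd
from pandas.api.types import is_object_dtype, is_string_dtype


def non_numeric_pct(series: pd.Series) -> float:
    """Percentage of non-null values that cannot be parsed as numbers.

    Assumption: nulls are excluded from the denominator.
    """
    non_null = series.dropna()
    if non_null.empty:
        return 0.0
    parsed = pd.to_numeric(non_null, errors="coerce")
    return float(parsed.isna().mean() * 100)


def sketch_convert(df: pd.DataFrame, threshold: float = 35.0) -> pd.DataFrame:
    """Hypothetical stand-in for the rule: convert a text column when its
    share of non-numeric values is below `threshold` percent."""
    out = df.copy()
    for col in out.columns:
        column = out[col]
        # Only consider text-like columns; numeric columns are left alone.
        if not (is_object_dtype(column) or is_string_dtype(column)):
            continue
        if non_numeric_pct(column) < threshold:
            # Unparseable entries are coerced to missing values.
            out[col] = pd.to_numeric(column, errors="coerce")
    return out


df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],   # 100% non-numeric: left as text
    "age_str": ["25", "30", "35"],         # 0% non-numeric: converted
    "mixed": ["1", "2", "three"],          # about 33% non-numeric: converted at 35
    "numbers": [1, 2, 3],                  # already numeric: untouched
})
cleaned = sketch_convert(df, threshold=35)
```

Under these assumptions, `mixed` is converted at the default threshold because one value in three is non-numeric, which is just under 35 percent. The original `df` is not modified.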
- Parameters:
df (pd.DataFrame) – The input DataFrame to process
threshold (Optional[float], optional) – The maximum percentage of non-numeric values tolerated in a column. Columns whose percentage of non-numeric values is below this threshold will be converted to numeric. Defaults to 35.
inplace (bool, optional) – If True, modify the DataFrame in place and return None. If False, return a new DataFrame with conversions applied. Defaults to False.
- Returns:
If inplace=False, returns a new DataFrame with numeric conversions applied. If inplace=True, modifies the original DataFrame and returns None.
- Return type:
pd.DataFrame or None
Example
>>> import pandas as pd
>>> import edaflow
>>> df = pd.DataFrame({
...     'name': ['Alice', 'Bob', 'Charlie'],
...     'age_str': ['25', '30', '35'],
...     'mixed': ['1', '2', 'three'],
...     'numbers': [1, 2, 3]
... })
>>>
>>> # Create a copy with conversions
>>> df_cleaned = edaflow.convert_to_numeric(df, threshold=35)
>>>
>>> # Or modify the original DataFrame
>>> edaflow.convert_to_numeric(df, threshold=35, inplace=True)
>>>
>>> # Alternative import style:
>>> from edaflow.analysis import convert_to_numeric
>>> df_cleaned = convert_to_numeric(df, threshold=50)
Notes
In converted columns, values that cannot be parsed as numbers become NaN
The function provides colored output showing which columns were converted
Use a lower threshold to be more strict about conversions
Use a higher threshold to be more lenient about mixed data
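To see what the NaN note means in practice, it can help to check how many values a conversion would discard before choosing a threshold. The sketch below uses pandas' own `pd.to_numeric(..., errors="coerce")` as a stand-in; whether edaflow uses this call internally is an assumption, and the threshold values 20 and 35 are only illustrative.

```python
import pandas as pd

mixed = pd.Series(["1", "2", "three"])

# Coercion replaces unparseable entries with missing values.
coerced = pd.to_numeric(mixed, errors="coerce")
lost = mixed[coerced.isna()].tolist()        # the original values that would be lost
pct_non_numeric = float(coerced.isna().mean() * 100)

# A stricter (lower) threshold rejects this column; the default accepts it.
would_convert_strict = pct_non_numeric < 20
would_convert_default = pct_non_numeric < 35
```

Inspecting `lost` before converting shows exactly which entries would be replaced by NaN, which is useful when a higher threshold is being considered for messy columns.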