edaflow.summarize_eda_insights

edaflow.summarize_eda_insights(df: DataFrame, target_column: str | None = None, eda_functions_used: List[str] | None = None, class_threshold: float = 0.1) → dict

Generate comprehensive EDA insights and recommendations after completing an analysis workflow.

This function analyzes the DataFrame and provides intelligent insights about:

  • Dataset characteristics and shape

  • Data quality assessment

  • Class distribution and imbalance detection

  • Missing data patterns

  • Feature type analysis

  • Actionable recommendations for modeling

Parameters:
  • df (pandas.DataFrame) – The DataFrame that has been analyzed

  • target_column (str, optional) – The name of the target column for classification/regression analysis

  • eda_functions_used (list of str, optional) – List of edaflow functions that have been executed

  • class_threshold (float, default 0.1) – Fraction of samples below which a class is considered underrepresented (the default, 0.1, corresponds to 10%)
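
To make the class_threshold semantics concrete, the sketch below flags classes whose share of the labels falls below a threshold. It is an illustration written with the standard library only, not edaflow's actual implementation, which may differ (for example in how it treats missing values or the boundary case). The helper name `underrepresented_classes` and the sample labels are hypothetical.

```python
from collections import Counter


def underrepresented_classes(labels, class_threshold=0.1):
    """Map each class whose share is below class_threshold to that share.

    Hypothetical helper for illustration only; not part of edaflow.
    """
    labels = list(labels)
    counts = Counter(labels)
    total = len(labels)
    # A class is flagged when its relative frequency is strictly below the threshold.
    # With pandas, the same shares come from df[col].value_counts(normalize=True).
    return {
        cls: count / total
        for cls, count in counts.items()
        if count / total < class_threshold
    }


# Made-up labels: 95 'no' and 5 'yes', so 'yes' has a 5% share
labels = ["no"] * 95 + ["yes"] * 5
flagged = underrepresented_classes(labels, class_threshold=0.1)
print(flagged)
```

Because the helper only iterates over its input, a target column such as `df['ckd_status']` could be passed in place of the list.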

Returns:

Comprehensive insights dictionary with analysis results and recommendations

Return type:

dict

Examples

>>> import pandas as pd
>>> import edaflow
>>>
>>> # After completing EDA workflow
>>> df = pd.read_csv('healthcare_data.csv')
>>> # ... run various edaflow functions ...
>>>
>>> # Generate comprehensive insights
>>> insights = edaflow.summarize_eda_insights(df, target_column='ckd_status')
>>>
>>> # Insights with specific functions tracked
>>> functions_used = ['check_null_columns', 'analyze_categorical_columns',
...                   'visualize_histograms', 'handle_outliers_median']
>>> insights = edaflow.summarize_eda_insights(df, 'ckd_status', functions_used)
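
The keys of the returned dictionary are not listed in this reference, so one reasonable first step is to walk the result generically. The sketch below is a plain-Python helper that turns any nested dict into indented lines. The name `describe_insights` and the `sample` dictionary are hypothetical stand-ins, not edaflow output.

```python
def describe_insights(insights, indent=0):
    """Return indented text lines describing a (possibly nested) dict.

    Hypothetical helper for illustration only; not part of edaflow.
    """
    lines = []
    pad = "  " * indent
    for key, value in insights.items():
        if isinstance(value, dict):
            # Nested section: emit its name, then recurse one level deeper
            lines.append(f"{pad}{key}:")
            lines.extend(describe_insights(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value!r}")
    return lines


# Made-up stand-in for the returned dict; the real keys may differ
sample = {"section_a": {"rows": 100, "columns": 5}, "notes": ["example"]}
lines = describe_insights(sample)
print("\n".join(lines))
```

In practice, `sample` would be replaced by the `insights` value returned in the examples above.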