edaflow.summarize_eda_insights
- edaflow.summarize_eda_insights(df: DataFrame, target_column: str | None = None, eda_functions_used: List[str] | None = None, class_threshold: float = 0.1) -> dict
Generate comprehensive EDA insights and recommendations after completing an analysis workflow.
This function analyzes the DataFrame and provides insights about:
- Dataset characteristics and shape
- Data quality assessment
- Class distribution and imbalance detection
- Missing data patterns
- Feature type analysis
- Actionable recommendations for modeling
- Parameters:
df (pandas.DataFrame) – The DataFrame that has been analyzed
target_column (str, optional) – The name of the target column for classification/regression analysis
eda_functions_used (list of str, optional) – List of edaflow functions that have been executed
class_threshold (float, default 0.1) – Proportion below which a class is considered underrepresented (the default of 0.1 corresponds to 10%)
- Returns:
Comprehensive insights dictionary with analysis results and recommendations
- Return type:
dict
Examples
>>> import pandas as pd
>>> import edaflow
>>>
>>> # After completing EDA workflow
>>> df = pd.read_csv('healthcare_data.csv')
>>> # ... run various edaflow functions ...
>>>
>>> # Generate comprehensive insights
>>> insights = edaflow.summarize_eda_insights(df, target_column='ckd_status')
>>>
>>> # Insights with specific functions tracked
>>> functions_used = ['check_null_columns', 'analyze_categorical_columns',
...                   'visualize_histograms', 'handle_outliers_median']
>>> insights = edaflow.summarize_eda_insights(df, 'ckd_status', functions_used)
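To make the meaning of class_threshold concrete, the following is a minimal sketch of how a proportion threshold can flag underrepresented classes. It is an illustration only and is not taken from edaflow's source: the helper name underrepresented_classes and the toy label values are hypothetical, and the strict "less than" comparison is an assumption about how the threshold is applied.

```python
from collections import Counter


def underrepresented_classes(labels, class_threshold=0.1):
    """Return {class: share} for classes whose share of rows is below the threshold.

    Hypothetical helper for illustration; not part of edaflow.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        # No rows means no classes to flag.
        return {}
    return {
        cls: count / total
        for cls, count in counts.items()
        # Assumption: a class is flagged only when strictly below the threshold.
        if count / total < class_threshold
    }


# Made-up toy labels: 5 positive rows out of 100.
labels = ["ckd"] * 5 + ["not_ckd"] * 95
print(underrepresented_classes(labels))  # {'ckd': 0.05}
```

With the default of 0.1, the "ckd" class is flagged because it makes up 5% of the rows. A class at exactly 10% would not be flagged under the strict comparison assumed here. Because a pandas Series is iterable, a column such as df['ckd_status'] could be passed as labels.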