edaflow.profile_report
- edaflow.profile_report(df: DataFrame, top_n_categorical: int = 5, output_format: str = 'html') → Any
Generate a comprehensive profiling report for a DataFrame.
This function creates an automated EDA report similar to ydata-profiling’s ProfileReport, including dataset overview, missing value analysis, categorical insights, and visualizations.
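- Parameters:
  - df (DataFrame) – The pandas DataFrame to profile. Must not be empty.
  - top_n_categorical (int) – Number of top categorical columns to analyze. Defaults to 5.
  - output_format (str) – Output format, 'html' (default) or 'dict'.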
- Returns:
- If output_format="html", returns the path to the HTML file.
- If output_format="dict", returns a dict with:
  - 'overview': DataFrame with dataset info
  - 'summary_stats': DataFrame with summary statistics
  - 'missing_values': DataFrame with null analysis
  - 'categorical_insights': Dict with category distributions
  - 'numeric_insights': Dict with numeric column info
  - 'visualizations': Dict with matplotlib figures
- Return type:
Any
- Raises:
ValueError – If df is empty or output_format is invalid
TypeError – If df is not a pandas DataFrame
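The two documented exceptions can be handled at the call site. The following is a minimal sketch, not part of edaflow: the helper name `safe_profile` is hypothetical, and it takes the profiling function as an argument so the snippet runs even without edaflow installed.

```python
from typing import Any, Callable, Optional, Tuple


def safe_profile(
    profile_fn: Callable[..., Any], df: Any, **kwargs: Any
) -> Tuple[Optional[Any], Optional[str]]:
    """Call profile_fn(df, **kwargs) and convert the documented errors to a message.

    Returns (report, None) on success or (None, message) on failure.
    Hypothetical helper for illustration; not provided by edaflow.
    """
    try:
        return profile_fn(df, **kwargs), None
    except TypeError as exc:
        # Documented: raised when df is not a pandas DataFrame.
        return None, f"wrong input type: {exc}"
    except ValueError as exc:
        # Documented: raised when df is empty or output_format is invalid.
        return None, f"invalid input: {exc}"


# Intended use (assumes edaflow and pandas are installed):
#     import edaflow
#     report, error = safe_profile(edaflow.profile_report, df, output_format="dict")
#     if error is not None:
#         print(error)
```

Returning a message instead of re-raising is only one option; in a pipeline it may be preferable to let the exception propagate.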
Examples
>>> import pandas as pd
>>> import edaflow
>>>
>>> # Create sample data
>>> df = pd.DataFrame({
...     'age': [25, 30, 35, 28, None, 45],
...     'salary': [50000, 60000, 70000, 55000, 65000, 80000],
...     'department': ['HR', 'IT', 'IT', 'HR', 'Finance', 'IT'],
...     'city': ['NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA']
... })
>>>
>>> # Generate HTML report
>>> report_path = edaflow.profile_report(df)
>>> print(f"Report saved to: {report_path}")
>>>
>>> # Generate dict report
>>> report_dict = edaflow.profile_report(df, output_format="dict")
>>> print(report_dict['overview'])
>>>
>>> # Analyze top 3 categorical columns
>>> report = edaflow.profile_report(df, top_n_categorical=3, output_format="dict")
>>> print(report['categorical_insights'])
Alternative import:
>>> from edaflow.analysis import profile_report
>>> report = profile_report(df)
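When consuming the dict output, it can help to confirm that the sections listed under Returns are present before indexing into them. This is a small sketch under that assumption: the key names are copied from the Returns description above, while `EXPECTED_KEYS` and `missing_report_keys` are hypothetical names and not part of edaflow.

```python
from typing import Any, List, Mapping

# Section names as listed in the Returns description for output_format="dict".
EXPECTED_KEYS = (
    "overview",
    "summary_stats",
    "missing_values",
    "categorical_insights",
    "numeric_insights",
    "visualizations",
)


def missing_report_keys(report: Mapping[str, Any]) -> List[str]:
    """Return the documented section names that are absent from report."""
    return [key for key in EXPECTED_KEYS if key not in report]


# Intended use (assumes edaflow is installed and df is a non-empty DataFrame):
#     report = edaflow.profile_report(df, output_format="dict")
#     absent = missing_report_keys(report)
#     if absent:
#         print(f"Report is missing sections: {absent}")
```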