API Reference
This section contains the complete API documentation for all edaflow functions.
Complete Function Index
Exploratory Data Analysis (EDA) Functions
Data Quality & Analysis
Check null values in DataFrame columns with rich styled output.
Analyze categorical columns of object type to identify potential data issues.
Convert object columns to numeric when appropriate based on data analysis with rich formatting.
Display categorical and numerical columns in a DataFrame with rich formatting.
Generate comprehensive EDA insights and recommendations after completing analysis workflow.
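As a rough illustration of the kind of checks described in this group, the sketch below uses plain pandas rather than edaflow itself. The helper names `null_summary` and `numeric_like_object_columns`, the `threshold` parameter, and the sample data are all hypothetical and are not part of the edaflow API.

```python
import pandas as pd


def null_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Return the null count and null percentage for every column."""
    counts = df.isnull().sum()
    percents = (counts / len(df) * 100).round(2)
    return pd.DataFrame({"null_count": counts, "null_percent": percents})


def numeric_like_object_columns(df: pd.DataFrame, threshold: float = 0.9) -> list:
    """List object columns where at least `threshold` of the non-null values parse as numbers."""
    candidates = []
    for col in df.select_dtypes(include="object").columns:
        non_null = df[col].dropna()
        if non_null.empty:
            continue
        # Values that cannot be parsed become NaN instead of raising.
        parsed = pd.to_numeric(non_null, errors="coerce")
        if parsed.notna().mean() >= threshold:
            candidates.append(col)
    return candidates


# Made-up sample data, for illustration only.
df = pd.DataFrame({
    "age": [25, None, 31, 40],
    "price": pd.Series(["10.5", "20", "n/a", "7"], dtype=object),
    "city": pd.Series(["Oslo", "Lima", None, "Oslo"], dtype=object),
})

summary = null_summary(df)
candidates = numeric_like_object_columns(df, threshold=0.7)
```

Here `price` is flagged because three of its four values parse as numbers, while `city` is not. The corresponding edaflow functions may use different rules and thresholds.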
Data Cleaning & Preprocessing
Impute missing values in numerical columns using median values with rich formatting.
Impute missing values in categorical columns using mode (most frequent value).
Replace outliers in numerical columns with the median value.
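The three cleaning steps above can be sketched in plain pandas as follows. This is a minimal illustration, not the edaflow implementation: the helper names, the IQR rule with multiplier `k=1.5` for detecting outliers, and the sample data are assumptions made for the example.

```python
import pandas as pd


def impute_median(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Fill missing values in the given numerical columns with each column's median."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].fillna(out[col].median())
    return out


def impute_mode(df: pd.DataFrame, columns: list) -> pd.DataFrame:
    """Fill missing values in the given categorical columns with the most frequent value."""
    out = df.copy()
    for col in columns:
        modes = out[col].mode(dropna=True)
        if not modes.empty:
            out[col] = out[col].fillna(modes.iloc[0])
    return out


def replace_outliers_with_median(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the series median."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    is_outlier = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series.mask(is_outlier, series.median())


# Made-up sample data, for illustration only.
raw = pd.DataFrame({
    "income": [10.0, 12.0, None, 11.0, 500.0],
    "segment": ["a", "b", "a", None, "a"],
})

cleaned = impute_median(raw, ["income"])
cleaned = impute_mode(cleaned, ["segment"])
cleaned["income"] = replace_outliers_with_median(cleaned["income"])
```

Each helper returns a new object and leaves its input untouched, which keeps the raw data available for comparison after cleaning.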
Visualization & Analysis
Visualize unique values in categorical (object-type) columns with counts and percentages.
Create boxplots for numerical columns to visualize distributions and outliers.
Create interactive boxplots for numerical columns using Plotly Express.
Create comprehensive heatmap visualizations for exploratory data analysis.
Create comprehensive histogram visualizations with distribution analysis and skewness detection.
Create comprehensive scatter matrix visualization for pairwise relationship analysis.
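The counts-and-percentages summary mentioned for categorical columns can be approximated in plain pandas. The sketch below is only an illustration of that idea; the helper name `value_summary` and the sample data are hypothetical, and the edaflow function may format or compute its output differently.

```python
import pandas as pd


def value_summary(series: pd.Series) -> pd.DataFrame:
    """Count each unique non-null value and express it as a percentage of non-null rows."""
    counts = series.value_counts()
    percents = (counts / counts.sum() * 100).round(1)
    return pd.DataFrame({"count": counts, "percent": percents})


# Made-up sample data, for illustration only.
segment = pd.Series(["a", "b", "a", None, "a"], name="segment")
table = value_summary(segment)
```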
Machine Learning (ML) Functions
ML Configuration & Setup
Set up a complete ML experiment with train/validation/test splits.
Configure a preprocessing pipeline for the ML experiment.
Validate data quality for ML experiments.
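A train/validation/test split of the kind described above can be sketched with two calls to scikit-learn's `train_test_split`. The helper `three_way_split` and its parameter names are hypothetical and only illustrate the idea; the edaflow setup function may take different arguments and return a different structure.

```python
import numpy as np
from sklearn.model_selection import train_test_split


def three_way_split(X, y, val_size=0.2, test_size=0.2, random_state=42):
    """Split data into train, validation, and test parts.

    `val_size` and `test_size` are fractions of the full dataset.
    """
    # First carve off the test set.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    # The validation fraction must be rescaled relative to what remains.
    relative_val = val_size / (1.0 - test_size)
    X_train, X_val, y_train, y_val = train_test_split(
        X_rest, y_rest, test_size=relative_val, random_state=random_state
    )
    return X_train, X_val, X_test, y_train, y_val, y_test


# Made-up data, for illustration only: 100 rows, 2 features, binary target.
X = np.arange(200).reshape(100, 2)
y = np.arange(100) % 2

X_train, X_val, X_test, y_train, y_val, y_test = three_way_split(X, y)
```

Splitting off the test set first means the test rows are fixed before any validation-related choice is made.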
Model Comparison & Ranking
Compare multiple models across various performance metrics.
Rank models based on performance metrics.
Display a visual leaderboard of model performance.
Export model comparison results to file.
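Ranking and exporting a comparison table can be illustrated with plain pandas. In the sketch below the helper `rank_by_metric`, the column names, and every score are made up for the example; none of them come from edaflow or from a real experiment.

```python
import pandas as pd

# Made-up scores, for illustration only.
results = pd.DataFrame(
    {
        "accuracy": [0.91, 0.88, 0.93],
        "f1": [0.89, 0.90, 0.92],
        "fit_seconds": [1.2, 0.3, 8.5],
    },
    index=["random_forest", "logistic_regression", "gradient_boosting"],
)


def rank_by_metric(results: pd.DataFrame, metric: str, higher_is_better: bool = True) -> pd.DataFrame:
    """Sort models by one metric and add a 1-based rank column."""
    ranked = results.sort_values(metric, ascending=not higher_is_better).copy()
    ranked.insert(0, "rank", range(1, len(ranked) + 1))
    return ranked


leaderboard = rank_by_metric(results, "f1")

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = leaderboard.to_csv()
```

For metrics where lower is better, such as `fit_seconds`, pass `higher_is_better=False`.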
Hyperparameter Optimization
Optimize hyperparameters using various search strategies.
Perform grid search optimization for multiple models.
Perform Bayesian optimization using scikit-optimize.
Perform random search optimization for multiple models.
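Grid search over several models can be sketched directly with scikit-learn's `GridSearchCV`. This is an illustration of the technique under assumed inputs, not the edaflow function: the `search_space` layout, the chosen estimators, the parameter grids, and the synthetic dataset are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, for illustration only.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Hypothetical layout: model name -> (estimator, parameter grid).
search_space = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "decision_tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4, 8]}),
}

best = {}
for name, (estimator, grid) in search_space.items():
    # Exhaustively evaluate every grid point with 3-fold cross-validation.
    search = GridSearchCV(estimator, grid, cv=3, scoring="accuracy")
    search.fit(X, y)
    best[name] = (search.best_params_, search.best_score_)
```

Swapping `GridSearchCV` for `RandomizedSearchCV` (with an `n_iter` budget) gives the random search variant with the same loop structure.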
Performance Visualization
Plot learning curves to analyze model performance vs training set size.
Plot validation curves for hyperparameter analysis.
Plot ROC curves for multiple models (binary classification only).
Plot Precision-Recall curves for multiple models.
Plot confusion matrix for a classification model.
Plot feature importance for models that support it.
Model Artifacts & Tracking
Save complete model artifacts including model, config, and metadata.
Load model artifacts from saved files.
Track experiment results in a CSV log file.
Generate a comprehensive model report.
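Tracking results in a CSV log amounts to appending one row per run and writing the header only once. The sketch below shows that pattern with the Python standard library; the function name `log_result` and the column layout in `FIELDS` are hypothetical and may not match what edaflow writes.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column layout for the log file.
FIELDS = ["timestamp", "experiment", "model", "metric", "value"]


def log_result(log_path, experiment, model, metric, value):
    """Append one result row to a CSV log, writing the header if the file is new or empty."""
    path = Path(log_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "experiment": experiment,
            "model": model,
            "metric": metric,
            "value": value,
        })
```

A call such as `log_result("experiments.csv", "baseline", "random_forest", "f1", 0.89)` appends a single row, so repeated runs accumulate in one file that can later be loaded for comparison.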
Helper Functions