API Reference

This section contains the complete API documentation for all edaflow functions.

Complete Function Index

Exploratory Data Analysis (EDA) Functions

Data Quality & Analysis

check_null_columns(df[, threshold])

Check null values in DataFrame columns with rich styled output.

analyze_categorical_columns(df[, threshold])

Analyze categorical columns of object type to identify potential data issues.

convert_to_numeric(df[, threshold, inplace])

Convert object columns to numeric when analysis of their values shows this is appropriate, with rich formatting.

display_column_types(df)

Display categorical and numerical columns in a DataFrame with rich formatting.

summarize_eda_insights(df[, target_column, ...])

Generate comprehensive EDA insights and recommendations after completing the analysis workflow.
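
The null check summarized above can be pictured with a small plain-Python sketch. This is not edaflow's implementation and does not call the library. The helper names, the dict-of-lists stand-in for a DataFrame, and the reading of `threshold` as a percentage cutoff are all assumptions made for illustration.

```python
from typing import Dict, List


def null_percentages(columns: Dict[str, List[object]]) -> Dict[str, float]:
    """Return the percentage of missing (None) entries in each column."""
    result = {}
    for name, values in columns.items():
        missing = sum(1 for v in values if v is None)
        # Guard against empty columns to avoid division by zero.
        result[name] = 100.0 * missing / len(values) if values else 0.0
    return result


def columns_over_threshold(columns: Dict[str, List[object]],
                           threshold: float = 10.0) -> List[str]:
    """List columns whose missing percentage exceeds the (assumed) cutoff."""
    percentages = null_percentages(columns)
    return [name for name, pct in percentages.items() if pct > threshold]


# Toy data invented for illustration only.
data = {"age": [34, None, 29, 41], "city": ["Oslo", "Lima", "Pune", "Kyiv"]}
flagged = columns_over_threshold(data, threshold=10.0)
print(flagged)
```

With a real DataFrame the same idea is usually expressed through pandas rather than hand-written loops; the sketch only shows the arithmetic behind a percentage-based null report.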

Data Cleaning & Preprocessing

impute_numerical_median(df[, columns, inplace])

Impute missing values in numerical columns using the median, with rich formatting.

impute_categorical_mode(df[, columns, inplace])

Impute missing values in categorical columns using mode (most frequent value).

handle_outliers_median(df[, columns, ...])

Replace outliers in numerical columns with the median value.
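
As a rough illustration of the median-based cleaning steps listed above, the following plain-Python sketch imputes missing values with the median and replaces outliers with the median. It does not call edaflow. The function names are hypothetical, and the 1.5 × IQR rule used to detect outliers is an assumption, since this index does not state which detection method the library applies.

```python
import statistics
from typing import List, Optional


def impute_with_median(values: List[Optional[float]]) -> List[float]:
    """Fill None entries with the median of the observed values."""
    present = [v for v in values if v is not None]
    med = statistics.median(present)
    return [med if v is None else v for v in values]


def replace_outliers_with_median(values: List[float], k: float = 1.5) -> List[float]:
    """Replace points outside [Q1 - k*IQR, Q3 + k*IQR] with the median.

    The IQR fence is an assumed detection rule, chosen for illustration.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    med = statistics.median(values)
    return [med if (v < low or v > high) else v for v in values]


# Toy values invented for illustration only.
sample = [10.0, 12.0, 11.0, 13.0, 12.0, 100.0]
cleaned = replace_outliers_with_median(sample)
filled = impute_with_median([1.0, None, 3.0])
print(cleaned, filled)
```

The median is a common choice for both steps because, unlike the mean, it is barely affected by the extreme values being corrected.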

Visualization & Analysis

visualize_categorical_values(df[, ...])

Visualize unique values in categorical (object-type) columns with counts and percentages.

visualize_numerical_boxplots(df[, columns, ...])

Create boxplots for numerical columns to visualize distributions and outliers.

visualize_interactive_boxplots(df[, ...])

Create interactive boxplots for numerical columns using Plotly Express.

visualize_heatmap(df[, heatmap_type, ...])

Create comprehensive heatmap visualizations for exploratory data analysis.

visualize_histograms(df[, columns, title, ...])

Create comprehensive histogram visualizations with distribution analysis and skewness detection.

visualize_scatter_matrix(df[, columns, ...])

Create comprehensive scatter matrix visualization for pairwise relationship analysis.

Machine Learning (ML) Functions

ML Configuration & Setup

setup_ml_experiment([data, target_column, ...])

Set up a complete ML experiment with train/validation/test splits.

configure_model_pipeline(data_config[, ...])

Configure a preprocessing pipeline for the ML experiment.

validate_ml_data([experiment_data, ...])

Validate data quality for ML experiments.

Model Comparison & Ranking

compare_models(models[, X_train, X_val, ...])

Compare multiple models across various performance metrics.

rank_models(comparison_df, primary_metric[, ...])

Rank models based on performance metrics.

display_leaderboard([comparison_results, ...])

Display a visual leaderboard of model performance.

export_model_comparison(comparison_df, filepath)

Export model comparison results to a file.
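
Ranking by a primary metric, as described in this group, amounts to sorting the comparison rows and attaching a rank. The sketch below shows that idea in plain Python without calling edaflow. `rank_by_metric`, the `higher_is_better` flag, and the list-of-dicts layout are hypothetical stand-ins for the library's DataFrame-based interface.

```python
from typing import Dict, List


def rank_by_metric(rows: List[Dict[str, object]],
                   primary_metric: str,
                   higher_is_better: bool = True) -> List[Dict[str, object]]:
    """Sort comparison rows by one metric and add a 1-based 'rank' field."""
    ordered = sorted(rows, key=lambda row: row[primary_metric],
                     reverse=higher_is_better)
    # Copy each row so the caller's input is left unchanged.
    return [dict(row, rank=position)
            for position, row in enumerate(ordered, start=1)]


# Scores invented for illustration only; they are not real results.
results = [
    {"model": "A", "accuracy": 0.81},
    {"model": "B", "accuracy": 0.88},
    {"model": "C", "accuracy": 0.84},
]
ranked = rank_by_metric(results, "accuracy")
for row in ranked:
    print(row["rank"], row["model"], row["accuracy"])
```

For error-style metrics, where lower is better, the same helper would be called with `higher_is_better=False`.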

Hyperparameter Optimization

optimize_hyperparameters(model, ...[, cv, ...])

Optimize hyperparameters using various search strategies.

grid_search_models(models, param_grids, ...)

Perform grid search optimization for multiple models.

bayesian_optimization(model, param_space, ...)

Perform Bayesian optimization using scikit-optimize.

random_search_models(models, ...[, n_iter, ...])

Perform random search optimization for multiple models.
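
To make the random search strategy named above concrete, here is a minimal standard-library sketch. It does not use edaflow or scikit-learn. `random_search`, `toy_score`, and the parameter names are hypothetical, and the scoring function is a stand-in for whatever cross-validated score a real search would compute.

```python
import random
from typing import Callable, Dict, List, Tuple


def random_search(score_fn: Callable[[Dict[str, object]], float],
                  param_space: Dict[str, List[object]],
                  n_iter: int = 20,
                  seed: int = 0) -> Tuple[Dict[str, object], float]:
    """Sample n_iter random parameter combinations and keep the best one."""
    rng = random.Random(seed)  # seeded so repeated runs are reproducible
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(choices)
                  for name, choices in param_space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params: Dict[str, object]) -> float:
    """Stand-in objective; a real search would score a fitted model."""
    return -abs(params["depth"] - 5) - abs(params["lr"] - 0.1)


# Hypothetical search space, invented for illustration only.
space = {"depth": [3, 5, 7], "lr": [0.01, 0.1, 1.0]}
best_params, best_score = random_search(toy_score, space, n_iter=50, seed=0)
print(best_params, best_score)
```

Grid search differs only in that it enumerates every combination instead of sampling, which is exhaustive but grows quickly with the number of parameters.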

Performance Visualization

plot_learning_curves(model, X_train, y_train)

Plot learning curves to analyze model performance versus training set size.

plot_validation_curves(model, X_train, ...)

Plot validation curves for hyperparameter analysis.

plot_roc_curves(models, X_val, y_val[, ...])

Plot ROC curves for multiple models (binary classification only).

plot_precision_recall_curves(models, X_val, ...)

Plot Precision-Recall curves for multiple models.

plot_confusion_matrix(model, X_val, y_val[, ...])

Plot confusion matrix for a classification model.

plot_feature_importance(model, feature_names)

Plot feature importance for models that support it.

Model Artifacts & Tracking

save_model_artifacts(model, model_name, ...)

Save complete model artifacts including model, config, and metadata.

load_model_artifacts(artifact_path[, ...])

Load model artifacts from saved files.

track_experiment(experiment_name, ...[, ...])

Track experiment results in a CSV log file.

create_model_report(model, model_name, ...)

Generate a comprehensive model report.
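
Tracking experiments in a CSV log, as summarized above, can be sketched with the standard `csv` module. This is not edaflow's implementation and does not call the library. `append_experiment_row`, the column names, and the file name are hypothetical, and the sketch assumes every call passes the same metric keys so that the header stays valid.

```python
import csv
import os
import tempfile
from datetime import datetime, timezone
from typing import Dict


def append_experiment_row(log_path: str, experiment_name: str,
                          metrics: Dict[str, float]) -> Dict[str, object]:
    """Append one experiment record, writing a header if the file is new."""
    row = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "experiment_name": experiment_name,
        **metrics,
    }
    is_new = not os.path.exists(log_path) or os.path.getsize(log_path) == 0
    with open(log_path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row.keys()))
        if is_new:
            writer.writeheader()
        writer.writerow(row)
    return row


# Demo in a temporary directory; the metric values are invented.
with tempfile.TemporaryDirectory() as tmp:
    log_file = os.path.join(tmp, "experiments.csv")
    append_experiment_row(log_file, "baseline", {"accuracy": 0.9})
    append_experiment_row(log_file, "tuned", {"accuracy": 0.92})
    with open(log_file, newline="") as handle:
        logged = list(csv.DictReader(handle))

print([entry["experiment_name"] for entry in logged])
```

An append-only log like this keeps earlier runs intact, which makes it easy to compare experiments later by loading the file into a table.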

Helper Functions