User Guide
Comprehensive guides for using edaflow effectively in your data analysis and machine learning workflows.
- Learning Path for Data Science with edaflow
- Data Quality & Cleaning
- Advanced Time Series Topics
- 1. Forecasting
- 2. Autocorrelation & Lag Analysis
- 3. Feature Engineering for Time Series
- 4. Integrating Time Series Models
- Getting Started
- Correlation Analysis
- Advanced & Interactive Plots
- More Visualization Examples
- Best Practices
- Feature Dependency Table:
- How edaflow Splits Your Dataset: Training, Validation, and Test
- Further Resources & FAQ
- Choosing the Right Performance Visualization
- Overview
- Data Validation: A Critical First Step
- Best practice: Aim for a high data quality score to ensure robust, reliable model results.
- Complete ML Workflow Example
- Individual Function Examples
- Why is it important?
- How does it work?
- Best Practices
- This approach ensures you have a solid reference point and helps you build more robust, trustworthy machine learning solutions.
- Widely Used Model Types in Machine Learning
- Refer to scikit-learn and the respective library documentation for more details and advanced options.
- What's Next After Training the Model?
- Machine Learning Workflow with edaflow
- Advanced Features in edaflow
- Best Practices for edaflow
Overview
The edaflow User Guide is organized into five main sections:
Data Quality & Cleaning
Learn how to assess data quality, handle missing values, convert data types, and prepare your data for analysis.
Missing data analysis and visualization
Categorical data insights and type conversion
Data imputation strategies
Outlier detection and handling
Visualization & Analysis
Explore edaflow's comprehensive visualization capabilities for understanding your data.
Distribution analysis with boxplots and histograms
Interactive visualizations with Plotly
Correlation and relationship analysis
Advanced scatter matrix analysis
Machine Learning Workflows
Master the complete ML pipeline with edaflow's comprehensive machine learning functions.
ML experiment setup and data validation
Multi-model comparison and ranking systems
Hyperparameter optimization strategies
Performance visualization and model artifacts
Complete workflow examples and best practices
Advanced Features
Discover advanced features and customization options for power users.
Custom thresholds and parameters
Integration with other libraries
Performance optimization tips
Extension and customization
Best Practices
Learn recommended workflows and best practices for effective EDA and ML.
Recommended EDA workflow
Memory and performance considerations
Jupyter notebook integration
Troubleshooting common issues
New Features & Advanced Visualization
edaflow now supports:
- Faceted visualizations with display_facet_grid
- Feature scaling with scale_features
- Grouping rare categories with group_rare_categories
- Exporting figures with export_figure
See the Visualization Guide and Advanced Features for details and examples.
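To make two of the data-preparation ideas named above concrete, here is a minimal sketch written in plain pandas. It does not call edaflow: the helpers `group_rare` and `standardize`, their parameters, and the sample data are hypothetical stand-ins for illustration only, and the actual signatures of `group_rare_categories` and `scale_features` should be taken from the API Reference.

```python
import pandas as pd


def group_rare(series: pd.Series, min_share: float = 0.05,
               other_label: str = "Other") -> pd.Series:
    """Hypothetical helper: replace categories whose relative
    frequency is below min_share with a single catch-all label."""
    shares = series.value_counts(normalize=True)
    rare = shares[shares < min_share].index
    # Keep values that are not rare; replace the rest with other_label.
    return series.where(~series.isin(rare), other_label)


def standardize(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Hypothetical helper: z-score the given numeric columns
    (zero mean, unit population standard deviation)."""
    out = df.copy()
    for col in columns:
        out[col] = (out[col] - out[col].mean()) / out[col].std(ddof=0)
    return out


# Illustrative data only: city "C" appears once in 20 rows (5%).
df = pd.DataFrame({
    "city": ["A"] * 10 + ["B"] * 9 + ["C"],
    "income": [float(i) for i in range(20)],
})

df["city"] = group_rare(df["city"], min_share=0.10)
scaled = standardize(df, ["income"])

print(df["city"].value_counts())
print(scaled["income"].describe())
```

With a 10% threshold, the single "C" row falls below the cutoff and is relabeled "Other", while "A" and "B" are kept as they are.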
External Library Requirements
Some features require additional libraries (seaborn, scikit-learn, statsmodels). See the Visualization Guide for installation instructions and troubleshooting tips.
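Before running the features that depend on these libraries, it can help to check which of them are importable in the current environment. The sketch below uses only the Python standard library. The helper name `missing_optional_packages` is hypothetical and not part of edaflow, and the mapping reflects only the three libraries listed above (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# pip package name -> import name for the libraries listed above.
OPTIONAL_MODULES = {
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "statsmodels": "statsmodels",
}


def missing_optional_packages(modules=OPTIONAL_MODULES):
    """Hypothetical helper: return the pip names of packages whose
    import name cannot be found in the current environment."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


missing = missing_optional_packages()
if missing:
    print("Missing optional packages: " + ", ".join(missing))
    print("Try: pip install " + " ".join(missing))
else:
    print("All optional packages are available.")
```

This only reports whether each package can be found. It does not check versions, so the installation instructions in the Visualization Guide remain the reference for supported versions.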
Getting Started
If you're new to edaflow, we recommend starting with the Quick Start Guide, then exploring the sections of this user guide that match your needs.
For complete function documentation, see the API Reference.