This is an automated machine learning (AutoML) framework built on scikit-learn. It selects models and tunes their hyperparameters for classification and regression tasks, and its automated ensemble builder combines high-performing models to improve predictive accuracy.
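To make the ensemble idea concrete, here is a minimal sketch of greedy ensemble selection with replacement, one common way to combine models found during a search. The source does not say which method the ensemble builder uses, so the algorithm, the function name `greedy_ensemble`, and the toy predictions are all assumptions for illustration.

```python
def greedy_ensemble(predictions, y_true, size=2):
    """Greedily pick models (with replacement) whose averaged
    probabilities maximize accuracy on held-out labels.

    predictions: dict mapping a model name to its predicted
                 positive-class probabilities (hypothetical inputs).
    y_true:      list of 0/1 labels for the same rows.
    """
    def accuracy(avg_probs):
        hits = sum((p >= 0.5) == bool(y) for p, y in zip(avg_probs, y_true))
        return hits / len(y_true)

    chosen = []
    running_sum = [0.0] * len(y_true)
    best_score = -1.0
    for _ in range(size):
        best_name, best_score = None, -1.0
        # Try adding each model and keep the one that helps most.
        for name in sorted(predictions):
            trial = [(s + p) / (len(chosen) + 1)
                     for s, p in zip(running_sum, predictions[name])]
            score = accuracy(trial)
            if score > best_score:
                best_name, best_score = name, score
        chosen.append(best_name)
        running_sum = [s + p for s, p in zip(running_sum, predictions[best_name])]
    return chosen, best_score


# Toy validation data (made up for this sketch).
y_val = [1, 0, 1, 0]
val_predictions = {
    "model_a": [0.9, 0.6, 0.4, 0.2],
    "model_b": [0.6, 0.1, 0.9, 0.7],
    "model_c": [0.7, 0.2, 0.3, 0.1],
}
members, ensemble_accuracy = greedy_ensemble(val_predictions, y_val, size=2)
```

In this toy case the weakest single model still gets selected second, because its errors offset those of the first pick. That is the usual argument for building an ensemble rather than keeping only the single best model.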
The system includes a distributed search engine that uses Dask to parallelize optimization across CPU cores or clusters. It evaluates candidates under a budget using successive halving, which gives more resources to promising model configurations and discards weak ones early. For large-scale searches, it enforces resource consumption limits and can compress the dataset.
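The successive halving strategy mentioned above can be sketched in a few lines. This is a generic illustration of the technique, not the framework's own implementation: the `successive_halving` function, the `evaluate` callback, the `eta` reduction factor, and the toy objective are assumptions introduced here.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Run successive halving over candidate configurations.

    configs:  list of candidate configurations.
    evaluate: hypothetical callable (config, budget) -> score,
              where a higher score is better.
    Returns the surviving configuration and a per-round history of
    (budget, candidates_evaluated, candidates_kept).
    """
    survivors = list(configs)
    budget = min_budget
    history = []
    while len(survivors) > 1:
        # Score every survivor at the current budget, best first.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        keep = max(1, len(ranked) // eta)
        history.append((budget, len(survivors), keep))
        survivors = ranked[:keep]
        budget *= eta  # the remaining candidates get a larger budget
    return survivors[0], history


# Toy objective (an assumption): scores peak at lr = 0.1 and become
# less noisy as the budget grows.
def toy_score(config, budget):
    return -abs(config["lr"] - 0.1) * (1 + 1.0 / budget)


candidates = [{"lr": lr} for lr in (0.001, 0.01, 0.05, 0.1, 0.3, 0.5, 1.0, 2.0)]
best, rounds = successive_halving(candidates, toy_score)
```

With eight candidates and `eta=2`, the loop evaluates 8, then 4, then 2 configurations while doubling the budget each round, so most of the compute goes to the candidates that survived the cheap early rounds.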
The framework also supports automated preprocessing for text and sparse data, multi-objective metric optimization, and restriction of the search space. Its monitoring tools track accuracy during the search and visualize a model leaderboard, which helps users interpret the search process.
The software is available as a pre-configured Docker container environment to simplify deployment.