13 repositories
Software suites providing implementations of boosting and bagging algorithms.
Distinguishing note: Focuses on library-style implementation of ensemble methods.
Explore 13 awesome GitHub repositories matching artificial intelligence & ml · Ensemble Learning Libraries.
This project is an educational toolkit that provides implementations of fundamental machine learning algorithms built from scratch. By avoiding high-level library abstractions, it serves as a pedagogical reference for understanding the mathematical foundations and core mechanics of supervised learning, unsupervised learning, and reinforcement learning models. The repository distinguishes itself through a modular approach to model construction, allowing users to build custom neural networks by chaining independent functional blocks. It covers a wide range of techniques, including gradient-base
Ships a suite of boosting and bagging implementations to improve predictive performance.
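For readers new to the term, bagging can be sketched in a few lines of plain Python. This is an illustrative sketch only, not code from the repository above: the helper names (`fit_stump`, `fit_bagging`, `predict_bagging`) are hypothetical, and the base learner is assumed to be a simple decision stump.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a decision stump: one feature, one threshold, one label per side."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(yi != l_lab for yi in left) + sum(yi != r_lab for yi in right)
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:
        # No valid split (all rows identical): always predict the majority label.
        label = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), label, label)
    return best[1:]


def predict_stump(stump, row):
    j, t, l_lab, r_lab = stump
    return l_lab if row[j] <= t else r_lab


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Aggregate the stumps by majority vote."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]
```

Calling `fit_bagging(X, y)` on a list of feature rows and labels returns the list of fitted stumps, which `predict_bagging` then combines by vote. Each stump sees a different resample of the data, which is what reduces the variance of the combined prediction.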
XGBoost is a distributed machine learning library for implementing scalable gradient boosting decision trees used for regression, classification, and ranking. It functions as a predictive model framework and a cross-language toolkit, providing a core implementation with native bindings for Python, R, Java, Scala, and C++. The system is designed as a GPU-accelerated library that utilizes CUDA and NCCL to speed up the training of decision tree ensembles. It operates as a distributed framework capable of scaling training and prediction across multi-node clusters and GPU environments to process m
Provides a scalable framework for implementing gradient boosting decision tree ensembles for regression, classification, and ranking.
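The idea behind gradient boosting with decision trees can be illustrated with a toy regressor. The sketch below is not XGBoost's implementation or API: it assumes a single numeric feature with at least two distinct values, squared-error loss, depth-1 trees, and hypothetical function names.

```python
def fit_regression_stump(x, residuals):
    """Pick the threshold that minimises squared error on the residuals."""
    best = None
    for t in sorted(set(x))[:-1]:  # skip the largest value: right side would be empty
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]


def fit_boosted(x, y, n_rounds=50, lr=0.1):
    """Each round fits a stump to the current residuals and adds a damped copy."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss the negative gradient is simply the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        t, lm, rm = fit_regression_stump(x, residuals)
        stumps.append((t, lm, rm))
        preds = [pi + lr * (lm if xi <= t else rm) for xi, pi in zip(x, preds)]
    return base, lr, stumps


def predict_boosted(model, xi):
    base, lr, stumps = model
    return base + lr * sum(lm if xi <= t else rm for t, lm, rm in stumps)
```

Production libraries add regularisation, second-order gradient information, and deeper trees on top of this loop. The sketch only shows the additive, residual-fitting structure.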
LightGBM is a gradient boosting framework used to train decision tree ensembles for classification, regression, and ranking tasks. It functions as a distributed machine learning library and a decision tree ensemble implementation that utilizes leaf-wise growth and histogram-based feature binning. The framework is distinguished by its ability to offload heavy computations to CUDA or OpenCL devices for GPU acceleration and its capacity to parallelize training across multiple nodes using sockets, MPI, or Dask. It includes a specialized categorical feature processor that optimizes partitions for
Provides a framework for constructing gradient-boosted decision tree ensembles that can be trained across multiple nodes.
LightGBM is a high-performance machine learning framework designed for constructing gradient-boosted decision tree ensembles. It provides a platform for training classification, regression, and ranking models, with a focus on memory efficiency and large-scale distributed computing. The framework distinguishes itself through specialized algorithmic strategies, including leaf-wise tree growth and histogram-based decision learning, which prioritize convergence speed. It optimizes memory usage by bundling mutually exclusive features and employs gradient-based sampling to reduce training complexit
Constructs classification, regression, and ranking models using gradient-boosted decision tree ensembles.
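Histogram-based split finding, mentioned in the LightGBM descriptions above, can be sketched as follows. This is a simplified illustration, not LightGBM's code: it assumes equal-width bins, unit Hessians (squared-error loss), and hypothetical function names. Instead of testing every distinct feature value, gradients are accumulated per bin and only the bin boundaries are scanned.

```python
def build_histogram(values, grads, n_bins):
    """Accumulate gradient sums and sample counts into equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for v, g in zip(values, grads):
        b = min(int((v - lo) / width), n_bins - 1)
        sums[b] += g
        counts[b] += 1
    return lo, width, sums, counts


def best_split(values, grads, n_bins=8):
    """Scan bin boundaries and return (threshold, gain) for the best split."""
    lo, width, sums, counts = build_histogram(values, grads, n_bins)
    G, N = sum(sums), sum(counts)
    best_gain, best_thr = 0.0, None
    gl, nl = 0.0, 0
    for b in range(n_bins - 1):
        gl += sums[b]
        nl += counts[b]
        nr = N - nl
        if nl == 0 or nr == 0:
            continue
        gr = G - gl
        # Gain for squared loss with unit Hessians and no regularisation.
        gain = gl * gl / nl + gr * gr / nr - G * G / N
        if gain > best_gain:
            best_gain, best_thr = gain, lo + (b + 1) * width
    return best_thr, best_gain
```

The cost of the scan depends on the number of bins rather than the number of distinct values, which is the source of the speed and memory savings. A leaf-wise grower would repeatedly apply such a search to whichever leaf currently offers the largest gain.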
algorithm-ml is a machine learning ranking engine designed to personalize content feeds by calculating relevance scores for items based on user interests and historical interaction data. It functions as a recommendation system that processes user behavior and item metadata to determine the optimal order of content for individual users. The system utilizes a multi-stage ranking architecture that filters large pools of candidate items into smaller sets before applying computationally expensive scoring models. It employs gradient-boosted decision tree ensembles to capture non-linear relation
Employs gradient-boosted decision tree ensembles to capture non-linear relationships in engagement data.
AutoGluon is an automated machine learning framework designed to optimize model selection and hyperparameter tuning across tabular, text, image, and time series data. It functions as an ensemble learning library and a tabular data prediction engine, aiming to build high-accuracy predictive models without manual algorithm selection. The framework integrates multimodal machine learning pipelines that combine disparate data types into a single representation using specialized encoders. It also includes a probabilistic time series forecaster that fits multiple statistical and deep learning models
Combines diverse machine learning models through weighting and stacking to reduce variance and improve generalisation.
This is a cross-platform framework for building, training, and deploying custom machine learning models within the .NET ecosystem. It provides a predictive modeling engine for classification, regression, and forecasting tasks, alongside an inference runtime to generate predictions across different hardware architectures. The framework includes a gradient boosting library and supports interoperability with external models via a standardized open format. It features tools for prediction explainability, allowing the analysis of feature importance to debug model behavior and identify bias. The p
Includes a library for implementing classical machine learning tasks using optimized gradient boosting algorithms.
CatBoost is a gradient boosting machine learning library used to train decision tree ensembles for regression, classification, and ranking tasks. It functions as a high-performance framework that provides a categorical data processor for transforming non-numeric features, a distributed trainer for large-scale datasets, and GPU acceleration to speed up model construction. The library distinguishes itself through native handling of categorical data and text features, removing the need for manual encoding. It includes a specialized model interpretability tool that leverages SHAP values and featu
Provides a high-performance library for training gradient-boosted decision tree ensembles for regression, classification, and ranking.
This project is a machine learning implementation library featuring a collection of code examples that implement supervised, unsupervised, and reinforcement learning algorithms from scratch. It provides a comprehensive set of toolkits for core machine learning components, including a natural language processing toolkit, a reinforcement learning framework, and suites for data dimensionality reduction and pattern mining. The library includes specialized implementations for reinforcement learning, such as Q-Learning, Deep Q-Networks, and Actor-Critic agents. The natural language processing capab
Implements optimized gradient boosting decision trees for efficient regression and classification tasks.
This project is a deep learning implementation library and neural network theory repository. It translates mathematical derivations from textbooks and literature into functional Python code to demonstrate how deep learning algorithms work. The codebase focuses on low-level algorithm implementation by using numerical libraries instead of high-level deep learning frameworks. This approach maps theoretical mathematical proofs to executable functions to verify principles and expose the underlying arithmetic and data flow of neural networks. The project covers the implementation of deep learning
Implements multiple model combination methods through a technical codebase of ensemble learning algorithms.
Orange3 is a visual data mining platform that provides an interactive canvas for building data analysis workflows without writing code. At its core, it offers a widget-based visual programming environment where users connect configurable components to perform data preprocessing, machine learning model training, statistical evaluation, and interactive visualization. The platform is built on NumPy-backed data tables with domain descriptors that define variable names, types, and roles, and includes a lazy SQL query proxy for working with database tables without loading all data into memory. The
Fits a gradient boosting classifier using optimized tree construction for high performance.
mlxtend is a pure Python machine learning extension library that provides additional tools for association rule mining, ensemble learning, and feature selection. It is built on numpy and pandas, with all data operations accepting and returning pandas DataFrames, and custom estimators inherit from scikit-learn’s base classes to offer a uniform fit-predict interface compatible with grid search. The library implements the Apriori algorithm for mining frequent itemsets from transaction data and generating association rules with confidence and lift metrics. For classification, it combines multiple
Ships an ensemble learning library for combining models via voting, stacking, and bagging.
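Combining models by voting can be shown without any library. The sketch below does not use mlxtend's API: the function names are hypothetical, and each model is assumed to have already produced either a class label or a list of class probabilities for one sample.

```python
from collections import Counter


def hard_vote(labels):
    """Majority vote over the class labels predicted by each model."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(prob_rows, weights=None):
    """Weighted average of per-model class probabilities.

    prob_rows holds one probability list per model, all in the same class order.
    Returns the winning class index and the averaged probabilities.
    """
    weights = weights or [1.0] * len(prob_rows)
    total = sum(weights)
    n_classes = len(prob_rows[0])
    avg = [
        sum(w * p[c] for w, p in zip(weights, prob_rows)) / total
        for c in range(n_classes)
    ]
    winner = max(range(n_classes), key=avg.__getitem__)
    return winner, avg
```

Hard voting counts labels, while soft voting lets a confident model outweigh two lukewarm ones unless the weights say otherwise. Stacking goes one step further by training a second-level model on the base models' outputs instead of using fixed weights.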
SwanLab is an open-source machine learning experiment tracking platform and observability tool. It provides a centralized dashboard for logging training metrics, hyperparameters, and hardware performance to monitor and analyze AI model training runs. The platform is distinguished by its focus on self-hosted infrastructure, allowing users to deploy private instances via Docker or Kubernetes for secure on-premises data control. It also includes specialized utilities for migrating historical experiment logs and synchronizing real-time metrics from external tools like MLflow. The system covers a
Logs training metrics and model performance from XGBoost runs to a centralized dashboard.