What are the best open-source GitHub repositories for a gradient boosting library for tabular data?

Question 1

Accepted Answer

lightgbm-org/lightgbm is the closest match. LightGBM is a gradient boosting framework built for decision-tree ensembles on tabular data, with native support for categorical features, missing values, regularization, early stopping, GPU acceleration, and feature importance ranking, which directly matches this search for a dedicated gradient boosting library. Other strong matches: dmlc/xgboost, catboost/catboost, h2oai/h2o-3, microsoft/lightgbm.
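For readers unfamiliar with the terms in this answer, the following is a minimal sketch of gradient boosting with decision-tree base learners and early stopping, written in plain Python with no dependencies. It is an illustration only and is not how LightGBM or any repository listed here is implemented. It assumes squared-error loss, depth-1 trees (stumps), and purely numeric features. All names (`fit_stump`, `fit_gbm`, `predict`, `patience`) are hypothetical and belong to no library's API.

```python
from typing import List, Tuple

# A stump is (feature index, threshold, value if <= threshold, value if > threshold).
Stump = Tuple[int, float, float, float]
# A model is (base prediction, learning rate, list of stumps).
Model = Tuple[float, float, List[Stump]]


def fit_stump(X: List[List[float]], residuals: List[float]) -> Stump:
    """Exhaustively pick the single split that minimizes squared error."""
    mean = sum(residuals) / len(residuals)
    best: Stump = (0, float("inf"), mean, mean)  # fallback: no usable split
    best_err = float("inf")
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            err = sum((r - lmean) ** 2 for r in left)
            err += sum((r - rmean) ** 2 for r in right)
            if err < best_err:
                best_err, best = err, (j, thr, lmean, rmean)
    return best


def stump_value(stump: Stump, row: List[float]) -> float:
    j, thr, left_val, right_val = stump
    return left_val if row[j] <= thr else right_val


def mse(y: List[float], pred: List[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(y, pred)) / len(y)


def fit_gbm(X_train, y_train, X_val, y_val,
            n_rounds: int = 100, learning_rate: float = 0.5,
            patience: int = 5) -> Model:
    """Boost stumps on residuals; stop when validation error stalls."""
    base = sum(y_train) / len(y_train)
    pred_train = [base] * len(y_train)
    pred_val = [base] * len(y_val)
    stumps: List[Stump] = []
    best_val, best_len = mse(y_val, pred_val), 0
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual.
        residuals = [y - p for y, p in zip(y_train, pred_train)]
        stump = fit_stump(X_train, residuals)
        stumps.append(stump)
        pred_train = [p + learning_rate * stump_value(stump, row)
                      for p, row in zip(pred_train, X_train)]
        pred_val = [p + learning_rate * stump_value(stump, row)
                    for p, row in zip(pred_val, X_val)]
        val_err = mse(y_val, pred_val)
        if val_err < best_val:
            best_val, best_len = val_err, len(stumps)
        elif len(stumps) - best_len >= patience:
            break  # early stopping: no improvement for `patience` rounds
    return base, learning_rate, stumps[:best_len]


def predict(model: Model, row: List[float]) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, row) for s in stumps)
```

In practice the listed libraries replace each of these pieces with far more capable versions (deeper trees, histogram-based split finding, regularization terms, native categorical and missing-value handling), so this sketch is only useful for understanding the vocabulary, not for real workloads.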

Question 2

Why does lightgbm-org/lightgbm match “a gradient boosting library for tabular data”?

lightgbm-org · Accepted Answer

LightGBM is a gradient boosting framework built for decision-tree ensembles on tabular data, with native support for categorical features, missing values, regularization, early stopping, GPU acceleration, and feature importance ranking — directly matching this search for a dedicated gradient boosti…

Question 3

Why does dmlc/xgboost match “a gradient boosting library for tabular data”?

dmlc · Accepted Answer

XGBoost is a distributed gradient boosting library built specifically for decision tree ensembles on tabular data, supporting GPU acceleration, regularization, early stopping, categorical features, missing values, and feature importance — exactly matching the core capability and most sought-after f…

Question 4

Why does catboost/catboost match “a gradient boosting library for tabular data”?

catboost · Accepted Answer

CatBoost is a dedicated gradient boosting library designed specifically for tabular data with native handling of categorical features and missing values, plus GPU acceleration, regularization, early stopping, and built-in feature importance tools—exactly matching this search for a structured/tabula…

Question 5

Why does h2oai/h2o-3 match “a gradient boosting library for tabular data”?

h2oai · Accepted Answer

H2O-3 is a distributed machine learning platform with a mature gradient boosting (GBM) implementation that natively supports missing values, categorical features, regularization, early stopping, GPU acceleration, feature importance, and cross-validation, making it a comprehensive and production-rea…

Question 6

Why does microsoft/lightgbm match “a gradient boosting library for tabular data”?

microsoft · Accepted Answer

LightGBM is a high-performance gradient boosting library purpose-built for structured/tabular data, with native support for all the requested features including decision-tree base learners, missing value handling, categorical features, regularization, early stopping, GPU acceleration, feature impor…

Gradient Boosting Libraries for Tabular Data

lightgbm-org/LightGBM

dmlc/xgboost

catboost/catboost

h2oai/h2o-3

microsoft/LightGBM

rapidsai/cuml

scikit-learn/scikit-learn