1 repository
Automated handling and transformation of non-numeric features for machine learning training.
Distinct from Data Processing: Specifically targets categorical feature handling for ML rather than general data pipeline processing.
CatBoost is a gradient boosting machine learning library used to train decision tree ensembles for regression, classification, and ranking tasks. It functions as a high-performance framework that provides a categorical data processor for transforming non-numeric features, a distributed trainer for large-scale datasets, and GPU acceleration to speed up model construction. The library distinguishes itself through native handling of categorical data and text features, removing the need for manual encoding. It includes a specialized model interpretability tool that leverages SHAP values and featu
Automatically processes non-numeric features during training to remove the need for manual data encoding.
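As a rough illustration of what "no manual encoding" can look like in practice, the sketch below passes raw string columns to CatBoost and only tells it which column indices are categorical. The toy data, the helper names (make_toy_data, train_on_raw_categories), and the hyperparameter values are assumptions made for this example and are not taken from the listing. It assumes the catboost package is installed.

```python
from typing import List, Tuple


def make_toy_data() -> Tuple[List[list], List[int], List[int]]:
    """Build a hypothetical toy dataset.

    Columns 0 and 1 hold raw string categories; column 2 is numeric.
    """
    rows = [
        ["red", "small", 1.0],
        ["blue", "large", 3.5],
        ["red", "large", 2.0],
        ["green", "small", 0.5],
        ["blue", "small", 1.5],
        ["green", "large", 4.0],
        ["red", "small", 0.8],
        ["blue", "large", 3.0],
    ]
    labels = [0, 1, 1, 0, 0, 1, 0, 1]
    cat_feature_indices = [0, 1]  # positions of the non-numeric columns
    return rows, labels, cat_feature_indices


def train_on_raw_categories(rows, labels, cat_feature_indices):
    """Fit a small classifier without one-hot or label encoding the strings."""
    # Imported here so the data helper above works even without catboost.
    from catboost import CatBoostClassifier, Pool

    # cat_features marks which columns CatBoost should treat as categorical.
    train_pool = Pool(data=rows, label=labels, cat_features=cat_feature_indices)

    # Small, arbitrary settings chosen only to keep the example fast.
    model = CatBoostClassifier(iterations=20, depth=2, verbose=False)
    model.fit(train_pool)
    return model
```

To try it, call make_toy_data(), pass the three returned values to train_on_raw_categories, and then call model.predict on a row such as ["red", "large", 2.5]. The string values go in unchanged; the only extra information supplied is the list of categorical column indices.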