30 open-source projects similar to interpretml/interpret, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Interpret alternative.
SHAP is a machine learning explainer that uses a game-theoretic framework to estimate the contribution of each feature to a model prediction. It provides a set of tools for quantifying how individual input features push a specific output away from a baseline value. The project includes specialized explainers for different architectures, including high-speed implementations for decision trees and ensemble models, linearization algorithms for deep learning networks, and covariance integration for linear models. It also features a model-agnostic interpretability tool that uses a kernel method to
This project is a model-agnostic interpretability framework and explainability tool designed to provide local interpretable explanations for individual predictions. It functions as a local surrogate model that approximates the behavior of any machine learning classifier or regression model to identify the most influential features for a specific instance. Because it is model-agnostic, it can explain predictions across tabular, text, and image data regardless of the underlying architecture. It employs local linear approximations and feature importance visualization t
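The local surrogate idea described above can be illustrated with a minimal sketch. It is not the project's own implementation: the function names (`local_surrogate`, `solve_linear_system`, `black_box`), the Gaussian perturbation scheme, and the exponential proximity kernel are all assumptions chosen for illustration. The sketch perturbs one instance, weights each perturbed sample by its closeness to the instance, and fits a weighted linear model whose slopes act as local feature importances.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def local_surrogate(predict, instance, n_samples=500, scale=0.1,
                    kernel_width=0.25, seed=0):
    """Fit a proximity-weighted linear model around `instance`.

    Hypothetical helper: returns (intercept, slopes), where each slope
    approximates the local effect of one feature on the prediction.
    """
    rng = random.Random(seed)
    d = len(instance)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]  # X^T W X
    xtwy = [0.0] * (d + 1)                          # X^T W y
    for _ in range(n_samples):
        sample = [v + rng.gauss(0.0, scale) for v in instance]
        offsets = [s - v for s, v in zip(sample, instance)]
        dist_sq = sum(o * o for o in offsets)
        weight = math.exp(-dist_sq / kernel_width ** 2)  # closer samples count more
        row = [1.0] + offsets  # intercept column plus centered features
        y = predict(sample)
        for i in range(d + 1):
            xtwy[i] += weight * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += weight * row[i] * row[j]
    coef = solve_linear_system(xtwx, xtwy)
    return coef[0], coef[1:]


def black_box(features):
    """Stand-in for an opaque model: nonlinear in x0, linear in x1."""
    x0, x1 = features
    return x0 ** 2 + 3.0 * x1


intercept, slopes = local_surrogate(black_box, [1.0, 0.0])
print(intercept, slopes)
```

Around the instance `[1.0, 0.0]` the fitted slopes should land near the true local gradient of the stand-in model (2 for the first feature, 3 for the second), which is what a local linear explanation is meant to recover.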
CatBoost is a gradient boosting machine learning library used to train decision tree ensembles for regression, classification, and ranking tasks. It functions as a high-performance framework that provides a categorical data processor for transforming non-numeric features, a distributed trainer for large-scale datasets, and GPU acceleration to speed up model construction. The library distinguishes itself through native handling of categorical data and text features, removing the need for manual encoding. It includes a specialized model interpretability tool that leverages SHAP values and featu
This project is a comprehensive educational resource and technical manual focused on interpretable machine learning and explainable AI. It serves as a textbook and reference for implementing techniques that make complex machine learning models transparent and understandable to humans. The resource provides guidance on both building inherently transparent models, such as decision trees and sparse linear models, and applying post-hoc explanation methods to black-box systems. It details specific methodologies for quantifying feature importance, generating rationales for individual predictions, a
LIT is a machine learning interpretability framework and model debugging tool designed to analyze model behavior and performance. It serves as an interpretability dashboard for large language models and a general performance analyzer for text, image, and tabular datasets. The project distinguishes itself through a comprehensive suite of interpretability tools, including salience map generation for feature attribution, the creation of synthetic and counterfactual examples to test robustness, and the projection of high-dimensional embeddings into visual spaces via UMAP or PCA. It further enable
XGBoost is a distributed machine learning library for implementing scalable gradient boosting decision trees used for regression, classification, and ranking. It functions as a predictive model framework and a cross-language toolkit, providing a core implementation with native bindings for Python, R, Java, Scala, and C++. The system is designed as a GPU-accelerated library that utilizes CUDA and NCCL to speed up the training of decision tree ensembles. It operates as a distributed framework capable of scaling training and prediction across multi-node clusters and GPU environments to process m
Algorithms for explaining machine learning models
Captum is an open-source library for explaining model predictions by attributing them to input features, neurons, and layers using gradient-based and perturbation-based methods. It provides a modular framework for implementing, evaluating, and combining a range of explanation techniques, including gradient-based attribution, perturbation-based analysis, game-theoretic Shapley value approximation, and surrogate model explanations, with support for parallelization and noise stabilization. The library distinguishes itself through its breadth of attribution methods and its support for advanced in
SHAP is an explainable AI toolkit that provides a game theoretic framework for interpreting machine learning model predictions. It functions as a feature attribution engine, decomposing model outputs into the sum of individual feature effects to clarify how specific input variables influence a final decision. By assigning importance values to these inputs, the library enables users to understand the logic behind complex predictive models. The project distinguishes itself through its versatility and specialized calculation methods. It operates as a model-agnostic diagnostic library, capable of
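The additive decomposition mentioned above can be made concrete with a small, library-free sketch. It does not use the SHAP package; `shapley_values` and `toy_model` are hypothetical names, and "removing" a feature is modeled by substituting a baseline value, which is one common convention and an assumption here. The code enumerates every coalition of features, so it is only practical for a handful of inputs.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values by enumerating all feature coalitions.

    Features outside a coalition take their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(features):
    """Stand-in model: one additive term and one interaction term."""
    a, b, c = features
    return 2.0 * a + b * c


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
print(sum(phi), toy_model(x) - toy_model(baseline))
```

The attributions sum to the gap between the prediction and the baseline output, so each value reads as how far that feature pushed the prediction away from the baseline. In this toy case the interaction term `b * c` is split evenly between the two features involved.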
MMLSpark is a distributed framework for executing machine learning models, data transformations, and AI service integrations across Apache Spark clusters. It functions as a distributed machine learning library and pipeline orchestrator, allowing users to integrate pre-trained cognitive services and custom models into large-scale batch and streaming workflows. The project is distinguished by its ability to incorporate external AI services and web APIs directly into big data pipelines for text and vision analysis. It provides a scalable model training framework that coordinates gradient boostin
The official implementation of "The Shapley Value of Classifiers in Ensemble Games" (CIKM 2021).
nlp-recipes is a collection of implementation guides and reference templates for applying natural language processing techniques to real-world tasks. It provides standardized workflows and code examples for developing NLP pipelines, from dataset preparation and model training to performance evaluation. The project focuses on the practical application of transformer-based models, offering patterns for fine-tuning pretrained architectures for tasks such as text classification, named entity recognition, and question answering. It also includes a toolkit for model interpretability, allowing users
This toolkit serves as a framework for interpreting the decision-making processes of graph neural networks. It functions as a library for analyzing how these models process complex network data, providing methods to identify the specific node attributes and structural patterns that influence predictive outcomes. The project distinguishes itself by employing mask-optimized subgraph extraction and gradient-based attribution mapping to isolate the minimal components of a graph that preserve a model's original prediction. By separating graph processing layers from explanation logic, the architect
This is a cross-platform framework for building, training, and deploying custom machine learning models within the .NET ecosystem. It provides a predictive modeling engine for classification, regression, and forecasting tasks, alongside an inference runtime to generate predictions across different hardware architectures. The framework includes a gradient boosting library and supports interoperability with external models via a standardized open format. It features tools for prediction explainability, allowing the analysis of feature importance to debug model behavior and identify bias. The p
AutoGluon is an automated machine learning framework and multimodal library designed to automate the end-to-end pipeline from data preprocessing to high-accuracy model training and validation. It functions as an automated model trainer for tabular, image, text, and time series data, as well as a tool for time series forecasting and foundation model finetuning. The project is distinguished by its ability to jointly process and fuse different data types, allowing for the construction of multimodal neural networks that integrate images, text, and structured tables. It supports zero-shot inferenc
PyCaret is a Python AutoML platform and MLOps lifecycle manager designed to automate machine learning workflows. It functions as a low-code environment that leverages a scikit-learn native engine to execute preprocessing, training, and evaluation for tabular data. The platform distinguishes itself as an LLM-powered ML copilot, using large language model agents to analyze datasets, design experiment configurations, and explain model results. It also serves as a Kubernetes ML orchestrator and model registry, enabling the versioning of trained pipelines and their promotion to production API endp
LightGBM is a gradient boosting framework used to train decision tree ensembles for classification, regression, and ranking tasks. It functions as a distributed machine learning library and a decision tree ensemble implementation that utilizes leaf-wise growth and histogram-based feature binning. The framework is distinguished by its ability to offload heavy computations to CUDA or OpenCL devices for GPU acceleration and its capacity to parallelize training across multiple nodes using sockets, MPI, or Dask. It includes a specialized categorical feature processor that optimizes partitions for
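Histogram-based binning, as mentioned above, can be sketched in a few lines. This is an illustration of the general technique, not LightGBM's actual algorithm: the equal-width bins, the unregularized squared-error gain formula, and the name `histogram_best_split` are all assumptions. The point is that once gradients are accumulated per bin, candidate splits are scanned over bin boundaries rather than over every distinct feature value.

```python
def histogram_best_split(values, gradients, n_bins=4):
    """Find the best split threshold for one feature using a gradient histogram.

    Hypothetical helper: returns (threshold, gain), or (None, 0.0) if no
    split improves on leaving the node unsplit.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0.0:
        width = 1.0  # all values identical; everything falls in bin 0

    # Accumulate gradient sums and sample counts per bin.
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)
        grad_sum[b] += g
        count[b] += 1

    total_g, total_n = sum(grad_sum), len(values)
    best_threshold, best_gain = None, 0.0
    left_g, left_n = 0.0, 0
    # Scan bin boundaries from left to right.
    for b in range(n_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_g, right_n = total_g - left_g, total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        gain = (left_g ** 2 / left_n + right_g ** 2 / right_n
                - total_g ** 2 / total_n)
        if gain > best_gain:
            best_threshold, best_gain = lo + (b + 1) * width, gain
    return best_threshold, best_gain


values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
gradients = [-1.0, -1.0, -1.0, -1.0, 1.0, 1.0, 1.0, 1.0]
threshold, gain = histogram_best_split(values, gradients)
print(threshold, gain)
```

With four bins, only three boundaries are evaluated regardless of how many rows the feature has, which is where the speedup of histogram-based training comes from.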
Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
A library for debugging/inspecting machine learning classifiers and explaining their predictions
⬛ Python Individual Conditional Expectation Plot Toolbox
🔅 Shapash: User-friendly Explainability and Interpretability to Develop Reliable and Transparent Machine Learning Models
Source code/webpage/demos for the What-If Tool
Lucid is a TensorFlow interpretability toolkit and visualization library designed to analyze the internal representations of neural networks. It functions as a gradient-based optimization framework that generates images and atlases to reveal the features learned by specific neurons and layers. The library enables the creation of activation atlases and the mapping of high-dimensional neural activations into lower-dimensional spaces to study model behavior. It utilizes differentiable image parametrization to optimize visual inputs that maximally activate network components. The system covers a
A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
cuML is a GPU-accelerated machine learning library and framework that uses CUDA to accelerate tabular data preprocessing and model execution. It provides a suite of tools for training and deploying classification, regression, and clustering models on NVIDIA GPUs and GPU clusters. The library is designed for scalability, offering a distributed GPU machine learning environment that can spread computation and data across multiple hardware accelerators and nodes to handle datasets exceeding single-device memory. It mirrors standard estimator interfaces to allow the replacement of CPU-based models
Brain is a JavaScript library for building, training, and running feed-forward neural networks. It implements a multilayer perceptron model designed for pattern recognition and function approximation. The library includes a standalone inference engine that converts trained models into portable JavaScript functions. This allows predictions to be executed in browser or Node.js environments without requiring the original library dependencies. The system supports persistent model management through JSON serialization for saving and loading network weights. It also provides a streaming mechanism