# christophm/interpretable-ml-book

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/christophm-interpretable-ml-book).**

5,317 stars · 1,095 forks · Jupyter Notebook · License: NOASSERTION

## Links

- GitHub: https://github.com/christophM/interpretable-ml-book
- Homepage: https://christophm.github.io/interpretable-ml-book/
- awesome-repositories: https://awesome-repositories.com/repository/christophm-interpretable-ml-book.md

## Description

This project is a comprehensive educational resource and technical manual focused on interpretable machine learning and explainable AI. It serves as a textbook and reference for implementing techniques that make complex machine learning models transparent and understandable to humans.

The resource provides guidance on both building inherently transparent models, such as decision trees and sparse linear models, and applying post-hoc explanation methods to black-box systems. It details specific methodologies for quantifying feature importance, generating rationales for individual predictions, and using surrogate models to approximate complex decision-making processes.

The content covers global and local feature influence analysis, computer vision interpretability, and game-theoretic attribution methods such as Shapley values. It also addresses the evaluation of interpretability, debugging workflows that identify model shortcuts, and the design of transparent algorithm structures.
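
As a rough illustration of the game-theoretic idea behind Shapley values, the sketch below computes exact values for a toy model by averaging each feature's marginal contribution over all feature orderings. The `predict` function, the all-zero `baseline` used to represent "absent" features, and every name here are assumptions made for illustration, not code taken from the book.

```python
from itertools import permutations

def predict(x):
    # Hypothetical toy model with an interaction between features 0 and 1.
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[1] + 0.5 * x[2]

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all orderings.

    An 'absent' feature is replaced by its baseline value. This is a
    simplifying assumption; other value functions are possible.
    """
    n = len(instance)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for j in order:
            current[j] = instance[j]      # add feature j to the coalition
            updated = predict(current)
            phi[j] += updated - previous  # marginal contribution of feature j
            previous = updated
    return [total / len(orderings) for total in phi]

instance = [1.0, 2.0, 4.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(predict, instance, baseline)
```

The values sum to the difference between the prediction for `instance` and the prediction for `baseline`, and the interaction term is split evenly between features 0 and 1. Enumerating all orderings grows factorially with the number of features, so this approach is only practical for very small inputs.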

The project is implemented as a collection of Jupyter Notebooks.

## Tags

### Artificial Intelligence & ML

- [Model Explainability](https://awesome-repositories.com/f/artificial-intelligence-ml/model-predictions/model-explainability.md) — Provides a comprehensive set of techniques for interpreting model decisions and quantifying feature contributions to predictions. ([source](https://cdn.jsdelivr.net/gh/christophm/interpretable-ml-book@master/README.md))
- [Shapley Value Calculators](https://awesome-repositories.com/f/artificial-intelligence-ml/shapley-value-calculators.md) — Provides tools and algorithms for computing Shapley values to interpret model predictions. ([source](https://christophm.github.io/interpretable-ml-book/shapley.html))
- [Transparent Algorithm Design](https://awesome-repositories.com/f/artificial-intelligence-ml/transparent-algorithm-design.md) — Provides a comprehensive guide to building models with inherently interpretable structures.
- [Black Box Model Approximation](https://awesome-repositories.com/f/artificial-intelligence-ml/black-box-model-approximation.md) — Trains interpretable models to approximate and reveal the decision-making process of black-box systems. ([source](https://christophm.github.io/interpretable-ml-book/global.html))
- [Explainable AI](https://awesome-repositories.com/f/artificial-intelligence-ml/explainable-ai.md) — Serves as a technical manual for interpreting model decisions and quantifying feature importance.
- [Feature Contribution Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-contribution-analysis.md) — Provides methods for estimating the influence of individual features on model predictions using game-theoretic marginal contributions.
- [Feature Importance Attribution](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-importance-attribution.md) — Quantifies the relative contribution of individual input variables using Shapley values and permutation importance.
- [Feature Influence Visualization](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-influence-visualization.md) — Plots how changes in a single feature affect a prediction while keeping other features constant. ([source](https://christophm.github.io/interpretable-ml-book/ceteris-paribus.html))
- [Feature Interaction Analyzers](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-interaction-analyzers.md) — Provides tools to analyze higher-order dependencies and interaction effects between input features. ([source](https://christophm.github.io/interpretable-ml-book/ice.html))
- [Attribution Gradients](https://awesome-repositories.com/f/artificial-intelligence-ml/gradient-computation/attribution-gradients.md) — Attributes a prediction to input features by backpropagating the gradient of the model output to the input, highlighting influential input regions.
- [Sparse Linear Model Fitting](https://awesome-repositories.com/f/artificial-intelligence-ml/linear-regression-models/sparse-linear-model-fitting.md) — Combines features and decision rules into a linear model using regularization to maintain sparsity. ([source](https://christophm.github.io/interpretable-ml-book/rulefit.html))
- [Local Surrogate Models](https://awesome-repositories.com/f/artificial-intelligence-ml/local-surrogate-models.md) — Trains simple interpretable models to approximate and explain the behavior of complex black-box predictions.
- [Recursive Partitioning Trees](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/predictive-machine-learning-analytics/regression-ensembles/recursive-partitioning-trees.md) — Implements decision trees that split data recursively to create transparent hierarchical prediction paths.
- [ML Model Debugging Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/ml-model-debugging-tools.md) — Identifies shortcuts or errors in model behavior by detecting predictions that contradict domain knowledge. ([source](https://christophm.github.io/interpretable-ml-book/goals.html))
- [Model Behavioral Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/model-behavioral-analysis.md) — Describes average model behavior across a dataset using importance rankings and effect plots. ([source](https://christophm.github.io/interpretable-ml-book/overview.html))
- [Model Interpretability Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/model-interpretability-tools.md) — Provides a reference for implementing surrogate models and Shapley values to analyze black-box behavior.
- [Interpretable Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/model-interpretability/interpretable-model-training.md) — Details processes for fitting models that maintain human-understandable logic throughout the training phase. ([source](https://cdn.jsdelivr.net/gh/christophm/interpretable-ml-book@master/README.md))
- [Perturbation-Based Sampling](https://awesome-repositories.com/f/artificial-intelligence-ml/perturbation-based-sampling.md) — Uses synthetic data neighbors created by altering feature values to probe model stability and contributions.
- [Feature-Value Relationship Visualization](https://awesome-repositories.com/f/artificial-intelligence-ml/shapley-value-calculators/feature-value-relationship-visualization.md) — Produces plots mapping feature values to Shapley values to reveal input-output relationships. ([source](https://christophm.github.io/interpretable-ml-book/limo.html))
- [Additive-Component Decompositions](https://awesome-repositories.com/f/artificial-intelligence-ml/additive-component-decompositions.md) — Implements techniques to decompose complex prediction functions into additive components to isolate feature influence.
- [Concept](https://awesome-repositories.com/f/artificial-intelligence-ml/backbone-integrations/adapter-layers/bottleneck-layers/concept.md) — Maps inputs to a set of concepts in a bottleneck layer to enable counterfactual explanations. ([source](https://christophm.github.io/interpretable-ml-book/detecting-concepts.html))
- [Concept-Bottleneck Mappings](https://awesome-repositories.com/f/artificial-intelligence-ml/concept-bottleneck-mappings.md) — Details how to map high-dimensional inputs to human-understandable concepts to improve model transparency.
- [Evaluation Datasets](https://awesome-repositories.com/f/artificial-intelligence-ml/dataset-management/evaluation-datasets.md) — Supplies processed regression and classification datasets used for testing the transparency of models. ([source](https://christophm.github.io/interpretable-ml-book/data.html))
- [Representative Subset Selection Algorithms](https://awesome-repositories.com/f/artificial-intelligence-ml/dataset-subset-extractions/representative-subset-selection-algorithms.md) — Provides algorithms for selecting diverse and representative subsets from datasets to preserve distribution coverage. ([source](https://christophm.github.io/interpretable-ml-book/proto.html))
- [Decision List Construction](https://awesome-repositories.com/f/artificial-intelligence-ml/decision-list-construction.md) — Details the construction of decision lists that iteratively explain datasets through sequential rules. ([source](https://christophm.github.io/interpretable-ml-book/rules.html))
- [Decision Trees](https://awesome-repositories.com/f/artificial-intelligence-ml/decision-trees.md) — Implements decision tree models that map observations to target values via recursive splitting. ([source](https://christophm.github.io/interpretable-ml-book/tree.html))
- [Neuron Activation Visualization](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-architectures/visual-feature-extractors/neuron-activation-visualization.md) — Identifies inputs that maximize the activation of specific neurons to reveal learned visual patterns. ([source](https://christophm.github.io/interpretable-ml-book/cnn-features.html))
- [Dependent Feature Decomposition](https://awesome-repositories.com/f/artificial-intelligence-ml/dependent-feature-decomposition.md) — Uses functional ANOVA to decompose models with correlated features using hierarchical orthogonality. ([source](https://christophm.github.io/interpretable-ml-book/decomposition.html))
- [Domain Insight Extraction](https://awesome-repositories.com/f/artificial-intelligence-ml/domain-insight-extraction.md) — Provides methodologies to uncover scientific patterns or business drivers by analyzing relationships between input features and targets. ([source](https://christophm.github.io/interpretable-ml-book/goals.html))
- [Conditional Feature Importance](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-importance-attribution/sparse-and-dense-feature-importance/conditional-feature-importance.md) — Evaluates a feature's unique contribution by sampling from conditional distributions to account for dependencies. ([source](https://christophm.github.io/interpretable-ml-book/feature-importance.html))
- [Feature Interaction Models](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-interaction-models.md) — Implements models that capture combined effects between features through interaction terms and varying slopes. ([source](https://christophm.github.io/interpretable-ml-book/extend-lm.html))
- [Interaction Effect Calculation](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-interaction-models/interaction-effect-calculation.md) — Computes numerical values for feature interactions to reveal hidden dependencies in model predictions. ([source](https://christophm.github.io/interpretable-ml-book/interaction.html))
- [Causal Effect Decompositions](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-interaction-models/interaction-effect-calculation/causal-effect-decompositions.md) — Produces a functional decomposition to isolate feature effects without entangling correlated features. ([source](https://christophm.github.io/interpretable-ml-book/decomposition.html))
- [Second-Order Interaction Analyses](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-interaction-models/interaction-effect-calculation/second-order-interaction-analyses.md) — Identifies second-order effects by isolating interaction between two features from their main effects. ([source](https://christophm.github.io/interpretable-ml-book/ale.html))
- [Frequent Pattern Extraction](https://awesome-repositories.com/f/artificial-intelligence-ml/frequent-pattern-extraction.md) — Implements techniques to identify common combinations of feature values used to establish decision rules. ([source](https://christophm.github.io/interpretable-ml-book/rules.html))
- [Functional ANOVA Execution](https://awesome-repositories.com/f/artificial-intelligence-ml/functional-anova-execution.md) — Decomposes a function into orthogonal components to isolate interaction effects from main effects. ([source](https://christophm.github.io/interpretable-ml-book/decomposition.html))
- [Interpretability Assessments](https://awesome-repositories.com/f/artificial-intelligence-ml/interpretability-assessments.md) — Assesses explanation quality using tests with end users, laypersons, or proxy tasks. ([source](https://christophm.github.io/interpretable-ml-book/evaluation.html))
- [Logistic Regression Weight Interpretations](https://awesome-repositories.com/f/artificial-intelligence-ml/linear-regression-models/linear-model-interpretability/logistic-regression-weight-interpretations.md) — Translates model weights into odds ratios to determine how features affect binary outcomes. ([source](https://christophm.github.io/interpretable-ml-book/logistic.html))
- [L1 Regularization](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/regression-models/regularization-techniques/l1-regularization.md) — Uses L1 regularization to constrain the number of active features, ensuring human-readable model complexity.
- [Interpretability-Based Debugging](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-evaluation-and-validation/model-selection-and-validation/interpretability-based-debugging.md) — Identifies shortcuts or errors in model behavior by comparing predictions against domain knowledge and data distributions.
- [Marginal Effect Visualizations](https://awesome-repositories.com/f/artificial-intelligence-ml/marginal-effect-visualizations.md) — Plots the average relationship between a feature and the predicted outcome by marginalizing over the distribution of the other features. ([source](https://christophm.github.io/interpretable-ml-book/pdp.html))
- [Glassbox Model Construction](https://awesome-repositories.com/f/artificial-intelligence-ml/model-interpretability/glassbox-model-construction.md) — Provides guidance on constructing models that are inherently transparent and interpretable by design. ([source](https://christophm.github.io/interpretable-ml-book/decomposition.html))
- [Training Data Influence Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/model-interpretability/interpretable-model-training/training-data-influence-analysis.md) — Identifies which training examples most influenced specific predictions to debug model behavior. ([source](https://christophm.github.io/interpretable-ml-book/influential.html))
- [Concept Attribution Methods](https://awesome-repositories.com/f/artificial-intelligence-ml/model-predictions/concept-attribution-methods.md) — Quantifies how much a human-defined concept influences a model's prediction for a particular class. ([source](https://christophm.github.io/interpretable-ml-book/detecting-concepts.html))
- [Functional Decompositions](https://awesome-repositories.com/f/artificial-intelligence-ml/model-predictions/model-explainability/functional-decompositions.md) — Breaks complex models into a sum of additive components and interaction terms for transparency. ([source](https://christophm.github.io/interpretable-ml-book/decomposition.html))
- [Model Weight Visualizations](https://awesome-repositories.com/f/artificial-intelligence-ml/model-weight-visualizations.md) — Generates plots of coefficients and confidence intervals to identify significant prediction drivers. ([source](https://christophm.github.io/interpretable-ml-book/limo.html))
- [Neural Network Interpretability](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-interpretability.md) — Links activated neural network channels with human-understandable concepts to measure detection quality. ([source](https://christophm.github.io/interpretable-ml-book/cnn-features.html))
- [Nonlinear Relationship Modeling](https://awesome-repositories.com/f/artificial-intelligence-ml/nonlinear-relationship-modeling.md) — Fits non-linear patterns using feature transformations or smooth curves to represent complex real-world trends. ([source](https://christophm.github.io/interpretable-ml-book/extend-lm.html))
- [Explanation Property Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/perturbation-based-sampling/explanation-fidelity-evaluation/explanation-property-analysis.md) — Measures the effectiveness of explanations based on their expressive power and algorithmic complexity. ([source](https://christophm.github.io/interpretable-ml-book/evaluation.html))
- [Prediction Justification Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/prediction-justification-frameworks.md) — Provides frameworks for explaining the reasoning behind specific model outputs to ensure transparency and recourse. ([source](https://christophm.github.io/interpretable-ml-book/goals.html))
- [Prediction-Specific Rule Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/prediction-specific-rule-generators.md) — Provides techniques for generating rules that isolate the specific feature values driving a particular classification. ([source](https://christophm.github.io/interpretable-ml-book/anchors.html))
- [Architecture Interpretation](https://awesome-repositories.com/f/artificial-intelligence-ml/reasoning-models/internal-monologue-modeling/architecture-interpretation.md) — Analyzes internal architecture, weights, and gradients to uncover learned concepts within models. ([source](https://christophm.github.io/interpretable-ml-book/overview.html))
- [Saliency Map Localizations](https://awesome-repositories.com/f/artificial-intelligence-ml/saliency-map-localizations.md) — Generates coarse heatmaps by backpropagating gradients to identify the image regions driving a classification. ([source](https://christophm.github.io/interpretable-ml-book/pixel-attribution.html))
- [Pixel Saliency Maps](https://awesome-repositories.com/f/artificial-intelligence-ml/saliency-mapping/pixel-saliency-maps.md) — Calculates gradients of a class score to visualize image regions influencing a prediction. ([source](https://christophm.github.io/interpretable-ml-book/pixel-attribution.html))
- [Underrepresented Data Point Identification](https://awesome-repositories.com/f/artificial-intelligence-ml/underrepresented-data-point-identification.md) — Finds data instances poorly represented by prototypes to reveal gaps in data coverage. ([source](https://christophm.github.io/interpretable-ml-book/proto.html))
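
Several of the tags above refer to permutation importance. A minimal sketch of the idea, assuming a hypothetical already-fitted `model` and synthetic data: shuffle one feature column at a time and record how much the error increases. None of the names below come from the book's own code.

```python
import random

def model(row):
    # Hypothetical fitted model: strong use of feature 0, weak use of
    # feature 1, and no use of feature 2.
    return 3.0 * row[0] + 0.5 * row[1]

def mse(model, X, y):
    return sum((model(r) - t) ** 2 for r, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE after shuffling each feature column."""
    rng = random.Random(seed)
    base_error = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - base_error)
        importances.append(sum(increases) / n_repeats)
    return importances

data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [model(r) for r in X]  # noise-free targets, for illustration only
importances = permutation_importance(model, X, y)
```

Because the toy model ignores feature 2, shuffling it leaves the error unchanged and its importance is exactly zero, while feature 0 receives the largest value.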

### Data & Databases

- [Counterfactual Explanation Generations](https://awesome-repositories.com/f/data-databases/tabular-data-frameworks/tabular-predictive-models/tabular-explanations/counterfactual-explanation-generations.md) — Identifies the minimal changes to input features required to change a model's prediction. ([source](https://christophm.github.io/interpretable-ml-book/counterfactual.html))
- [Tabular Model Explanation](https://awesome-repositories.com/f/data-databases/tabular-data-frameworks/tabular-predictive-models/tabular-model-explanation.md) — Includes analysis of how numerical and categorical features contribute to predictions, particularly for tabular datasets. ([source](https://christophm.github.io/interpretable-ml-book/what-is-machine-learning.html))
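
The counterfactual idea above (the smallest input change that flips a prediction) can be sketched with a brute-force grid search. The credit model, the candidate grids, and the per-feature scales are all hypothetical, and practical methods typically use optimisation rather than exhaustive search.

```python
from itertools import product

def predict(income, debt):
    # Hypothetical credit model (values in thousands): 1 = approve, 0 = reject.
    return 1 if 2 * income - 5 * debt - 100 >= 0 else 0

def find_counterfactual(predict, instance, grids, scales, target):
    """Return the grid point closest to `instance` that yields `target`.

    Distance is a scaled L1 norm, so changes in features with different
    units are comparable. The scales are an assumption of this sketch.
    """
    best, best_distance = None, float("inf")
    for candidate in product(*grids):
        if predict(*candidate) != target:
            continue
        distance = sum(
            abs(c - x) / s for c, x, s in zip(candidate, instance, scales)
        )
        if distance < best_distance:
            best, best_distance = candidate, distance
    return best, best_distance

instance = (40, 10)                        # currently rejected
grids = [range(40, 101, 5), range(0, 11)]  # allowed income and debt values
scales = [60, 10]                          # width of each grid
counterfactual, distance = find_counterfactual(
    predict, instance, grids, scales, target=1
)
```

With these settings the closest approved point raises income to 75 and leaves debt unchanged, because under the chosen scaling a unit of debt reduction costs more distance than the income it saves.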

### Education & Learning Resources

- [Machine Learning Books](https://awesome-repositories.com/f/education-learning-resources/educational-resources/ai-learning-resources/ai-machine-learning-tutorials/machine-learning-books.md) — Provides a comprehensive textbook on techniques to make machine learning models transparent and understandable.
- [Machine Learning Educational Resources](https://awesome-repositories.com/f/education-learning-resources/machine-learning-educational-resources.md) — Offers theoretical and practical instructions for building inherently transparent models.
- [Global Impact Rankings](https://awesome-repositories.com/f/education-learning-resources/technical-domain-education/technical-academic-domains/algorithmic-design-analysis/algorithms-and-design-patterns/memory-efficient-algorithms/language-specific-memory-analysis/feature-impact-analysis/global-impact-rankings.md) — Ranks features by averaging absolute Shapley values to determine global model impact. ([source](https://christophm.github.io/interpretable-ml-book/shap.html))
- [Tree Model Interpretability](https://awesome-repositories.com/f/education-learning-resources/tree-data-structures/array-based-tree-modeling/tree-model-interpretability.md) — Provides exact attribution methods for decision trees and ensemble models like Random Forests or XGBoost. ([source](https://christophm.github.io/interpretable-ml-book/decomposition.html))
- [Visual Concept Discovery](https://awesome-repositories.com/f/education-learning-resources/visual-concept-discovery.md) — Clusters image segments to automatically identify influential visual concepts within a dataset. ([source](https://christophm.github.io/interpretable-ml-book/detecting-concepts.html))

### Scientific & Mathematical Computing

- [Feature Influence Quantifications](https://awesome-repositories.com/f/scientific-mathematical-computing/feature-weighting-algorithms/feature-influence-quantifications.md) — Implements techniques to quantify how specific input variables influence model predictions. ([source](https://christophm.github.io/interpretable-ml-book/limo.html))
- [Tree Ensemble Rule Extraction](https://awesome-repositories.com/f/scientific-mathematical-computing/symbolic-regression/tree-ensemble-rule-extraction.md) — Implements the extraction of binary interaction rules from complex tree ensembles. ([source](https://christophm.github.io/interpretable-ml-book/rulefit.html))

### Part of an Awesome List

- [Bayesian Rule List Learning](https://awesome-repositories.com/f/awesome-lists/ai/bayesian-machine-learning/bayesian-rule-list-learning.md) — Implements Bayesian rule list learning to create highly interpretable decision logic. ([source](https://christophm.github.io/interpretable-ml-book/rules.html))

### Software Engineering & Architecture

- [Leave-One-Feature-Out Importance](https://awesome-repositories.com/f/software-engineering-architecture/modular-feature-architectures/tree-shakable-feature-imports/leave-one-feature-out-importance.md) — Measures feature contributions by retraining the model without specific features and comparing performance. ([source](https://christophm.github.io/interpretable-ml-book/lofo.html))

### Testing & Quality Assurance

- [Explanation Fidelity Testing](https://awesome-repositories.com/f/testing-quality-assurance/model-testing/model-evaluation/explanation-fidelity-testing.md) — Validates the faithfulness of model explanations by checking accuracy, fidelity, and consistency. ([source](https://christophm.github.io/interpretable-ml-book/evaluation.html))

### User Interface & Experience

- [Individual Conditional Expectation Curves](https://awesome-repositories.com/f/user-interface-experience/feature-interaction-visualizations/partial-dependence-plots/individual-conditional-expectation-curves.md) — Provides visualizations showing how predictions change for individual data points as specific features vary. ([source](https://christophm.github.io/interpretable-ml-book/ice.html))
- [Partial Dependence Importance Metrics](https://awesome-repositories.com/f/user-interface-experience/feature-interaction-visualizations/partial-dependence-plots/partial-dependence-importance-metrics.md) — Quantifies feature importance by measuring the variance of its partial dependence curve. ([source](https://christophm.github.io/interpretable-ml-book/pdp.html))
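
The two entries above can be illustrated together. In this sketch, ICE curves are computed by varying one feature over a grid for each instance, the partial dependence curve is their pointwise average, and the spread of that curve serves as an importance score. The toy `model` and data are hypothetical. The standard deviation is used as the spread measure here; the variance would rank features identically.

```python
from statistics import mean, stdev

def model(row):
    # Hypothetical model: the effect of feature 0 depends on feature 1,
    # and feature 2 is unused.
    return 2.0 * row[0] + row[0] * row[1]

def ice_curves(model, X, feature, grid):
    """One curve per instance: vary `feature` over `grid`, hold the rest fixed."""
    curves = []
    for row in X:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value
            curve.append(model(modified))
        curves.append(curve)
    return curves

def partial_dependence(curves):
    # The partial dependence curve is the pointwise average of the ICE curves.
    return [mean(points) for points in zip(*curves)]

X = [[0.0, 1.0, 3.0], [1.0, -1.0, 4.0], [2.0, 0.0, 5.0]]

ice = ice_curves(model, X, feature=0, grid=[0.0, 1.0, 2.0])
pdp = partial_dependence(ice)
pd_importance_f0 = stdev(pdp)

pdp_f2 = partial_dependence(ice_curves(model, X, feature=2, grid=[3.0, 4.0, 5.0]))
pd_importance_f2 = stdev(pdp_f2)
```

The three ICE curves for feature 0 have different slopes because of the interaction with feature 1, which the averaged curve hides. The curve for the unused feature 2 is flat, so its importance is zero.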
