# rasbt/python-machine-learning-book

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/rasbt-python-machine-learning-book).**

12,614 stars · 4,383 forks · Jupyter Notebook · MIT

## Links

- GitHub: https://github.com/rasbt/python-machine-learning-book
- awesome-repositories: https://awesome-repositories.com/repository/rasbt-python-machine-learning-book.md

## Description

This project is an educational resource that provides practical code examples and implementations of machine learning algorithms in Python. It serves as a guide to building predictive pipelines, clustering models, and dimensionality reduction workflows within the Scikit-Learn ecosystem.

The repository includes comprehensive demonstrations for supervised and unsupervised learning, as well as detailed examples for implementing neural networks and deep architectures. It also provides practical guidance on exporting model parameters to JSON and wrapping trained models in web APIs for production deployment.

The content covers a broad range of capabilities including data preprocessing, feature engineering, and model evaluation. It details various modeling approaches such as ensemble learning, natural language processing, reinforcement learning, and mathematical optimization.

The materials are presented as a collection of Jupyter Notebooks and code implementations.

## Tags

### Education & Learning Resources

- [Machine Learning Educational Resources](https://awesome-repositories.com/f/education-learning-resources/machine-learning-educational-resources.md) — Provides a comprehensive collection of annotated code samples for learning machine learning algorithms and workflows.
- [Scikit-Learn Examples](https://awesome-repositories.com/f/education-learning-resources/supervised-learning-examples/scikit-learn-examples.md) — Provides a practical guide for building predictive pipelines and clustering models using the Scikit-Learn ecosystem.
- [Deep Neural Network Training Optimization](https://awesome-repositories.com/f/education-learning-resources/technical-interview-preparation/ml-interview-preparation/deep-learning-review/deep-neural-network-training-optimization.md) — Demonstrates the use of backpropagation and optimization algorithms to train deep networks and mitigate vanishing gradients. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/difference-deep-and-normal-learning.md))
- [Softmax Regression](https://awesome-repositories.com/f/education-learning-resources/educational-resources/systems-applied-computing/machine-learning-education/softmax-regression.md) — Provides implementation and theory for softmax regression models used in multi-class classification. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/bonus/softmax-regression.ipynb))
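
As a rough illustration of the softmax regression entry above, the following standard-library sketch shows how raw class scores can be turned into class probabilities. It is not code from the repository; the function names `softmax` and `predict` are hypothetical and chosen only for this example.

```python
import math


def softmax(scores):
    """Convert raw class scores (logits) into probabilities that sum to 1."""
    # Subtracting the largest score avoids overflow in exp() and does not
    # change the resulting probabilities.
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return the index of the class with the highest probability."""
    probs = softmax(scores)
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical scores for a three-class problem.
probs = softmax([2.0, 1.0, 0.1])
label = predict([2.0, 1.0, 0.1])
```

In a full softmax regression model the scores would come from a learned weight matrix and bias vector; that training step is omitted here.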

### Artificial Intelligence & ML

- [Cost Functions](https://awesome-repositories.com/f/artificial-intelligence-ml/cost-functions.md) — Implements mathematical objectives to measure and minimize the discrepancy between predicted and actual values. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch02))
- [Data Preparation](https://awesome-repositories.com/f/artificial-intelligence-ml/data-preparation.md) — Provides utilities for cleaning, normalizing, and structuring datasets to ensure compatibility with predictive models. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch01))
- [Data Processing Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/data-processing-pipelines.md) — Constructs comprehensive data processing pipelines for transforming and cleaning raw data for ML models. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/ml-curriculum.md))
- [Dataset Splitting Utilities](https://awesome-repositories.com/f/artificial-intelligence-ml/dataset-management/evaluation-datasets/dataset-splitting-utilities.md) — Provides utilities for partitioning datasets into separate training and testing sets to evaluate generalization. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch04))
- [Data Cleansing](https://awesome-repositories.com/f/artificial-intelligence-ml/dataset-preparation-utilities/data-cleansing.md) — Implements processes for removing duplicates and handling missing values to ensure high dataset quality. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/dataprep-vs-dataengin.md))
- [Dimensionality Reduction Techniques](https://awesome-repositories.com/f/artificial-intelligence-ml/dimensionality-reduction-techniques.md) — Implements algorithms and methods for reducing the number of input variables to prevent overfitting and improve efficiency. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/classifier-history.md))
- [Ensemble Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-learning.md) — Combines multiple learning algorithms through bagging, boosting, and stacking to create stronger predictive models. ([source](https://github.com/rasbt/python-machine-learning-book#readme))
- [Feature Engineering](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-engineering.md) — Provides techniques for feature engineering including dimensionality reduction and scaling to optimize model performance. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/dataprep-vs-dataengin.md))
- [Feature Scale Normalization](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-scale-normalization.md) — Implements feature scaling techniques to standardize data ranges for stable gradient descent convergence.
- [Kernel-Based Feature Mapping](https://awesome-repositories.com/f/artificial-intelligence-ml/kernel-based-feature-mapping.md) — Demonstrates the use of kernel-based feature mapping to resolve non-linear patterns in datasets.
- [Machine Learning Workflow Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning-workflow-libraries.md) — Provides standardized pipelines for sequencing data preprocessing and model training for consistent workflows. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/bonus/svm_iris_pipeline_and_gridsearch.ipynb))
- [Multi-Layer Perceptrons](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/neural-network-layers/convolution-layers/layered-architectures/multi-layer-perceptrons.md) — Implements feedforward neural networks with multiple layers to model complex non-linear functions and classify high-dimensional data. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch12))
- [Hyperparameter Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/model-fine-tuning-resources/hyperparameter-tuning.md) — Guides the tuning of model hyperparameters with techniques such as k-fold cross-validation to minimize generalization error. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/evaluate-a-model.md))
- [K-Fold Cross-Validation](https://awesome-repositories.com/f/artificial-intelligence-ml/model-validation-tools/cross-validation-utilities/k-fold-cross-validation.md) — Implements k-fold cross-validation to estimate the generalization performance of predictive models.
- [Text Vectorizations](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-processing/text-vectorizations.md) — Converts text documents into numerical feature vectors using techniques like TF-IDF and bag-of-words. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/bag-of-words-sparsity.md))
- [Neural Network Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-implementations.md) — Provides low-level implementations of artificial neural networks with multiple hidden layers built from scratch. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/difference-deep-and-normal-learning.md))
- [Neural Network Training](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-training.md) — Implements the process of constructing and training neural networks for pattern recognition tasks like image identification. ([source](https://github.com/rasbt/python-machine-learning-book#readme))
- [Gradient Descent Algorithms](https://awesome-repositories.com/f/artificial-intelligence-ml/optimization-algorithms/gradient-descent-algorithms.md) — Implements iterative optimization algorithms that update model weights by following the negative gradient. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/linear-gradient-derivative.md))
- [Example Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/python-machine-learning-libraries/example-implementations.md) — Offers a wide collection of Python-based implementations for supervised and unsupervised learning algorithms.
- [Similarity-Based Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/similarity-based-clustering.md) — Implements various similarity-based grouping methods including centroid and density-based clustering. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch11))
- [Supervised Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-learning.md) — Provides a comprehensive guide to training models on labeled data for classification and regression tasks. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch01))
- [Categorical Encodings](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-numeric-transformations/categorical-encodings.md) — Transforms categorical features into numerical formats using mapping and one-hot encoding for model compatibility. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch04))
- [Unsupervised Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/unsupervised-learning.md) — Covers algorithms for discovering hidden patterns and structures in unlabeled datasets through clustering and dimensionality reduction. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch01))
- [Anomaly Detection](https://awesome-repositories.com/f/artificial-intelligence-ml/anomaly-detection.md) — Provides algorithms for identifying outliers and rare events that deviate from the norm. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/datamining-overview.md))
- [Association Rule Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/association-rule-learning.md) — Demonstrates association rule learning to uncover frequent relationships and co-occurrences in large datasets. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/datamining-overview.md))
- [Embedding-Based Feature Selection](https://awesome-repositories.com/f/artificial-intelligence-ml/automated-feature-selection-tools/embedding-based-feature-selection.md) — Performs feature selection during training by incorporating penalties like L1 regularization to induce sparsity in parameters. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/feature_sele_categories.md))
- [Filter-Based Feature Selection](https://awesome-repositories.com/f/artificial-intelligence-ml/automated-feature-selection-tools/filter-based-feature-selection.md) — Identifies useful data attributes by calculating statistical measures like variance or correlation independently of the learning model. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/feature_sele_categories.md))
- [Hyperparameter Evaluation Loops](https://awesome-repositories.com/f/artificial-intelligence-ml/autonomous-agent-loops/research-quality-refinement-loops/evaluator-optimizer-loops/hyperparameter-evaluation-loops.md) — Implements nested-loop model selection to optimize hyperparameters while preventing data leakage.
- [Bias and Variance Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/bias-and-variance-analysis.md) — Analyzes the trade-off between model bias and variance using learning and validation curves to improve generalization. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch06))
- [Class Probability Estimation](https://awesome-repositories.com/f/artificial-intelligence-ml/class-probability-estimation.md) — Implements class probability estimation using sigmoid functions to determine the likelihood of class membership. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/logistic-why-sigmoid.md))
- [Classifier Accuracy Metrics](https://awesome-repositories.com/f/artificial-intelligence-ml/classifier-accuracy-metrics.md) — Calculates precision, recall, and F1 scores to evaluate the accuracy and balance of classifiers. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/computing-the-f1-score.md))
- [Decision Boundary Visualizations](https://awesome-repositories.com/f/artificial-intelligence-ml/decision-boundary-visualizations.md) — Implements visualizations of the spatial boundaries where a classifier changes its prediction to illustrate feature space partitioning. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/naive-bayes-boundary.md))
- [Decision Trees](https://awesome-repositories.com/f/artificial-intelligence-ml/decision-trees.md) — Provides implementations of decision trees that maximize information gain using Gini impurity or entropy. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/decision-tree-binary.md))
- [Bagging Ensembles](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-learning/bagging-ensembles.md) — Implements bagging ensembles that train base models on random bootstrap samples to reduce variance. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch07))
- [Classifier Stacking](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-learning/classifier-stacking.md) — Trains a meta-classifier to learn the optimal weights for combining multiple models. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/logistic-boosting.md))
- [Incremental Model Updating](https://awesome-repositories.com/f/artificial-intelligence-ml/incremental-updates/incremental-model-updating.md) — Implements online learning techniques that update model weights incrementally using mini-batches of data. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/choosing-technique.md))
- [Input Variable Standardization](https://awesome-repositories.com/f/artificial-intelligence-ml/input-variable-standardization.md) — Centers variables at a mean of zero and scales them to a unit standard deviation to optimize gradient descent. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/pearson-r-vs-linear-regr.md))
- [Cluster Count Selection Methods](https://awesome-repositories.com/f/artificial-intelligence-ml/k-means-clustering/cluster-count-selection-methods.md) — Details methods for determining the optimal number of clusters using elbow plots and silhouette analysis. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch11))
- [K-Nearest Neighbor Classifiers](https://awesome-repositories.com/f/artificial-intelligence-ml/k-nearest-neighbor-classifiers.md) — Implements k-nearest neighbor classifiers that assign classes based on the majority vote of neighboring samples. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/lazy-knn.md))
- [Linear Classifiers](https://awesome-repositories.com/f/artificial-intelligence-ml/linear-classifiers.md) — Implements linear classifiers that separate data classes using weight matrices and bias vectors. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/classifier-history.md))
- [Linear Regression](https://awesome-repositories.com/f/artificial-intelligence-ml/linear-regression.md) — Provides implementations of linear regression for predicting continuous target variables. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch10))
- [Logistic Regression Models](https://awesome-repositories.com/f/artificial-intelligence-ml/logistic-regression-models.md) — Implements logistic regression models to predict binary outcomes using the sigmoid function. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/logistic_regression_linear.md))
- [Machine Learning Classification](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning-classification.md) — Provides examples of training models to assign predefined labels to data points through classification. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch13))
- [Regularization Techniques](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/regression-models/regularization-techniques.md) — Demonstrates regularization techniques, such as penalty terms, to prevent overfitting in regression models. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/regularized-logistic-regression-performance.md))
- [Boosting Algorithms](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training/boosting-algorithms.md) — Provides implementations of boosting algorithms like AdaBoost that iteratively reduce model bias. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/bagging-boosting-rf.md))
- [Mathematical Training Objectives](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training/objectives-and-optimization/mathematical-training-objectives.md) — Provides implementations of mathematical targets and cost functions used to train various machine learning models. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/cost-vs-loss.md))
- [Machine Learning Model APIs](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-deployment-and-serving/inference-servers-and-runtimes/machine-learning-model-apis.md) — Provides practical guidance on wrapping trained machine learning models in web APIs for production deployment. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch09))
- [Training Set Size Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-evaluation-analysis/machine-learning-evaluation/visual-model-evaluators/training-set-size-analysis.md) — Plots learning curves of accuracies against training set size to determine if more data improves performance. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/ml-solvable.md))
- [Model Deployment](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/fine-tuning-and-customization/model-fine-tuning/pre-trained-model-zoos/model-deployment.md) — Provides guidance on wrapping trained models in lightweight web servers for real-time prediction interfaces. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/bonus))
- [Learning Rate Schedulers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/model-fine-tuning-adaptation/learning-rate-schedulers.md) — Demonstrates how to adjust learning rates during training using adaptive decay and momentum to optimize convergence. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/neuralnet-error.md))
- [Model Behavior Visualizations](https://awesome-repositories.com/f/artificial-intelligence-ml/model-behavior-visualizations.md) — Provides visual tools to illustrate how different machine learning models partition feature spaces and separate data classes. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/clf-behavior-data.md))
- [Overfitting Debuggers](https://awesome-repositories.com/f/artificial-intelligence-ml/model-debugging-utilities/overfitting-debuggers.md) — Implements tools for detecting generalization gaps by comparing training and test performance. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/overfitting.md))
- [Grid Search Executors](https://awesome-repositories.com/f/artificial-intelligence-ml/model-fine-tuning-resources/hyperparameter-tuning/grid-search-executors.md) — Provides implementations for exhaustive grid search to identify optimal hyperparameter configurations for models. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch06))
- [Model Generalization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-generalization.md) — Implements techniques like k-fold cross-validation to ensure models perform reliably on unseen data. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch06))
- [Model Performance Evaluators](https://awesome-repositories.com/f/artificial-intelligence-ml/model-performance-evaluators.md) — Provides tools for quantifying the accuracy and reliability of multi-class models using various averaging strategies. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/multiclass-metric.md))
- [Ensemble Prediction Combinations](https://awesome-repositories.com/f/artificial-intelligence-ml/model-predictions/ensemble-prediction-combinations.md) — Aggregates predictions from several different models into a single ensemble vote to improve accuracy. ([source](http://rasbt.github.io/mlxtend/))
- [Naive Bayes Classifiers](https://awesome-repositories.com/f/artificial-intelligence-ml/naive-bayes-classifiers.md) — Implements Naive Bayes classifiers based on class frequencies and the assumption of feature independence. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/naive-naive-bayes.md))
- [Term Frequency Analyzers](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-processing-tools/term-frequency-analyzers.md) — Computes the statistical importance of words using term frequency-inverse document frequency. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch08))
- [Neural Network Optimizers](https://awesome-repositories.com/f/artificial-intelligence-ml/optimization-algorithms/neural-network-optimizers.md) — Implements gradient-based optimization algorithms like SGD and Adam to minimize error and achieve network convergence. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch12))
- [Perceptron Classifiers](https://awesome-repositories.com/f/artificial-intelligence-ml/perceptron-classifiers.md) — Provides a practical implementation of the perceptron for binary linear classification. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch02))
- [Random Forest Ensembles](https://awesome-repositories.com/f/artificial-intelligence-ml/random-forest-ensembles.md) — Implements random forests by combining bootstrap sampling with random feature selection. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/bagging-boosting-rf.md))
- [Regression Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/regression-analysis.md) — Provides implementations and examples of statistical regression methods for modeling relationships between variables. ([source](https://github.com/rasbt/python-machine-learning-book#readme))
- [Regression Scoring Evaluation](https://awesome-repositories.com/f/artificial-intelligence-ml/regression-scoring-evaluation.md) — Implements statistical metrics and distribution plots to evaluate the accuracy of continuous numerical predictions. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch10))
- [Reinforcement Learning Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/reinforcement-learning-implementations.md) — Provides practical implementations of reinforcement learning agents that make decisions based on reward signals. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch01))
- [Specialized Network Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/specialized-network-architectures.md) — Constructs advanced network types including convolutional networks for spatial data and recurrent networks for sequential data. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch12))
- [Stochastic Gradient Descent](https://awesome-repositories.com/f/artificial-intelligence-ml/stochastic-gradient-descent.md) — Implements stochastic gradient descent for efficient model weight updates during large-scale training. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch02))
- [Linear Discriminant Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-classification/dimensionality-reduction/linear-discriminant-analysis.md) — Implements Linear Discriminant Analysis to reduce dimensions by maximizing the separation between classes. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch05))
- [Kernel-Based Classifiers](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-classification/kernel-based-classifiers.md) — Implements non-linear classification using kernel tricks to find separating hyperplanes for non-linearly separable data. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch03))
- [Support Vector Machines](https://awesome-repositories.com/f/artificial-intelligence-ml/support-vector-machines.md) — Implements support vector machines using linear and radial basis function kernels for data classification. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/num-support-vectors.md))
- [Training Curve Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/training-curve-analysis.md) — Provides methods for interpreting training and validation curves to diagnose model convergence and behavior. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/avoid-overfitting.md))
- [Training Sample Shuffling](https://awesome-repositories.com/f/artificial-intelligence-ml/training-data-sampling-strategies/training-sample-shuffling.md) — Implements training sample shuffling to prevent cyclic patterns during neural network training. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/neuralnet-error.md))
- [Minimal Feature Subset Selectors](https://awesome-repositories.com/f/artificial-intelligence-ml/weighted-feature-selection/minimal-feature-subset-selectors.md) — Implements sequential selection algorithms to find optimal attribute subsets by iteratively evaluating model performance. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/feature_sele_categories.md))
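
Several entries in the list above (hyperparameter tuning, k-fold cross-validation, model generalization) rely on the same splitting idea. The sketch below shows one way it could look in plain Python. It is an assumption-laden illustration rather than the repository's code: `k_fold_indices` and `cross_val_score` are hypothetical helpers, no shuffling or stratification is done, and `fit` and `score` are callables supplied by the caller.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k consecutive folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_val_score(fit, score, X, y, k=5):
    """Average a caller-supplied score over k train/test splits."""
    scores = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        scores.append(score(model, [X[i] for i in test], [y[i] for i in test]))
    return sum(scores) / len(scores)
```

Each sample lands in exactly one test fold, so every observation is used for both training and evaluation across the k rounds. For real data the indices would normally be shuffled first, which this sketch leaves out.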

### Data & Databases

- [Streaming Preprocessing Pipelines](https://awesome-repositories.com/f/data-databases/data-preprocessing-pipelines/streaming-preprocessing-pipelines.md) — Provides implementations of pipelines that sequence data preprocessing and estimator steps into a single workflow.
- [Text Preprocessing](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-transformation/text-nlp-preprocessing/text-preprocessing.md) — Cleans raw text and performs tokenization to prepare documents for feature extraction. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch08))
- [Missing Data Removal](https://awesome-repositories.com/f/data-databases/missing-data-removal.md) — Implements techniques for filtering out rows or columns containing missing values when data volume is sufficient. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/missing-data.md))
- [Missing Value Imputation](https://awesome-repositories.com/f/data-databases/missing-value-imputation.md) — Estimates placeholder values for missing data using global statistics or k-nearest neighbors. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/missing-data.md))
- [Dimensionality Reduction](https://awesome-repositories.com/f/data-databases/vector-quantization/high-dimensional-vector-compressors/dimensionality-reduction.md) — Implements unsupervised techniques like PCA to capture the most significant variance in high-dimensional datasets. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch05))
- [Dendrogram Visualizations](https://awesome-repositories.com/f/data-databases/cluster-visualizations/dendrogram-visualizations.md) — Visualizes cluster hierarchies using dendrograms to analyze nested groupings and distances. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch11))
- [One-Vs-All Multi-class Classification](https://awesome-repositories.com/f/data-databases/data-categorization/classification-labelers/one-vs-all-multi-class-classification.md) — Implements multi-class classification strategies, including one-vs-all and softmax regression. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/logisticregr-neuralnet.md))
- [Model State Serializers](https://awesome-repositories.com/f/data-databases/data-export-utilities/json-serializers/model-state-serializers.md) — Includes examples for exporting trained model parameters to JSON format for persistence and transport.
- [Parametric Model Fitters](https://awesome-repositories.com/f/data-databases/model-as-a-table-integrations/model-fitting-via-api/parametric-model-fitters.md) — Implements parametric models that fit data using a fixed set of parameters, such as linear regression. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/parametric_vs_nonparametric.md))
- [Model Persistence Formats](https://awesome-repositories.com/f/data-databases/model-persistence-formats.md) — Serializes trained model parameters into JSON format for saving and reloading without binary formats. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/bonus))
- [Out-of-Core Processing](https://awesome-repositories.com/f/data-databases/out-of-core-processing.md) — Demonstrates out-of-core learning for training models on datasets that exceed available system memory.
- [Nonparametric Models](https://awesome-repositories.com/f/data-databases/tabular-data-frameworks/tabular-predictive-models/nonparametric-models.md) — Builds nonparametric models where complexity grows dynamically with the size of the training dataset. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/parametric_vs_nonparametric.md))
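
The Missing Data Removal and Missing Value Imputation entries above describe two common ways of handling incomplete rows. The following plain-Python sketch contrasts them on a tiny made-up table. The helper names and the use of `None` as the missing marker are assumptions for illustration, and only column-mean imputation is shown (not the k-nearest-neighbors variant).

```python
def drop_incomplete_rows(rows, missing=None):
    """Keep only rows that contain no missing entries."""
    return [row for row in rows if all(value is not missing for value in row)]


def impute_column_means(rows, missing=None):
    """Replace each missing entry with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not missing]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if row[j] is missing else row[j] for j in range(n_cols)]
        for row in rows
    ]


# Made-up data: two features, two rows with one missing value each.
data = [[1.0, 2.0], [None, 4.0], [3.0, None]]
complete_only = drop_incomplete_rows(data)
imputed = impute_column_means(data)
```

Dropping rows is simplest but discards two of the three samples here, which is why it is usually reserved for cases where plenty of data remains.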

### Scientific & Mathematical Computing

- [Principal Component Analysis](https://awesome-repositories.com/f/scientific-mathematical-computing/linear-algebra-routines/principal-component-analysis.md) — Implements Principal Component Analysis by calculating eigenvectors and eigenvalues of a covariance matrix. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/pca-scaling.md))
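
To make the eigenvector and eigenvalue description above concrete, here is a minimal sketch restricted to two-dimensional data, where the eigenvalues of the covariance matrix have a closed form. It is not taken from the repository, the function names are hypothetical, and a general implementation would use a linear algebra library instead.

```python
import math


def covariance_matrix_2d(points):
    """Return (sxx, sxy, syy), the entries of the sample covariance matrix."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    return sxx, sxy, syy


def principal_components_2d(points):
    """Return (eigenvalues, unit eigenvectors), largest eigenvalue first."""
    sxx, sxy, syy = covariance_matrix_2d(points)
    # Roots of the characteristic polynomial of [[sxx, sxy], [sxy, syy]].
    trace = sxx + syy
    det = sxx * syy - sxy ** 2
    gap = math.sqrt(max(trace ** 2 / 4 - det, 0.0))
    eigenvalues = [trace / 2 + gap, trace / 2 - gap]
    if abs(sxy) < 1e-12:
        # Diagonal covariance: the coordinate axes are the eigenvectors.
        vectors = [(1.0, 0.0), (0.0, 1.0)] if sxx >= syy else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vectors = []
        for lam in eigenvalues:
            # (sxy, lam - sxx) satisfies (A - lam * I) v = 0 when sxy != 0.
            vx, vy = sxy, lam - sxx
            norm = math.hypot(vx, vy)
            vectors.append((vx / norm, vy / norm))
    return eigenvalues, vectors


# Made-up points lying exactly on the line y = x.
eigenvalues, vectors = principal_components_2d([(0, 0), (1, 1), (2, 2), (3, 3)])
explained_ratio = eigenvalues[0] / sum(eigenvalues)
```

Because the example points lie on a single line, the first component should carry essentially all of the variance and point along the diagonal.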

### Part of an Awesome List

- [Out-of-Core Data Processing](https://awesome-repositories.com/f/awesome-lists/data/data-processing-and-analysis/out-of-core-data-processing.md) — Implements out-of-core learning techniques to process datasets that exceed system memory. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/faq/choosing-technique.md))
- [Learning and Reference](https://awesome-repositories.com/f/awesome-lists/ai/learning-and-reference.md) — Python machine learning book code.
- [Machine Learning and AI](https://awesome-repositories.com/f/awesome-lists/ai/machine-learning-and-ai.md) — Code and examples from a comprehensive machine learning textbook.
- [Practical Learning Resources](https://awesome-repositories.com/f/awesome-lists/ai/practical-learning-resources.md) — Comprehensive guide to machine learning using Python libraries.

### Programming Languages & Runtimes

- [Model Serialization](https://awesome-repositories.com/f/programming-languages-runtimes/json-serialization/model-serialization.md) — Converts machine learning model architectures and weights into portable JSON formats for distribution. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/bonus/scikit-model-to-json.ipynb))
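
The idea behind JSON model serialization can be sketched with Python's built-in `json` module. The example below is an illustration under stated assumptions, not the approach used in the linked notebook: the parameter dictionary, its keys, and the `predict` helper are all hypothetical and stand in for a fitted linear classifier.

```python
import json


def params_to_json(params):
    """Serialize a dict of model parameters (numbers, strings, lists) to JSON text."""
    return json.dumps(params, sort_keys=True)


def params_from_json(text):
    """Rebuild the parameter dict from JSON text."""
    return json.loads(text)


def predict(params, x):
    """Apply a linear decision rule using the stored weights and bias."""
    z = sum(w * xi for w, xi in zip(params["weights"], x)) + params["bias"]
    return 1 if z >= 0.0 else 0


# Hypothetical parameters of an already-trained linear classifier.
params = {"model": "linear_classifier", "weights": [0.25, -1.5], "bias": 0.1}

text = params_to_json(params)
restored = params_from_json(text)
```

JSON only stores plain numbers, strings, lists, and dicts, so array-valued parameters (for example NumPy arrays) would first need converting to lists, such as with `.tolist()`.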

### Testing & Quality Assurance

- [Classification Accuracy Scorers](https://awesome-repositories.com/f/testing-quality-assurance/model-evaluation-benchmarks/correction-accuracy-evaluators/classification-accuracy-scorers.md) — Uses confusion matrices and precision-recall metrics to measure the accuracy of classification models. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch06))
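
For the confusion-matrix and precision-recall metrics mentioned above, a small standard-library sketch for the binary case might look as follows. The function names and the example labels are made up for illustration and are not taken from the repository.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1, returning 0.0 where a ratio is undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up labels for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Returning 0.0 for an undefined ratio is one convention among several; other tools may raise a warning or return NaN instead.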

### User Interface & Experience

- [Feature Relevance Scorers](https://awesome-repositories.com/f/user-interface-experience/search-result-ranking/relevance-scoring/feature-relevance-scorers.md) — Uses feature importance scores and L1 regularization to identify the most relevant predictors. ([source](https://github.com/rasbt/python-machine-learning-book/blob/master/code/ch04))
