# rasbt/python-machine-learning-book-2nd-edition

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/rasbt-python-machine-learning-book-2nd-edition).**

7,194 stars · 2,799 forks · Jupyter Notebook · MIT

## Links

- GitHub: https://github.com/rasbt/python-machine-learning-book-2nd-edition
- awesome-repositories: https://awesome-repositories.com/repository/rasbt-python-machine-learning-book-2nd-edition.md

## Topics

`data-science` `deep-learning` `machine-learning` `python` `scikit-learn` `tensorflow`

## Description

This repository holds the code examples for the book *Python Machine Learning, 2nd Edition*. It provides executable code and notebooks that demonstrate predictive modeling, data analysis workflows, and the implementation of various machine learning algorithms in Python.

The repository features practical examples of classification, regression, and clustering tasks using Scikit-Learn, alongside tutorials for building and training deep learning architectures with TensorFlow. These include implementations of convolutional and recurrent networks.

The content covers data preprocessing (cleaning and scaling), feature engineering, model evaluation with classification metrics, and hyperparameter optimization. It also includes guidance on unsupervised learning techniques and on deploying models within web applications.

The materials are provided primarily as Jupyter Notebooks.
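
The scaling and classification-metric steps mentioned above can be illustrated with a minimal sketch. It uses only the Python standard library and is not taken from the repository's notebooks, which rely on Scikit-Learn for these steps; the function names `standardize` and `precision_recall` are hypothetical and chosen for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to zero mean and unit variance (population std)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def precision_recall(y_true, y_pred, positive=1):
    """Compute precision and recall for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

In practice, the scaling parameters (mean and standard deviation) should be estimated on the training set only and then reused to transform the test set, so that no information from the test data leaks into training.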

## Tags

### Artificial Intelligence & ML

- [Machine Learning Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning-implementations.md) — Provides a comprehensive collection of code-based implementations for core machine learning algorithms and predictive models. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch01/ch01.ipynb))
- [Predictive Modeling Workflows](https://awesome-repositories.com/f/artificial-intelligence-ml/predictive-modeling-workflows.md) — Provides a comprehensive implementation guide for building and testing various predictive machine learning models. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch15/ch15.ipynb))
- [Bias and Variance Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/bias-and-variance-analysis.md) — Uses learning and validation curves to diagnose over-fitting and under-fitting through bias-variance analysis. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch06))
- [Classification Metrics](https://awesome-repositories.com/f/artificial-intelligence-ml/classification-metrics.md) — Provides tools for calculating precision, recall, and confusion matrices to analyze classification accuracy. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch06))
- [Classification Models](https://awesome-repositories.com/f/artificial-intelligence-ml/classification-models.md) — Implements various algorithms for predicting categorical outcomes, including perceptrons, logistic regression, and support vector machines. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch03))
- [Data Preprocessing](https://awesome-repositories.com/f/artificial-intelligence-ml/data-preprocessing.md) — Provides workflows for cleaning, scaling, and encoding raw datasets to prepare them for machine learning.
- [Custom Network Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-framework-implementations/custom-network-implementations.md) — Demonstrates how to construct neural, convolutional, and recurrent networks using both custom code and frameworks. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/README.md))
- [Machine Learning Workflow Libraries](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning-workflow-libraries.md) — Demonstrates complete machine learning workflows and pipelines using Jupyter Notebooks and Python scripts. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition#readme))
- [Linear Regression Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/linear-regression-implementations.md) — Provides educational implementations of linear regression models built from scratch. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch10))
- [Regression Models](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/regression-models.md) — Implements algorithms for predicting continuous numerical values based on historical data patterns. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch14))
- [Neural Network Layers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/neural-network-layers.md) — Provides building blocks for constructing deep learning models using dense, convolutional, and recurrent layers.
- [Neural Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-networks.md) — Constructs multilayer neural networks to process complex data patterns using both high-level and low-level APIs. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch13))
- [Training and Testing Splits](https://awesome-repositories.com/f/artificial-intelligence-ml/training-and-testing-splits.md) — Implements techniques for splitting datasets into training and test sets to evaluate generalization on unseen data. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch04))
- [Training Dataset Processing](https://awesome-repositories.com/f/artificial-intelligence-ml/training-dataset-processing.md) — Executes cleaning and dimensionality reduction workflows to prepare raw datasets for model training. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/README.md))
- [Activation Functions](https://awesome-repositories.com/f/artificial-intelligence-ml/activation-functions/gated-linear-units/activation-functions.md) — Implements mathematical transformations to introduce non-linearity and control signal flow in neural networks. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch13))
- [Automated Feature Selection Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/automated-feature-selection-tools.md) — Includes methods for identifying the most relevant input variables to reduce model complexity. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch04))
- [Class Imbalance Handling](https://awesome-repositories.com/f/artificial-intelligence-ml/class-imbalance-handling.md) — Provides techniques for adjusting class distributions to prevent model bias on imbalanced datasets. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch06))
- [Hierarchical Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/clustering-algorithms/hierarchical-clustering.md) — Builds hierarchical clusters using agglomerative methods and visualizes them through dendrograms. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch11))
- [Non-linear Problem Solvers](https://awesome-repositories.com/f/artificial-intelligence-ml/complex-problem-solving/non-linear-problem-solvers.md) — Implements kernel methods to find separating hyperplanes for complex, non-linearly separable datasets. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch03))
- [Convolutional Neural Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/convolutional-neural-networks.md) — Implements convolutional neural network architectures for processing structured grid data and images. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch15))
- [Dataset Distribution Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/dataset-quality-analyzers/dataset-distribution-analysis.md) — Provides visual and statistical analysis of feature distributions and correlation matrices. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch10))
- [Ensemble Methods](https://awesome-repositories.com/f/artificial-intelligence-ml/decision-trees/ensemble-methods.md) — Demonstrates the use of random forests and other ensemble methods to improve predictive accuracy. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch03))
- [Density-Based Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/density-based-clustering.md) — Implements DBSCAN to identify clusters of varying shapes based on local point density. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch11))
- [Distance-Based Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/distance-based-clustering.md) — Provides implementations of distance-based clustering algorithms including k-means and DBSCAN.
- [Ensemble Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-learning.md) — Provides implementations for combining multiple models through bagging and majority voting to improve predictive stability. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch07))
- [Bagging Ensembles](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-learning/bagging-ensembles.md) — Implements bagging ensembles to reduce model variance by training on different random data samples.
- [Feature Scale Normalization](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-scale-normalization.md) — Scales numeric features to a standard range to ensure stable and consistent model convergence. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch04))
- [Image Data Preprocessing](https://awesome-repositories.com/f/artificial-intelligence-ml/image-data-preprocessing.md) — Implements preprocessing for image files, including channel management and tensor preparation. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch15))
- [Iterative Parameter Optimizations](https://awesome-repositories.com/f/artificial-intelligence-ml/iterative-parameter-optimizations.md) — Demonstrates the use of gradient descent to iteratively update model weights and minimize cost functions. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch02))
- [K-Means Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/k-means-clustering.md) — Implements k-means and k-means++ algorithms to group data points based on proximity. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch11))
- [Cluster Count Selection Methods](https://awesome-repositories.com/f/artificial-intelligence-ml/k-means-clustering/cluster-count-selection-methods.md) — Demonstrates the elbow method and silhouette plots to select the optimal number of clusters. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch11))
- [K-Nearest Neighbor Classifiers](https://awesome-repositories.com/f/artificial-intelligence-ml/k-nearest-neighbor-classifiers.md) — Implements K-nearest neighbor algorithms to classify data points based on sample proximity. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch03))
- [Kernel-Based Feature Mapping](https://awesome-repositories.com/f/artificial-intelligence-ml/kernel-based-feature-mapping.md) — Implements kernel functions to project data into higher-dimensional spaces for non-linear classification.
- [Kernel Mappings](https://awesome-repositories.com/f/artificial-intelligence-ml/linear-regression/polynomial-feature-mapping/kernel-mappings.md) — Applies kernel functions to resolve non-linear separation by projecting data into higher-dimensional spaces. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch05))
- [Machine Learning Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning-pipelines.md) — Automates the sequence of data transformation and model estimation into a single predictive workflow.
- [Regression Ensembles](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/predictive-machine-learning-analytics/regression-ensembles.md) — Uses polynomial regression and ensemble methods to capture complex non-linear relationships in data. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch10))
- [Regularization Techniques](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/regression-models/regularization-techniques.md) — Implements penalty terms in regression models to prevent overfitting and improve generalization. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch10))
- [Boosting Algorithms](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training/boosting-algorithms.md) — Implements AdaBoost and other boosting ensemble methods to iteratively correct errors and build strong classifiers. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch07))
- [Incremental Training](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training/incremental-training.md) — Implements incremental weight updates using stochastic gradient descent to handle large-scale training datasets. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch08))
- [Grid Search Executors](https://awesome-repositories.com/f/artificial-intelligence-ml/model-fine-tuning-resources/hyperparameter-tuning/grid-search-executors.md) — Demonstrates exhaustive grid search over hyperparameter spaces to identify optimal model configurations.
- [Model Generalization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-generalization.md) — Provides techniques and examples for ensuring models perform reliably on unseen data via regularization and slack variables. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch03))
- [Hyperparameter Optimization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/training-efficiency/hyperparameter-optimization.md) — Provides examples of grid search and nested cross-validation to find optimal model parameters. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch06))
- [Multilayer Perceptrons](https://awesome-repositories.com/f/artificial-intelligence-ml/multilayer-perceptrons.md) — Builds and trains multilayer perceptrons using backpropagation to model complex non-linear functions. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch12))
- [Neural Network Regularization](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-regularization.md) — Implements dropout layers to prevent overfitting and improve model generalization in deep learning networks. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch15))
- [Perceptrons](https://awesome-repositories.com/f/artificial-intelligence-ml/perceptrons.md) — Builds linear binary classifiers that use iterative error correction to find decision boundaries. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch02))
- [Prediction Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/prediction-pipelines.md) — Combines data transformers and estimators into pipelines to streamline predictions from raw data. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch06))
- [Character-Level Language Models](https://awesome-repositories.com/f/artificial-intelligence-ml/recurrent-neural-networks/character-level-language-models.md) — Provides a practical implementation of a recurrent neural network trained for character-level text prediction. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch16))
- [Robust Regression](https://awesome-repositories.com/f/artificial-intelligence-ml/regression-analysis/robust-regression.md) — Implements robust regression techniques like RANSAC to minimize the impact of outliers on parameters. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch10))
- [Sequential Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/sequential-learning.md) — Implements recurrent neural networks to process and model ordered sequences such as text. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch16))
- [Stochastic Gradient Descent](https://awesome-repositories.com/f/artificial-intelligence-ml/stochastic-gradient-descent.md) — Implements stochastic gradient descent to update model parameters incrementally using random data samples.
- [Dimensionality Reduction](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-classification/dimensionality-reduction.md) — Uses linear discriminant analysis (LDA) to project data into a lower-dimensional space that maximizes class separability. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch05))
- [Categorical Encodings](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-numeric-transformations/categorical-encodings.md) — Transforms categorical features into numerical form, mapping ordinal features to integers and one-hot encoding nominal features. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch04))
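
As a rough illustration of the "iterative error correction" named in the Perceptrons entry above, the sketch below implements the classic perceptron learning rule in plain Python. It is an independent, minimal example rather than the repository's implementation; the names `train_perceptron` and `predict`, the learning rate `eta`, and the epoch count `n_iter` are assumptions made for this sketch.

```python
def predict(xi, w, b):
    """Return the class label (+1 or -1) for one sample."""
    z = sum(wj * xj for wj, xj in zip(w, xi)) + b
    return 1 if z >= 0.0 else -1


def train_perceptron(X, y, eta=0.1, n_iter=10):
    """Fit weights with the perceptron rule; labels must be -1 or 1."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(n_iter):
        errors = 0
        for xi, target in zip(X, y):
            # The update is zero when the sample is already classified correctly.
            update = eta * (target - predict(xi, w, b))
            if update != 0.0:
                w = [wj + update * xj for wj, xj in zip(w, xi)]
                b += update
                errors += 1
        if errors == 0:
            # A full pass without mistakes: the classes are separated.
            break
    return w, b
```

A small usage example on linearly separable data: `w, b = train_perceptron([[2, 1], [3, 2], [-1, -1], [-2, -3]], [1, 1, -1, -1])`, after which `predict(x, w, b)` reproduces the training labels. The rule is only guaranteed to converge when the two classes are linearly separable, which is why the epoch count is capped.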

### Education & Learning Resources

- [Machine Learning Educational Resources](https://awesome-repositories.com/f/education-learning-resources/machine-learning-educational-resources.md) — Serves as a technical companion for learning predictive modeling through executable code and data analysis notebooks.
- [Model Evaluation Techniques](https://awesome-repositories.com/f/education-learning-resources/model-evaluation-techniques.md) — Demonstrates various methods for assessing model performance and improving accuracy through hyperparameter optimization. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch06))
- [Neural Network Tutorials](https://awesome-repositories.com/f/education-learning-resources/neural-network-tutorials.md) — Provides instructional content and code for building convolutional and recurrent neural networks with TensorFlow.
- [Scikit-Learn Examples](https://awesome-repositories.com/f/education-learning-resources/supervised-learning-examples/scikit-learn-examples.md) — Offers practical implementations of classification, regression, and clustering tasks specifically using Scikit-Learn.
- [Sentiment Analysis Models](https://awesome-repositories.com/f/education-learning-resources/educational-resources/systems-applied-computing/machine-learning-education/sentiment-analysis-models.md) — Implements neural network architectures specifically designed to classify the emotional tone of text data. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch16))

### Data & Databases

- [Missing Value Imputation](https://awesome-repositories.com/f/data-databases/missing-value-imputation.md) — Implements techniques for resolving missing tabular data through removal or statistical imputation. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch04))
- [Model State Persistence](https://awesome-repositories.com/f/data-databases/model-state-persistence.md) — Provides methods for serializing fitted estimators to disk and reloading them for deployment. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch09))
- [Dimensionality Reduction](https://awesome-repositories.com/f/data-databases/vector-quantization/high-dimensional-vector-compressors/dimensionality-reduction.md) — Uses principal component analysis (PCA) to transform high-dimensional data into orthogonal components that capture maximum variance. ([source](https://github.com/rasbt/python-machine-learning-book-2nd-edition/blob/master/code/ch05))
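
The Model State Persistence entry above describes writing a fitted model to disk and reloading it later. A minimal sketch of that round trip with Python's standard `pickle` module follows. To keep it self-contained, a plain dictionary of learned parameters stands in for a fitted estimator; the helper names `save_model`, `load_model`, and `roundtrip_in_temp_dir` are hypothetical.

```python
import os
import pickle
import tempfile


def save_model(model, path):
    """Serialize a fitted model object to a file."""
    with open(path, "wb") as f:
        pickle.dump(model, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_model(path):
    """Reload a model previously written by save_model."""
    with open(path, "rb") as f:
        return pickle.load(f)


def roundtrip_in_temp_dir(model):
    """Save and reload a model inside a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "model.pkl")
        save_model(model, path)
        return load_model(path)


# Stand-in for a fitted estimator: learned parameters in a dict (assumption).
fitted = {"weights": [0.2, 0.2], "bias": -0.2}
restored = roundtrip_in_temp_dir(fitted)
```

Unpickling can execute arbitrary code, so pickle files should only be loaded from trusted sources. A pickled estimator is also tied to the library version that produced it, so the deployment environment should match the training environment.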
