# rasbt/machine-learning-book

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/rasbt-machine-learning-book).**

5,239 stars · 1,822 forks · Jupyter Notebook · MIT

## Links

- GitHub: https://github.com/rasbt/machine-learning-book
- Homepage: https://sebastianraschka.com/books/#machine-learning-with-pytorch-and-scikit-learn
- awesome-repositories: https://awesome-repositories.com/repository/rasbt-machine-learning-book.md

## Description

This project is a comprehensive machine learning educational resource and tutorial series delivered as a collection of interactive Jupyter Notebooks. It provides practical Python implementations for the end-to-end machine learning lifecycle, covering supervised and unsupervised learning, deep learning, and reinforcement learning.

The resource distinguishes itself by providing detailed implementation guides for complex architectures, including transformers, generative adversarial networks, and convolutional neural networks. It also features specialized courseware for developing reinforcement learning agents using Q-learning and Deep Q-Networks within simulated environments.

The content covers a broad surface of data science capabilities, including data engineering pipelines, feature encoding, and dimensionality reduction. It provides extensive material on model evaluation through cross-validation and diagnostic metrics, as well as advanced topics like natural language processing, sentiment analysis, and generative AI.

The entire curriculum is designed for interactive execution within Jupyter Notebooks, combining executable code, rich text, and visualizations.
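
As a rough illustration of the cross-validation mentioned in the description, the sketch below splits sample indices into k folds in plain Python. It is not taken from the repository; the function name `k_fold_indices` and the choice of 10 samples and 3 folds are assumptions made for this example. In practice a library utility such as scikit-learn's `KFold` would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds get one extra sample,
    # so fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        test = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, test
        start = stop


# Hypothetical usage: 10 samples, 3 folds.
folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print(f"train={train} test={test}")
```

Every sample lands in exactly one test fold, so each model fitted on a training split is scored on data it has not seen.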

## Tags

### Education & Learning Resources

- [Interactive Notebooks](https://awesome-repositories.com/f/education-learning-resources/interactive-notebooks.md) — Delivers an entire curriculum through interactive notebooks that combine executable code, narrative text, and visualizations.
- [Jupyter Notebook Curricula](https://awesome-repositories.com/f/education-learning-resources/jupyter-notebook-curricula.md) — Delivers a comprehensive machine learning curriculum as interactive Jupyter Notebooks combining code, text, and visualizations.
- [AI & Machine Learning Education](https://awesome-repositories.com/f/education-learning-resources/technical-domain-education/ai-machine-learning-education.md) — Provides a comprehensive curriculum of interactive tutorials and Python implementations for AI and machine learning education.
- [Data Science Tutorials](https://awesome-repositories.com/f/education-learning-resources/data-science-tutorials.md) — Offers step-by-step tutorials covering data preprocessing, feature engineering, and model evaluation.
- [Workflow Tutorials](https://awesome-repositories.com/f/education-learning-resources/data-science-workflow-references/workflow-tutorials.md) — Provides practical guides for cleaning raw data, handling missing values, and performing dimensionality reduction.
- [Machine Learning Educational Resources](https://awesome-repositories.com/f/education-learning-resources/machine-learning-educational-resources.md) — Provides a comprehensive collection of interactive notebooks and annotated code for learning data science and AI workflows.
- [Supervised Learning Tutorials](https://awesome-repositories.com/f/education-learning-resources/technical-domain-education/ai-machine-learning-education/machine-learning-fundamentals/linear-regression-tutorials/supervised-learning-tutorials.md) — Implements practical tutorials for supervised learning, including classification and regression algorithms.

### Artificial Intelligence & ML

- [Classification Metrics](https://awesome-repositories.com/f/artificial-intelligence-ml/classification-metrics.md) — Evaluates classification model performance using precision, recall, confusion matrices, and ROC curves. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch06))
- [Deep Learning Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-architectures.md) — Guides the construction of multi-layered neural networks, including convolutional, recurrent, and graph-based architectures. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Deep Q-Learning Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-q-learning-implementations.md) — Implements deep Q-learning algorithms using neural networks, experience replay, and target value computations. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch19))
- [Educational Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-q-learning-implementations/deep-reinforcement-learning-implementations/educational-implementations.md) — Develops intelligent agents through practical implementations of Q-learning and Deep Q-Networks in simulated environments.
- [Educational Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-q-learning-implementations/parallel-q-learning-implementations/educational-implementations.md) — Provides hands-on implementations of Q-learning and Deep Q-Networks for learning agent-environment interactions.
- [RL Courseware](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-q-learning-implementations/rl-courseware.md) — Ships specialized courseware for implementing reinforcement learning agents and decision-making algorithms.
- [Ensemble Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-learning.md) — Implements techniques that combine multiple models to improve predictive performance and robustness. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Supervised Learning Models](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/core-algorithmic-paradigms/supervised-learning-models.md) — Provides algorithms for classification and regression that predict target variables based on labeled training data. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Unsupervised Learning Algorithms](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/algorithms/core-algorithmic-paradigms/unsupervised-learning-algorithms.md) — Uses clustering and dimensionality reduction to identify hidden patterns within unlabeled datasets. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Deep Learning Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/architectures/neural-network-components/deep-learning-implementations.md) — Provides manual implementations of complex neural network architectures like transformers and GANs from first principles.
- [Model Evaluation and Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/model-evaluation-and-tuning.md) — Provides workflows for measuring model accuracy and tuning hyperparameters to improve predictive performance. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Model Evaluation Metrics](https://awesome-repositories.com/f/artificial-intelligence-ml/model-evaluation-metrics.md) — Implements detailed workflows for assessing model predictive accuracy using cross-validation and various classification metrics.
- [K-Fold Cross-Validation](https://awesome-repositories.com/f/artificial-intelligence-ml/model-validation-tools/cross-validation-utilities/k-fold-cross-validation.md) — Implements k-fold cross-validation to partition data for evaluating model generalizability and tuning hyperparameters.
- [Neural Network Construction](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-construction.md) — Teaches the process of designing and building deep learning architectures using layers and model abstractions. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch12))
- [Neural Network Implementation Guides](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-implementation-guides.md) — Provides instructional guides for building complex neural network architectures, including transformers and GANs.
- [Reinforcement Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/reinforcement-learning.md) — Develops agents that learn optimal decision-making policies using Q-learning and Deep Q-Networks. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Agent Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/reinforcement-learning/agent-implementations.md) — Develops agent-environment interfaces using Q-learning and SARSA to optimize decision-making policies. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch19))
- [Supervised Classification](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-classification.md) — Implements supervised classification algorithms including perceptrons and logistic regression to categorize data. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch03))
- [Supervised Metric Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-learning/supervised-metric-learning.md) — Implements models that learn mappings from input data to known targets using labeled datasets. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Unsupervised Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/unsupervised-learning.md) — Implements models that discover hidden patterns or intrinsic structures within unlabeled data. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Attention Mechanisms](https://awesome-repositories.com/f/artificial-intelligence-ml/attention-mechanisms.md) — Provides implementations of attention layers to calculate weighted relevance of input segments in neural networks.
- [Automatic Differentiation](https://awesome-repositories.com/f/artificial-intelligence-ml/automatic-differentiation.md) — Implements automatic differentiation to calculate loss function gradients for updating model parameters.
- [Automatic Differentiation Engines](https://awesome-repositories.com/f/artificial-intelligence-ml/automatic-differentiation-engines.md) — Implements systems that compute gradients of mathematical functions through computational graphs for neural network training.
- [Boosting Algorithms](https://awesome-repositories.com/f/artificial-intelligence-ml/boosting-algorithms.md) — Implements sequential ensemble methods where each new model corrects the errors of its predecessors. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch07))
- [Class Imbalance Handling](https://awesome-repositories.com/f/artificial-intelligence-ml/class-imbalance-handling.md) — Implements techniques like per-class weighting to improve model performance on imbalanced datasets. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch06))
- [Hierarchical Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/clustering-algorithms/hierarchical-clustering.md) — Organizes data into tree-like structures using agglomerative clustering and dendrogram visualizations. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch10))
- [Convolutional Neural Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/convolutional-neural-networks.md) — Implements convolutional neural networks designed for processing structured grid data and image analysis. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch14))
- [Dataset Splitting Utilities](https://awesome-repositories.com/f/artificial-intelligence-ml/dataset-management/evaluation-datasets/dataset-splitting-utilities.md) — Provides utilities for dividing datasets into distinct training and testing sets to evaluate model performance. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch04))
- [Density-Based Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/density-based-clustering.md) — Identifies dense data clusters and filters noise using the DBSCAN algorithm. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch10))
- [Encoder-Decoder Transformers](https://awesome-repositories.com/f/artificial-intelligence-ml/encoder-decoder-transformers.md) — Implements transformer architectures using encoder-decoder structures for processing and generating sequential information. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch16))
- [AdaBoost Classifiers](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-learning/adaboost-classifiers.md) — Provides implementations of the AdaBoost algorithm to aggregate weak learners into a strong classifier. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch07))
- [Bagging Ensembles](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-learning/bagging-ensembles.md) — Implements bagging ensembles that fit base models on bootstrap samples of the training data to reduce variance. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch07))
- [Ensemble Methods](https://awesome-repositories.com/f/artificial-intelligence-ml/ensemble-methods.md) — Builds classifiers that combine multiple individual models through voting, bagging, or boosting. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch07))
- [Experience Replay Buffers](https://awesome-repositories.com/f/artificial-intelligence-ml/experience-replay-buffers.md) — Implements experience replay buffers to store agent transitions and stabilize reinforcement learning training.
- [Feature Scale Normalization](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-scale-normalization.md) — Applies preprocessing techniques to scale numeric features to a standard range for stable model convergence. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch04))
- [Generative Adversarial Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-adversarial-networks.md) — Implements generative adversarial networks that pit two neural networks against each other to synthesize realistic data. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch17))
- [Gradient Boosting](https://awesome-repositories.com/f/artificial-intelligence-ml/gradient-boosting.md) — Implements iterative ensemble techniques that build models sequentially to minimize loss. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch07))
- [Graph Neural Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/graph-neural-networks.md) — Implements neural network architectures designed to process and make predictions on graph-structured data. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch18))
- [Image Data Preprocessing](https://awesome-repositories.com/f/artificial-intelligence-ml/image-data-preprocessing.md) — Provides techniques for preparing raw image data and applying augmentations for deep learning model consumption. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch14))
- [Generative Adversarial Image Synthesis](https://awesome-repositories.com/f/artificial-intelligence-ml/image-super-resolution-models/generative-adversarial-image-synthesis.md) — Implements generative adversarial networks for image synthesis and manipulation. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch17))
- [K-Means Clustering](https://awesome-repositories.com/f/artificial-intelligence-ml/k-means-clustering.md) — Partitions unlabeled data into clusters using centroid-based K-means algorithms. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch10))
- [Cluster Count Selection Methods](https://awesome-repositories.com/f/artificial-intelligence-ml/k-means-clustering/cluster-count-selection-methods.md) — Determines the optimal number of clusters using techniques like the elbow method and silhouette plots. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch10))
- [Large Scale Dataset Processing](https://awesome-repositories.com/f/artificial-intelligence-ml/large-scale-dataset-processing.md) — Provides techniques for streaming and preprocessing datasets that are too large to fit in memory, using out-of-core learning. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch08))
- [Linear Regression](https://awesome-repositories.com/f/artificial-intelligence-ml/linear-regression.md) — Fits linear relationships between input variables and a target value using ordinary least squares. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch09))
- [Text Document Classification](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning-classification/text-document-classification.md) — Implements a logistic regression model to categorize text documents into predefined classes. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch08))
- [Incremental Training](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training/incremental-training.md) — Implements chunk-based feeding of data to models to enable training on datasets larger than system memory.
- [Model Lifecycle Management](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training/model-lifecycle-management.md) — Guides the end-to-end process of data preparation, model selection, training, and evaluation. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Language Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/model-fine-tuning-adaptation/language-model-training.md) — Provides tutorials and techniques for training language models, including the combination of pre-training and supervised fine-tuning. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch16))
- [Markov Decision Process Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/markov-decision-process-solvers/markov-decision-process-frameworks.md) — Formulates state-transition systems using Markov Decision Processes to define agent-environment interactions. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch19))
- [Hyperparameter Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/model-fine-tuning-resources/hyperparameter-tuning.md) — Optimizes model settings using grid and randomized search to maximize predictive performance. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch06))
- [Model Fit Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/model-fit-analysis.md) — Analyzes learning and validation curves to diagnose model bias, variance, and overfitting. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch06))
- [Regression Metrics](https://awesome-repositories.com/f/artificial-intelligence-ml/model-predictions/regression-metrics.md) — Calculates quantitative regression metrics to determine the accuracy and fit of continuous numerical predictions. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch06))
- [Nested Cross-Validation](https://awesome-repositories.com/f/artificial-intelligence-ml/model-validation-tools/cross-validation-utilities/cross-validation-technique-descriptions/nested-cross-validation.md) — Implements nested cross-validation to select the best machine learning algorithm while avoiding data leakage.
- [Multilayer Perceptrons](https://awesome-repositories.com/f/artificial-intelligence-ml/multilayer-perceptrons.md) — Implements standard fully connected multilayer perceptron architectures for learning non-linear mappings. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch11))
- [Language Model Pretraining](https://awesome-repositories.com/f/artificial-intelligence-ml/natural-language-processing/language-model-pretraining.md) — Demonstrates how to train transformer architectures on large unlabeled datasets before refining them for specific tasks. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch16))
- [Computation Graphs](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-architectures/computation-graphs.md) — Defines neural network data flow using sequential containers and custom layers to construct complex architectures.
- [Autoencoders](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-implementations/autoencoders.md) — Implements autoencoder architectures that compress input data into hidden representations for feature encoding. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch17))
- [Perceptron Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-implementations/perceptron-implementations.md) — Provides manual implementations of single-layer perceptrons for basic binary classification tasks. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch02))
- [Neural Network Regularization](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-network-regularization.md) — Applies dropout layers as a regularization technique to prevent overfitting in neural networks. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch14))
- [Training Execution Loops](https://awesome-repositories.com/f/artificial-intelligence-ml/neural-training-pipelines/training-execution-loops.md) — Implements the full training process, including parameter initialization and minibatch gradient descent, within interactive notebooks. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch11))
- [Nonlinear Relationship Modeling](https://awesome-repositories.com/f/artificial-intelligence-ml/nonlinear-relationship-modeling.md) — Captures complex patterns in data using polynomial transformations, decision trees, and random forests. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch09))
- [Online Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/online-learning.md) — Implements online learning algorithms that update model parameters continuously as new data arrives.
- [OpenAI Gym Integrations](https://awesome-repositories.com/f/artificial-intelligence-ml/openai-gym-integrations.md) — Connects reinforcement learning algorithms to standardized OpenAI Gym environments for agent validation. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch19))
- [Gradient Descent Algorithms](https://awesome-repositories.com/f/artificial-intelligence-ml/optimization-algorithms/gradient-descent-algorithms.md) — Applies gradient descent and stochastic gradient descent to optimize model parameters during the training process. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch02))
- [Parameter Optimizers](https://awesome-repositories.com/f/artificial-intelligence-ml/parameter-optimizers.md) — Refines model weights using gradient-based optimization algorithms to minimize loss and improve accuracy. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch12))
- [Recurrent Neural Networks](https://awesome-repositories.com/f/artificial-intelligence-ml/recurrent-neural-networks.md) — Implements recurrent neural network architectures for processing sequential data and time-series prediction. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch15))
- [Robust Regression](https://awesome-repositories.com/f/artificial-intelligence-ml/regression-analysis/robust-regression.md) — Implements RANSAC to minimize the influence of outliers when estimating linear model parameters. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch09))
- [Sentiment Analysis Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/sentiment-analysis-tools.md) — Provides implementations for classifying the emotional tone of text using embedding layers and recurrent architectures. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch15))
- [Text Sequence Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-generation/autoregressive-text-generation/text-sequence-generation.md) — Generates text by repeatedly sampling the next token from the probability distribution learned by a recurrent language model. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch15))
- [Sequence Learning Models](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-learning-models.md) — Implements architectures and training methods for mapping input sequences to output sequences.
- [Sequential and Graph Data Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/sequential-and-graph-data-analysis.md) — Processes non-tabular data structures using a combination of recurrent architectures and graph neural networks. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Linear Discriminant Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/supervised-classification/dimensionality-reduction/linear-discriminant-analysis.md) — Implements linear discriminant analysis as a supervised dimensionality reduction technique to maximize class separability.
- [Synthetic Data Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/synthetic-data-generators.md) — Produces synthetic data samples by training competing generator and discriminator models. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch17))
- [Categorical Encodings](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-numeric-transformations/categorical-encodings.md) — Transforms label-based categorical data into numerical formats using mapping and one-hot encoding. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch04))
- [Latent Dirichlet Allocations](https://awesome-repositories.com/f/artificial-intelligence-ml/topic-modeling-libraries/latent-dirichlet-allocations.md) — Implements Latent Dirichlet Allocation to discover hidden thematic structures within document collections. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch08))
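
Several entries above refer to Q-learning agents. The following is a minimal sketch of the tabular Q-learning update with epsilon-greedy exploration, written in plain Python. It does not reproduce the repository's code: the five-state line-walk environment, the `step` and `train` functions, and the values of `ALPHA`, `GAMMA`, and `EPSILON` are all assumptions chosen for illustration, and no Gym environment is involved.

```python
import random

N_STATES = 5          # states 0..4; reaching state 4 ends the episode with reward 1
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # assumed hyperparameters


def step(state, action):
    """Hypothetical toy environment: a deterministic walk on a line."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: explore at random with probability EPSILON
            # (or when both actions are tied), otherwise act greedily.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # The Q-learning target uses the greedy value of the next state.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy action for each non-terminal state after training.
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
print(greedy_policy)
```

A Deep Q-Network replaces the table `q` with a neural network and typically draws its updates from a replay buffer, but the target computation follows the same pattern.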

### Part of an Awesome List

- [Predictive Classifiers](https://awesome-repositories.com/f/awesome-lists/ai/general-purpose-machine-learning/predictive-classifiers.md) — Provides practical implementations of diverse supervised learning models including SVMs, Decision Trees, and K-Nearest Neighbors. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch03))
- [Model Fine-Tuning](https://awesome-repositories.com/f/awesome-lists/ai/model-training-and-fine-tuning/model-fine-tuning.md) — Adapts pre-trained models to custom datasets using specialized training interfaces and tokenization. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch16))
- [Training and Regularization](https://awesome-repositories.com/f/awesome-lists/ai/training-and-regularization.md) — Utilizes regularization and slack variables to ensure models generalize effectively to unseen data. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch03))
- [Data Science Projects](https://awesome-repositories.com/f/awesome-lists/data/data-science-projects.md) — Notebooks accompanying a comprehensive machine learning textbook.
- [Learning Resources](https://awesome-repositories.com/f/awesome-lists/learning/learning-resources.md) — A structured book resource for learning machine learning theory and practice.

### Data & Databases

- [Generative Data Synthesis](https://awesome-repositories.com/f/data-databases/custom-persistence-strategies/custom-generator-definitions/adversarial-data-generators/generative-data-synthesis.md) — Employs generative adversarial networks to create artificial data samples that mimic real datasets. ([source](https://github.com/rasbt/machine-learning-book#readme))
- [Data Cleaning Procedures](https://awesome-repositories.com/f/data-databases/data-cleaning-procedures.md) — Implements procedures for filtering and correcting errors in datasets to improve overall data quality. ([source](https://github.com/rasbt/machine-learning-book/blob/main/README.md))
- [Training Data Pipelines](https://awesome-repositories.com/f/data-databases/data-processing-pipelines/data-processing/ml-data-pipelines/training-data-pipelines.md) — Provides pipelines to load and organize raw data into shuffled, batched sets for efficient model training. ([source](https://github.com/rasbt/machine-learning-book/blob/main/ch12))
- [Data Transformation Pipelines](https://awesome-repositories.com/f/data-databases/data-transformation-pipelines.md) — Chains data preprocessing, feature engineering, and model estimation into workflows to optimize data for model consumption.
- [Missing Data Imputation](https://awesome-repositories.com/f/data-databases/missing-data-imputation.md) — Provides methods for filling gaps in tabular datasets using constant values or column statistics such as the mean. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch04))
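
To make the imputation entry above concrete, here is a small plain-Python sketch of mean imputation, where each missing value is replaced by the mean of the observed values in its column. The function name `impute_column_means`, the use of `None` as the missing-value marker, and the sample table are assumptions for this example. scikit-learn's `SimpleImputer(strategy="mean")` offers the same behaviour for NumPy arrays.

```python
def impute_column_means(rows):
    """Return a copy of `rows` where each None is replaced by its column mean."""
    n_cols = len(rows[0])
    means = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not None]
        if not observed:
            raise ValueError(f"column {col} has no observed values")
        means.append(sum(observed) / len(observed))
    # Build new rows so the input table is left unchanged.
    return [
        [means[col] if value is None else value for col, value in enumerate(row)]
        for row in rows
    ]


# Hypothetical table with two missing entries.
data = [
    [1.0, 2.0, 3.0],
    [5.0, None, 8.0],
    [10.0, 11.0, None],
]
filled = impute_column_means(data)
print(filled)
```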

### Scientific & Mathematical Computing

- [Principal Component Analysis](https://awesome-repositories.com/f/scientific-mathematical-computing/linear-algebra-routines/principal-component-analysis.md) — Implements principal component analysis, which projects data onto the directions of maximum variance obtained from an eigendecomposition of the covariance matrix. ([source](https://github.com/rasbt/machine-learning-book/tree/main/ch05))
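
The idea behind principal component analysis can be sketched without any libraries: centre the data, form the covariance matrix, and find its dominant eigenvector. The example below uses power iteration for the last step, which is a simplification chosen for brevity and not necessarily the approach taken in the repository. The function name, the iteration count, and the sample points are assumptions, and the sketch assumes the data is not constant.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (unit direction of maximum variance, centred data)."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix of the centred data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly apply the covariance matrix and renormalise.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v, centered


# Hypothetical data lying exactly on the line y = 2x.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
component, centered = first_principal_component(points)
# Project each centred point onto the component to get a 1-D representation.
projected = [sum(c * v for c, v in zip(row, component)) for row in centered]
print(component, projected)
```

Because the sample points lie on a single line, the recovered direction is proportional to (1, 2) and the one-dimensional projection loses no information.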
