awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
ML From Scratch | Awesome Repository

eriklindernoren/ML-From-Scratch

30,849 stars · 5,186 forks · Python · MIT license · 1 view

ML From Scratch

Features

  • Machine Learning Toolkits - Provides a collection of fundamental algorithms implemented from scratch to demonstrate core learning mechanics.
  • Supervised Learning - Trains models on labeled datasets to predict outcomes or classify observations.
  • Clustering Algorithms - Partitions datasets into clusters by iteratively assigning points to the nearest center and updating the centers to minimize within-cluster variance.
  • Deep Learning Architectures - Builds custom deep learning models by stacking layers and activation functions.
  • Multilayer Perceptrons - Solves complex classification problems by defining hidden layers and backpropagation logic to learn non-linear mappings from input data.
  • Random Forest Ensembles - Improves predictive accuracy and reduces variance by combining the outputs of multiple decision trees trained on random data subsets.
  • Recurrent Neural Networks - Implements backpropagation through time to train sequential models by unrolling recurrent structures.
  • Reinforcement Learning - Develops agents that learn to make sequences of decisions to maximize cumulative rewards.
  • Unsupervised Learning - Discovers hidden structures and patterns in unlabeled data using clustering and dimensionality reduction.
  • Educational Examples - Provides a collection of machine learning models and algorithms built from scratch for educational purposes.
  • Machine Learning Tutorials - Provides fundamental machine learning algorithm implementations built from scratch for educational purposes.
  • Ensemble Learning - Combines multiple weak learners into robust models using boosting and bagging strategies.
  • Expectation-Maximization Models - Groups data points by iteratively calculating membership probabilities to optimize the fit of statistical distributions to the data.
  • Gradient Boosting - Minimizes prediction error by adding decision trees one at a time, each fitted to the negative gradient of the loss on the current ensemble's predictions.
  • Logistic Regression Models - Predicts the likelihood of binary outcomes by applying the sigmoid function and optimizing weights through gradient descent.
  • Neural Network Architectures - Constructs custom neural networks by defining specific layers, activation functions, and loss functions.
  • Neural Network Frameworks - Constructs complex neural network architectures by chaining independent functional blocks.
  • Clustering Suites - Provides a collection of density and centroid-based algorithms for grouping unlabeled data.
  • Computer Vision - Recognizes visual patterns in images by training convolutional neural networks to extract features.
  • Decision Trees - Builds decision-based models by repeatedly splitting datasets into subsets based on feature thresholds.
  • Deep Learning Frameworks - Constructs custom neural network architectures to demonstrate backpropagation and gradient descent mechanics.
  • Generative Adversarial Networks - Creates synthetic images of handwritten digits by training generative adversarial networks.
  • Naive Bayes Classifiers - Classifies data points by calculating the probability of class membership based on the statistical distribution of input features.
  • Optimization Algorithms - Updates model parameters iteratively by calculating partial derivatives of the loss function.
  • Sequential Learning - Processes time-series data by using recurrent neural networks to capture temporal dependencies.
  • Autoencoders - Reduces data complexity by training encoder and decoder networks to reconstruct original inputs.
  • Density-Based Clustering - Identifies clusters of arbitrary shapes by analyzing local data density.
  • Ensemble Learning Libraries - Ships a suite of boosting and bagging implementations to improve predictive performance.
  • Evolutionary Algorithms - Optimizes neural network architectures and weights using evolutionary strategies.
  • Perceptron Classifiers - Creates simple linear models by iteratively adjusting weights based on prediction errors to separate labeled data points.
  • Algorithm References - Provides clean, readable code examples for standard statistical and neural network architectures.
  • Association Rule Learning - Discovers frequent patterns and relationships within transactional datasets using the Apriori algorithm.
  • Boltzmann Machines - Represents complex data structures by training Boltzmann machines to learn underlying feature patterns.
  • Dimensionality Reduction - Identifies linear combinations of features that maximize class separation to simplify datasets or improve classification performance.
  • Gaussian Mixture Models - Implements Gaussian mixture models to represent complex data distributions using expectation-maximization.
  • Genetic Algorithms - Solves optimization problems by simulating natural selection processes to evolve candidate solutions.
  • K-Medoids Clustering - Groups data points by minimizing total dissimilarity to representative medoids.
  • Numerical Computing Libraries - Performs mathematical operations on multidimensional arrays to accelerate linear algebra calculations.

This project is an educational toolkit that provides implementations of fundamental machine learning algorithms built from scratch. By avoiding high-level library abstractions, it serves as a pedagogical reference for understanding the mathematical foundations and core mechanics of supervised learning, unsupervised learning, and reinforcement learning models.
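
To make the "from scratch" idea concrete, here is a minimal sketch of the logistic regression entry in the feature list: a sigmoid applied to a linear score, with weights fitted by batch gradient descent. It was written for this page and is not taken from the repository; the function names (`sigmoid`, `fit_logistic_regression`, `predict`), the hyperparameters, and the toy data are all assumptions.

```python
import numpy as np


def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic_regression(X, y, learning_rate=0.1, n_iterations=2000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(n_iterations):
        probabilities = sigmoid(X @ weights + bias)
        # For the log loss, the gradient with respect to the linear score
        # is simply (predicted probability - true label).
        error = probabilities - y
        weights -= learning_rate * (X.T @ error) / n_samples
        bias -= learning_rate * error.mean()
    return weights, bias


def predict(X, weights, bias):
    """Threshold the predicted probability at 0.5."""
    return (sigmoid(X @ weights + bias) >= 0.5).astype(int)


# Hypothetical toy data: one centered feature, label 1 when the value is positive.
X = np.array([[-2.5], [-1.5], [-0.5], [0.5], [1.5], [2.5]])
y = np.array([0, 0, 0, 1, 1, 1])

weights, bias = fit_logistic_regression(X, y)
predictions = predict(X, weights, bias)
```

On this separable toy set the learned weight is positive and the predictions match the labels. Real data would normally call for feature scaling and a held-out set, which the sketch leaves out.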

The repository distinguishes itself through a modular approach to model construction, allowing users to build custom neural networks by chaining independent functional blocks. It covers a wide range of techniques, including gradient-based weight optimization, backpropagation through time for sequential data, and ensemble-based aggregation methods like boosting and bagging. These implementations rely on vectorized computation to perform linear algebra operations, providing a transparent view into how models learn from data.
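
The idea of chaining independent blocks can be sketched as follows. Each block exposes a `forward` and a `backward` method, and a container runs them in order and then in reverse. The class names (`Dense`, `Tanh`, `Sequential`), the method signatures, and the XOR data are illustrative assumptions, not the repository's actual API.

```python
import numpy as np


class Dense:
    """Fully connected block: output = inputs @ W + b."""

    def __init__(self, n_inputs, n_outputs, rng):
        self.W = rng.normal(0.0, 0.5, size=(n_inputs, n_outputs))
        self.b = np.zeros(n_outputs)

    def forward(self, inputs):
        self.inputs = inputs  # cached for the backward pass
        return inputs @ self.W + self.b

    def backward(self, grad_output, learning_rate):
        grad_inputs = grad_output @ self.W.T  # computed before W is updated
        self.W -= learning_rate * self.inputs.T @ grad_output
        self.b -= learning_rate * grad_output.sum(axis=0)
        return grad_inputs


class Tanh:
    """Element-wise activation block with no trainable parameters."""

    def forward(self, inputs):
        self.output = np.tanh(inputs)
        return self.output

    def backward(self, grad_output, learning_rate):
        return grad_output * (1.0 - self.output ** 2)


class Sequential:
    """Chains blocks: forward in order, backward in reverse order."""

    def __init__(self, blocks):
        self.blocks = blocks

    def forward(self, inputs):
        for block in self.blocks:
            inputs = block.forward(inputs)
        return inputs

    def train_step(self, X, y, learning_rate):
        predictions = self.forward(X)
        loss = np.mean((predictions - y) ** 2)
        grad = 2.0 * (predictions - y) / y.size  # gradient of the mean squared error
        for block in reversed(self.blocks):
            grad = block.backward(grad, learning_rate)
        return loss  # loss measured before this step's update


rng = np.random.default_rng(0)
model = Sequential([Dense(2, 4, rng), Tanh(), Dense(4, 1, rng)])

# XOR inputs and targets, a common toy problem that needs a hidden layer.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([[0.0], [1.0], [1.0], [0.0]])

losses = [model.train_step(X, y, learning_rate=0.1) for _ in range(500)]
```

Because every block only needs to know its own local gradient, a new layer type can be added without touching the container. That is the main appeal of this design for teaching backpropagation.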

The collection covers a broad range of methods, from classic statistical techniques and decision trees to deep learning architectures and clustering algorithms. It includes resources for training agents in dynamic environments, performing dimensionality reduction, and discovering patterns in unlabeled datasets. The project is structured as a comprehensive reference, with documentation and installation instructions provided to help users configure their local environments for experimentation.
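
As an example of discovering patterns in unlabeled data, the centroid-based clustering described in the feature list (assign each point to the nearest center, then move the centers) can be sketched in a few lines of NumPy. The function name `k_means`, its parameters, and the toy points are assumptions made for illustration. Initializing from the first rows is a simplification chosen to keep the example deterministic; random or k-means++ initialization is more common in practice.

```python
import numpy as np


def k_means(X, n_clusters, n_iterations=100):
    """Cluster the rows of X by alternating assignment and centroid updates."""
    # Simplified, deterministic initialization: the first n_clusters rows.
    centroids = X[:n_clusters].copy()
    for _ in range(n_iterations):
        # Distance from every point to every centroid, via broadcasting.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its points; keep it if it has none.
        new_centroids = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centroids[k]
            for k in range(n_clusters)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids


# Hypothetical toy data: two well-separated groups of three points each.
X = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
])

labels, centroids = k_means(X, n_clusters=2)
```

Even though both starting centroids fall in the first group, the second iteration pulls one of them across to the far group, and the loop stops once the centers no longer move.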