awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Model Training Pipelines · Awesome GitHub Repositories

End-to-end workflows and scripts for sourcing datasets, training models, and validating performance across various machine learning tasks.

Explore 5 awesome GitHub repositories in Artificial Intelligence & ML · Model Training Pipelines.

Awesome Model Training Pipelines GitHub Repositories

  • d2l-ai/d2l-zh

    75,708 · GitHub

    This project is an open-source, interactive educational platform designed to teach deep learning through a comprehensive, code-first curriculum. It provides a structured learning path that covers foundational mathematics, modern neural network architectures, and practical optimization techniques, enabling practitioners

    Organizes end-to-end workflows that manage data sourcing, model training, and performance validation.

    Python · book · chinese · computer-vision
  • awesomedata/awesome-public-datasets

    72,846 · GitHub

    This project is a community-maintained, open-access directory of high-quality public datasets. It serves as a centralized reference point for researchers, developers, and data scientists to locate reliable information sources across a wide spectrum of industries and scientific fields. By providing a structured index, t

    Supplies a diverse collection of labeled datasets essential for training, validating, and benchmarking predictive models.

    aaron-swartz · awesome-public-datasets · datasets
  • tesseract-ocr/tesseract

    72,460 · GitHub

    Tesseract is a neural network-based optical character recognition engine designed to convert scanned images and digital documents into machine-readable, searchable text. It functions as both a command-line utility for automating large-scale digitization workflows and a cross-platform library that can be embedded into d

    Supports building custom recognition models with the provided training scripts and makefiles, optimizing performance for specific document types.

    C++ · hacktoberfest · lstm · machine-learning
  • CorentinJ/Real-Time-Voice-Cloning

    59,355 · GitHub

    This project is a neural text-to-speech engine and voice cloning toolkit designed to generate synthetic speech that mimics the vocal characteristics of a target speaker. It functions as a real-time audio synthesizer, utilizing a deep learning pipeline to convert written text into high-fidelity speech output with minima

    Automates the end-to-end workflow for sourcing data, training neural models, and validating synthesis performance.

    Python · deep-learning · python · pytorch
  • karpathy/nanoGPT

    53,461 · GitHub

    nanoGPT is a lightweight engine for training and fine-tuning transformer-based language models from scratch. It provides a minimalist codebase designed for educational exploration and rapid experimentation with neural network architectures, utilizing self-attention and feed-forward layers to process sequences and predi

    Coordinates end-to-end workflows for training and fine-tuning models across various hardware accelerators.

    Python