awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Training Data Composers · Awesome GitHub Repositories

1 repo

Tools for mixing and managing diverse datasets for model training.

Distinguishing note: Focuses on configuration-based dataset mixture management for fine-tuning.

Explore 1 awesome GitHub repository in Artificial Intelligence & ML · Training Data Composers.

Awesome Training Data Composers GitHub Repositories

    huggingface/open-r1

    25,887 · View on GitHub↗

    Open-r1 is a framework designed for the large-scale training, distillation, and optimization of language models focused on complex reasoning and programming tasks. It provides a comprehensive suite of tools for managing distributed training jobs across multi-node clusters, enabling the development of high-performance models through reinforcement learning and supervised fine-tuning. The project distinguishes itself by integrating secure, containerized code execution environments directly into the training and evaluation lifecycle. By allowing models to run and verify code snippets against test

    Combines different data sources by modifying configuration files to create custom dataset mixtures for flexible training.

    Python
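
The configuration-based mixing described in this entry can be illustrated with a minimal, generic sketch. It does not reproduce open-r1's actual configuration schema or API: the config keys (`seed`, `datasets`, `name`, `weight`), the in-memory `SOURCES`, and the `build_mixture` helper are all hypothetical names chosen for illustration. The sketch assumes each weight is the fraction of a source to keep, so changing the mixture means editing the config rather than the code.

```python
import random

# Hypothetical mixture config (illustrative keys, not open-r1's schema).
# "weight" is assumed to mean the fraction of each source to keep.
MIXTURE_CONFIG = {
    "seed": 0,
    "datasets": [
        {"name": "math", "weight": 0.5},
        {"name": "code", "weight": 1.0},
    ],
}

# Stand-in data sources; a real setup would load these from disk or a hub.
SOURCES = {
    "math": [{"prompt": f"math-{i}"} for i in range(10)],
    "code": [{"prompt": f"code-{i}"} for i in range(4)],
}


def build_mixture(config, sources):
    """Subsample each source by its weight, tag rows with their origin,
    concatenate everything, and shuffle with a fixed seed."""
    rng = random.Random(config["seed"])
    mixed = []
    for entry in config["datasets"]:
        rows = sources[entry["name"]]
        weight = entry.get("weight", 1.0)
        if not 0.0 <= weight <= 1.0:
            raise ValueError(f"weight must be in [0, 1], got {weight}")
        keep = int(len(rows) * weight)  # number of rows kept from this source
        sampled = rng.sample(rows, keep)
        mixed.extend({**row, "source": entry["name"]} for row in sampled)
    rng.shuffle(mixed)
    return mixed


mixture = build_mixture(MIXTURE_CONFIG, SOURCES)
print(len(mixture), "rows in the mixture")
```

Because the seed lives in the config, rebuilding the mixture from the same config yields the same rows in the same order, which keeps training runs comparable.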