Training Data Curators · Awesome GitHub Repositories

1 repo

Tools for cleaning, filtering, and synthesizing high-quality datasets.

Distinguishing note: Focuses on the end-to-end curation lifecycle for reasoning-task datasets.

Explore 1 awesome GitHub repository matching Artificial Intelligence & ML · Training Data Curators.

Awesome Training Data Curators GitHub Repositories

    huggingface/open-r1

    25,887 · View on GitHub

    Open-r1 is a framework designed for the large-scale training, distillation, and optimization of language models focused on complex reasoning and programming tasks. It provides a comprehensive suite of tools for managing distributed training jobs across multi-node clusters, enabling the development of high-performance models through reinforcement learning and supervised fine-tuning. The project distinguishes itself by integrating secure, containerized code execution environments directly into the training and evaluation lifecycle. By allowing models to run and verify code snippets against test

    Cleans, filters, and synthesizes high-quality datasets to ensure model integrity and improve performance on specialized reasoning tasks.

    Python
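
To make "cleans, filters, and synthesizes" a little more concrete, here is a minimal sketch of the cleaning and filtering half of a curation pass over reasoning examples. It is not taken from open-r1 and does not use its API: the function names `normalize` and `curate`, the `prompt`/`answer` record fields, and the length thresholds are all assumptions chosen for illustration.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so trivially different prompts compare equal."""
    return re.sub(r"\s+", " ", text).strip().lower()


def curate(records, min_answer_chars=1, max_prompt_chars=2000):
    """Hypothetical curation pass: clean, filter, and exactly deduplicate records.

    Each record is assumed to be a dict with "prompt" and "answer" keys.
    """
    seen = set()
    kept = []
    for rec in records:
        # Clean: tolerate missing fields and strip surrounding whitespace.
        prompt = (rec.get("prompt") or "").strip()
        answer = (rec.get("answer") or "").strip()

        # Filter: drop empty prompts, too-short answers, and oversized prompts.
        if not prompt or len(answer) < min_answer_chars or len(prompt) > max_prompt_chars:
            continue

        # Deduplicate: hash the normalized prompt and skip repeats.
        key = hashlib.sha256(normalize(prompt).encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        kept.append({"prompt": prompt, "answer": answer})
    return kept


if __name__ == "__main__":
    sample = [
        {"prompt": "What is 2+2?", "answer": "4"},
        {"prompt": "  what is  2+2? ", "answer": "4"},  # duplicate after normalization
        {"prompt": "Prove the lemma.", "answer": ""},   # no answer, filtered out
    ]
    for row in curate(sample):
        print(row)
```

Exact hashing only catches duplicates that match after normalization; a real pipeline would likely add near-duplicate detection and task-specific answer verification on top of a pass like this.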