Comprehensive frameworks for orchestrating distributed training of large language models.
Distinguishing note: Focuses on the end-to-end orchestration of multi-node training jobs.
Explore 1 GitHub repository matching Artificial Intelligence & ML · Large Scale Training Suites.
Open-r1 is a framework designed for the large-scale training, distillation, and optimization of language models focused on complex reasoning and programming tasks. It provides a comprehensive suite of tools for managing distributed training jobs across multi-node clusters, enabling the development of high-performance models through reinforcement learning and supervised fine-tuning. The project distinguishes itself by integrating secure, containerized code execution environments directly into the training and evaluation lifecycle. By allowing models to run and verify code snippets against test
Orchestrates distributed training jobs across multi-node computing clusters to scale the development of high-performance language models.