1 repo

Awesome GitHub RepositoriesSupervised Fine-Tuning Datasets

Tools for formatting and structuring dialogue and instruction-following data for supervised model training.

Distinguishing note: Focuses on multi-turn conversation and tool-calling formats for instruction tuning, distinct from raw pretraining or preference alignment.

Explore 1 awesome GitHub repository matching artificial intelligence & ml · Supervised Fine-Tuning Datasets. Refine with filters or upvote what's useful.

Find the best repos with AI.We'll search the best matching repositories with AI.

jingyaogong/minimind
jingyaogong/minimind
39,663View on GitHub
This project is a comprehensive framework for the entire lifecycle of transformer-based language models, supporting everything from foundational pretraining to specialized deployment. It provides a modular toolkit for defining neural network architectures, managing data preparation pipelines, and executing training routines across various scales. The framework is designed to handle the full model development process, including supervised fine-tuning, behavioral alignment, and the integration of agentic capabilities. What distinguishes this framework is its focus on efficient training and adva
The framework supports structuring supervised datasets into multi-turn conversation formats, including dialogue and tool-calling sequences to improve model instruction following and task performance.
Pythonartificial-intelligencelarge-language-model
39,663View on GitHub