What are the best open-source alternatives to Direct Preference Optimization?

30 open-source projects similar to eric-mitchell/direct-preference-optimization, ranked by shared features. Top picks: internlm/xtuner, allenai/open-instruct, huggingface/alignment-handbook, nndl/llm-beginner, inclusionai/areal, eleutherai/gpt-neox, thinking-machines-lab/tinker-cookbook, meta-pytorch/torchtune, openrlhf/openrlhf, carperai/trlx.

Is internlm/xtuner a good alternative to Direct Preference Optimization?

xtuner is a comprehensive training engine for large language models, offering a toolkit for pre-training, supervised fine-tuning, and the optimization of vision-language multimodal models. It serves as a distributed training accelerator and a specialized framework for scaling Mixture-of-Experts mod…

Is allenai/open-instruct a good alternative to Direct Preference Optimization?

Open-Instruct is a distributed training and instruction tuning framework for large language models. It functions as a coordinator for supervised fine-tuning, reinforcement learning from human feedback pipelines, and tool-use training, providing specialized roles for dataset curation and model align…

Is huggingface/alignment-handbook a good alternative to Direct Preference Optimization?

This project is an alignment framework and suite of pipelines for training language models using supervised fine-tuning and preference optimization. It provides tools for executing large-scale distributed training across multiple GPUs and compute nodes, alongside a system for measuring model helpfu…

Is nndl/llm-beginner a good alternative to Direct Preference Optimization?

This project is a collection of educational resources and technical guides focused on the development and implementation of large language models. It provides a comprehensive curriculum covering transformer architectures, training methods, and deployment strategies. The materials provide detailed…

Is inclusionai/areal a good alternative to Direct Preference Optimization?

AReaL is a system for agent orchestration, distributed model training, and parameter-efficient tuning. It provides a framework for developing multi-turn reasoning agents and training large models using reinforcement learning from human feedback. The project implements a toolkit for improving the v…

Is eleutherai/gpt-neox a good alternative to Direct Preference Optimization?

gpt-neox is a distributed training system and framework for building large-scale autoregressive language models. It implements the transformer architecture and provides a toolkit for training models with billions of parameters by distributing weights across compute clusters. The framework distingu…

Is thinking-machines-lab/tinker-cookbook a good alternative to Direct Preference Optimization?

Tinker Cookbook is an open-source framework for fine-tuning large language models, supporting supervised learning, reinforcement learning, and parameter-efficient techniques like LoRA adapters. It provides a complete pipeline for aligning models with human preferences through multi-stage RLHF workf…

Is meta-pytorch/torchtune a good alternative to Direct Preference Optimization?

Torchtune is a PyTorch-native library for fine-tuning, aligning, and quantizing large language models. It provides a config-driven system for instantiating components, orchestrating distributed training, and managing parameter-efficient fine-tuning with quantization support, all through YAML-based…

Is openrlhf/openrlhf a good alternative to Direct Preference Optimization?

OpenRLHF is a training framework and alignment library designed for reinforcement learning from human feedback across distributed GPU clusters. It provides tools for aligning large language models and multimodal vision-language models using algorithms such as PPO, GRPO, and DPO. The framework dist…

Is carperai/trlx a good alternative to Direct Preference Optimization?

trlx is a reinforcement learning library and training framework designed to align large language models using human feedback. It serves as a distributed trainer and compute orchestrator for scaling high-parameter models across multiple GPUs and nodes. The project provides tools for reinforcement l…

Open-source alternatives to Direct Preference Optimization

30 open-source projects similar to eric-mitchell/direct-preference-optimization, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Direct Preference Optimization alternative.

InternLM/xtuner
5,150 stars
xtuner is a comprehensive training engine for large language models, offering a toolkit for pre-training, supervised fine-tuning, and the optimization of vision-language multimodal models. It serves as a distributed training accelerator and a specialized framework for scaling Mixture-of-Experts models and aligning model behavior through reinforcement learning from human feedback. The project distinguishes itself through advanced memory and compute optimizations, such as sequence parallelism for ultra-long context windows and interleaved pipeline parallelism to reduce GPU idle time. It provide…
Python, agent, deepseek-v3, gpt-oss
allenai/open-instruct
3,586 stars
Open-Instruct is a distributed training and instruction tuning framework for large language models. It functions as a coordinator for supervised fine-tuning, reinforcement learning from human feedback pipelines, and tool-use training, providing specialized roles for dataset curation and model alignment. The project distinguishes itself through a high-performance training architecture that utilizes actor-based distributed coordination and hybrid sharding to manage large GPU clusters. It implements advanced alignment techniques including direct preference optimization, group relative policy opt…
Python
huggingface/alignment-handbook
5,621 stars
This project is an alignment framework and suite of pipelines for training language models using supervised fine-tuning and preference optimization. It provides tools for executing large-scale distributed training across multiple GPUs and compute nodes, alongside a system for measuring model helpfulness and dialogue quality through single-turn and multi-turn benchmarks. The framework includes specialized tools for direct preference optimization to refine model behavior using paired data without a separate reward model. It also supports constitutional AI alignment and the training of reward mo…
Python
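
For readers comparing these projects, the idea of optimizing directly on paired data without a separate reward model can be sketched as a per-pair loss. This is a minimal illustration of the standard DPO objective, not the API of any project listed here: the function name `dpo_loss`, its arguments, and the default `beta=0.1` are all assumptions made for the example. Inputs are assumed to be summed log-probabilities of each full response under the trained policy and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss (hypothetical helper, for illustration only)."""
    # Implicit rewards: how much more likely the policy makes each
    # response compared with the frozen reference model, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward

    # Loss is -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # computed in a form that avoids overflow for large |margin|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy prefers the chosen response more than the reference does,
# so the loss falls below log(2), the value at zero margin.
loss = dpo_loss(-10.0, -12.0, -11.0, -11.0)
```

The loss shrinks as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model, which is why no explicit reward model is needed. Real implementations typically compute this over batches of tensors.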

InternLM/xtuner

allenai/open-instruct

huggingface/alignment-handbook

nndl/llm-beginner

inclusionAI/AReaL

EleutherAI/gpt-neox

thinking-machines-lab/tinker-cookbook

meta-pytorch/torchtune

OpenRLHF/OpenRLHF

carperai/trlx

philschmid/deep-learning-pytorch-huggingface

PaddlePaddle/PaddleNLP

shibing624/MedicalGPT

snowkylin/tensorflow-handbook

Infrasys-AI/AIInfra

kenshohara/3D-ResNets-PyTorch

microsoft/DeepSpeed

volcengine/verl

facebookresearch/pythia

Infrasys-AI/AISystem

pytorch/torchtune

NVIDIA-NeMo/NeMo

mlfoundations/open_clip

mindspore-ai/mindspore

afshinea/stanford-cme-295-transformers-large-language-models

OpenLMLab/MOSS

FedML-AI/FedML

lvwerra/trl

facebookresearch/metaseq

google-research/text-to-text-transfer-transformer