Deep-Agent/R1-V

0 stars, 0 forks, 0 views

R1 V

Features

Reasoning Models - Vision-capable reasoning model implementation.
Reinforcement Learning Frameworks - Reinforcement learning implementation for visual reasoning agents.

Open-source alternatives to R1 V

Similar open-source projects, ranked by how many features they share with R1 V.

hiyouga/EasyR1
5,034 stars
EasyR1 is a distributed model training system and reinforcement learning framework for large language and vision-language models. It functions as a multimodal trainer and an implementation of a Proximal Policy Optimization pipeline designed to refine the reasoning and perception capabilities of models that process both text and images. The system specializes in distributing reinforcement learning workloads across multiple compute nodes to manage high memory requirements. It optimizes hardware utilization through padding-free training and fine-tuning to fit large models onto available graphics
Python
inclusionAI/AReaL
3,559 stars
AReaL is a system for agent orchestration, distributed model training, and parameter-efficient tuning. It provides a framework for developing multi-turn reasoning agents and training large models using reinforcement learning from human feedback. The project implements a toolkit for improving the visual reasoning and geometry problem solving capabilities of vision-language models. It utilizes a memory-efficient tuning system to optimize mathematical and reasoning models across different inference backends. The infrastructure supports large-scale training through tensor, pipeline, and expert p
Python
agent, llm, llm-agent
agentica-project/rllm
400 stars
🚀 Reinforcement Learning for Language Agents🌟
Jupyter Notebook
Jiayi-Pan/TinyZero
13,168 stars
TinyZero is a reinforcement learning framework and implementation designed to train language models to develop reasoning and self-verification abilities. It provides a training pipeline to optimize model performance on mathematical and logical tasks. The project serves as a minimal reproduction of the DeepSeek R1 architectural and training approach. It focuses on creating reasoning models that can solve structured problems through autonomous chain-of-thought discovery. The framework incorporates group relative policy optimization and reward-based self-correction to improve accuracy on logica
Python

