30 open-source projects similar to agentica-project/deepscaler, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best DeepScaleR alternative.
Qwen2.5 is a suite of foundation large language models designed for natural language generation, code generation, and complex mathematical reasoning. The project includes a multilingual model that handles dozens of languages and a specialized code model for technical problem solving and debugging. It is distinguished by its long-context capabilities, which enable the analysis of very large inputs ranging from 256K up to 1 million tokens. It also functions as an agentic framework, using standardized templates and parsers to execute autonomous wo
🚀 Reinforcement Learning for Language Agents🌟
DeepSeek-R1 is an open-weights large language model focused on advanced reasoning. It uses chain-of-thought processing and internal monologues to solve complex mathematical and logical problems by breaking them into sequential, verifiable steps. The model is trained with reinforcement learning to optimize its reasoning patterns and verify logical steps. A distillation process then transfers these reasoning capabilities from a large teacher model into smaller, computationally efficient versions. The training framework incorporates group relative policy optimiz
Reinforcement Learning Tuning for VideoLLMs: Reward Design and Data Efficiency
From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓
Exploring Applications of GRPO
An Open Large Reasoning Model for Real-World Solutions
Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"
Checkpoints take up a lot of space. Please email yninghong@gmail.com if you need them.
🧐 About | 🚀 Quick Start | 🐣 Agentless Mini | 📝 Citation | 🙏 Acknowledgements
R1-onevision, a visual language model capable of deep CoT reasoning.
OpenSeek aims to unite the global open source community to drive collaborative innovation in algorithms, data and systems to develop next-generation models.
Medical o1, Towards medical complex reasoning with LLMs
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.
EasyR1 is a distributed model training system and reinforcement learning framework for large language and vision-language models. It functions as a multimodal trainer and an implementation of a Proximal Policy Optimization pipeline designed to refine the reasoning and perception capabilities of models that process both text and images. The system specializes in distributing reinforcement learning workloads across multiple compute nodes to manage high memory requirements. It optimizes hardware utilization through padding-free training and fine-tuning to fit large models onto available graphics