30 open-source projects similar to dlr-rm/rl-baselines3-zoo, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best RL Baselines3 Zoo alternative.
Dopamine is a reinforcement learning research framework designed for prototyping and testing algorithms across diverse simulated environments. It provides an agent development toolkit that utilizes a flat class hierarchy to facilitate the creation and extension of learning agents. The framework includes a standardization layer via environment wrappers that connect agents to various physics simulations and gaming environments. It also features a high-performance experience replay buffer for storing and sampling transition data to improve training stability, alongside a dedicated hyperparameter
Baselines is a comprehensive suite of frameworks for reinforcement learning algorithm implementation, imitation learning, and training orchestration. It provides a library of standardized learning algorithms used to benchmark and replicate research results, alongside a deep learning policy framework for constructing neural network architectures such as multi-layer perceptrons, convolutional networks, and long short-term memory networks. The project includes a specialized imitation learning toolkit that enables agents to mimic expert behavior through behavior cloning and generative adversarial
CleanRL is a reinforcement learning library and PyTorch framework providing a suite of reproducible implementations for online reinforcement learning algorithms. It serves as a deep reinforcement learning benchmark suite and experiment orchestrator designed for research and agent development across both discrete and continuous action spaces. The project is distinguished by its single-file algorithm implementation approach, which encapsulates each algorithm in a standalone script to eliminate complex class hierarchies. This structure is paired with a system for scheduling and executing large-s
Stable-Baselines3 is a reinforcement learning library built on the PyTorch deep learning framework. It provides a collection of reliable, standardized implementations of reinforcement learning algorithms designed for training, testing, and benchmarking agent policies in diverse simulated environments. The library functions as an agent training toolkit that emphasizes modularity and reproducibility. It features a unified environment interface and supports vectorized execution to accelerate data collection across multiple simulation instances. Users can customize neural network architectures, f
RLCard is an open-source framework for developing and evaluating reinforcement learning agents across multiple card game environments. It functions as a card game environment simulator, a multi-agent RL platform, and a benchmarking toolkit for algorithms like DQN, NFSP, and CFR. The framework provides a game-agnostic environment interface that decouples agent logic from game mechanics, allowing any policy to interact through a common API. It supports pluggable reinforcement learning algorithms that operate on this interface without modifying game logic, and includes a self-play training loop
Skorch is a library that wraps PyTorch neural networks in a scikit-learn compatible interface, allowing deep learning models to be used within standard machine learning pipelines and hyperparameter optimization tools. It functions as a data adapter, training manager, and optimization tool that bridges the gap between deep learning modules and conventional machine learning workflows. The project distinguishes itself by providing a toolkit for automating the PyTorch training lifecycle, including integrated checkpointing, early stopping, and learning rate scheduling. It further enables transfer
This project is a fine-tuning framework and training pipeline designed to optimize and adapt large language and vision models. It provides a specialized toolkit for parameter-efficient tuning and supervised learning, serving as both a trainer for multimodal models and a deployment tool for serving fine-tuned models via high-performance inference engines. The framework focuses on reducing memory and compute requirements by updating a small subset of model parameters. It supports a wide range of adaptation strategies, including vision-language model training to align text, image, video, and aud
keras-rl is a reinforcement learning library that enables the training of neural agents using Keras. It serves as a framework for implementing deep reinforcement learning agents that interact with simulated environments to discover optimal behaviors and maximize cumulative rewards. The library provides a system for configuring, training, and managing neural network agents. It handles the interaction loop between agents and environments, allowing models to learn through direct experience and gradient-based optimization. The framework includes capabilities for model weight management, allowing
This is a PyTorch-based toolkit for training reinforcement learning agents, providing implementations of standard and hierarchical deep RL algorithms. It is designed as a library for deep reinforcement learning research and experimentation, supporting both discrete and continuous control tasks through a collection of algorithm implementations. The project distinguishes itself by offering a hierarchical reinforcement learning framework that decomposes complex long-horizon tasks into manageable sub-goals using meta-controllers and lower-level policies. It also includes a Hindsight Experience Re
rllm is an asynchronous reinforcement learning framework for training language agents. It provides a unified pipeline that runs the same agent code for both evaluation and training, automatically capturing traces for gradient computation. The framework supports distributed reinforcement learning across multiple GPUs and nodes using pluggable backends, and executes agents in isolated sandboxes—either locally or in the cloud—for safe and scalable rollout collection. It trains agents built with LangGraph, SmolAgents, OpenAI Agents SDK, or custom frameworks without requiring core logic changes. T
This project is a comprehensive deep reinforcement learning course and training platform. It provides a structured educational curriculum that combines theoretical lessons with hands-on tutorials to teach the implementation of neural networks and agent behavior. The platform integrates a model sharing hub where users can upload, download, and version trained machine learning models. It also features a benchmarking system that uses leaderboards to evaluate and compare agent performance against community standards. The educational experience is delivered through interactive notebooks and inclu
MMF is a modular framework for building, training, and evaluating vision-and-language models. It provides a configuration-driven experiment system where model, dataset, and training parameters are defined through composable YAML files, alongside a curated model zoo of pretrained checkpoints for state-of-the-art multimodal architectures. The framework includes a multimodal dataset loader that downloads, processes, and batches vision-and-language data, and a vision-language model trainer supporting distributed training, mixed precision, and checkpoint-based resumption. The framework distinguish
This project is a game AI training framework designed to develop and monitor reinforcement learning agents within a legacy game environment. It functions as a training and monitoring system that optimizes autonomous agents to complete game objectives through exploration and reward-based learning. The framework includes tools for game memory mapping and real-time trajectory visualization. These capabilities translate raw game memory addresses into visual coordinates, allowing agent movements and session data to be streamed to a map for the analysis of navigation patterns and area exploration.
This project is a comprehensive collection of educational examples and reference implementations for building vision and language models using PyTorch. It serves as a deep learning tutorial covering the end-to-end process of developing neural networks, from initial architecture definition to final production deployment. The repository provides detailed guides on implementing a wide range of domain-specific models, including convolutional neural networks for object detection and segmentation, as well as transformer and recurrent architectures for natural language processing. It emphasizes gene
This project is a standardized machine learning experiment boilerplate and project template that combines PyTorch Lightning with the Hydra configuration framework. It provides a structured codebase for organizing deep learning workflows, specifically designed to integrate hierarchical configuration management with distributed training. The template features a specialized workflow for hyperparameter optimization and batch experiment execution, allowing for automated parameter sweeps without modifying source code. It employs a hierarchical system for managing settings via YAML files and command
Youtu Agent is an open-source framework for building, running, and evaluating autonomous agents powered by large language models. It provides the core infrastructure for creating agents that follow reasoning loops, use toolkits, and coordinate with other agents to solve complex tasks, all managed through YAML-driven configuration files. The framework distinguishes itself through its support for multi-agent orchestration, where a planner agent decomposes tasks and coordinates specialized worker agents, and through its integration with the Model Context Protocol for connecting to external toolk
This project is an LLM research orchestrator and autonomous AI agent framework designed to automate the scientific lifecycle. It functions as an end-to-end research pipeline and model training toolkit, managing everything from initial literature reviews and hypothesis testing to the final drafting of academic papers. The system is distinguished by its ability to convert unstructured academic PDFs into machine-executable knowledge layers, allowing agents to reproduce and extend research findings. It employs a two-loop orchestration architecture and a specialized research engineering skill libr
This repository is a collection of Jupyter notebooks providing reference implementations and templates for building, training, and deploying machine learning models using Amazon SageMaker. It serves as an example library for implementing model architectures and automating the machine learning lifecycle. The library provides practical patterns for machine learning training, data engineering, and model deployment. It includes implementation guides for MLOps, including workflows for model monitoring, lineage tracking, and hyperparameter tuning. The examples cover a broad range of capabilities i
This project is a structured learning curriculum and technical reference for mastering deep learning with TensorFlow. It provides a comprehensive guide for building, training, and deploying neural networks, combining theoretical fundamentals with practical implementation examples. The repository distinguishes itself by covering the end-to-end machine learning workflow, from low-level tensor mathematics and linear algebra to the creation of complex model architectures. It includes specific guidance on developing data pipelines for diverse data types, such as images, text, and time-series seque
This project is an educational resource designed to teach the mathematical foundations and core algorithms of reinforcement learning. It provides a structured academic curriculum that combines textbooks, lecture materials, and practical code examples to guide learners through the principles of Markov decision processes and reinforcement learning theory. The repository distinguishes itself by integrating a grid-based simulation framework that allows users to test algorithms within custom environments. This environment supports the analysis of agent performance by rendering state values, polici
LLaMA-Factory is a comprehensive suite for dataset preparation, model fine-tuning, memory optimization, and standardized API deployment. It provides a unified platform for the supervised and reward-based fine-tuning of large language models and vision-language models. The framework includes a specialized toolkit for training vision-language models and a model serving interface that deploys trained models through high-performance APIs. It utilizes precision tuning and quantization techniques to reduce the hardware requirements and memory footprint of large models. The system covers data pipel
PyTorch Lightning is a high-level deep learning framework for PyTorch that automates training loops and removes repetitive engineering boilerplate. It functions as a structured pipeline for managing machine learning experiments, providing a distributed training orchestrator and tools for mixed-precision training. The framework decouples scientific model architecture from the engineering required for infrastructure and scaling. This separation allows the same model code to execute across CPUs, GPUs, or TPUs through a hardware-agnostic execution engine and a centralized trainer that manages the
This is an interactive notebook-based course that teaches machine learning from Python fundamentals through deep learning and natural language processing. It uses real datasets and multiple frameworks within a structured, hands-on curriculum that combines concise explanations with executable code cells, built-in datasets, and embedded exercise checkpoints. Learning progresses through data preparation and exploration, classical machine learning workflows, computer vision with convolutional neural networks, and natural language processing with deep learning, all delivered as a cohesive progressi
gpt-neox is a distributed training system and framework for building large-scale autoregressive language models. It implements the transformer architecture and provides a toolkit for training models with billions of parameters by distributing weights across compute clusters. The framework distinguishes itself through extensive support for distributed model parallelism, including pipeline and sequence parallelism, to overcome single-device memory limits. It further supports sparse model architectures using a mixture of experts system with Sinkhorn-based routing. The project covers a broad ran
Ultralytics is a comprehensive computer vision framework designed for training, validating, and deploying deep learning models across a wide range of visual recognition tasks. It provides a unified interface for core operations including object detection, instance segmentation, pose estimation, and image classification. By utilizing a modular architecture, the platform allows users to swap model components to balance inference speed and accuracy requirements for diverse applications. The framework distinguishes itself through its support for real-time processing and flexible deployment. It in
Easy-RL is an educational resource designed to teach the principles and implementation of reinforcement learning. It provides a structured curriculum that guides users from fundamental concepts to advanced algorithmic techniques, focusing on the development and training of autonomous agents that learn through interaction with simulated environments. The project distinguishes itself through a pedagogical framework that utilizes interactive notebooks to bridge the gap between theoretical research and functional code. By organizing complex methods into modular units, it allows for the study of i
This project is an educational repository of reinforcement learning agents and tutorials implemented using TensorFlow. It provides a practical codebase for both model-free and model-based learning agents, designed to demonstrate how AI agents learn through trial and error. The collection features detailed implementations of various algorithmic approaches, including Deep Q-Networks and Policy Gradient methods. It specifically covers Actor-Critic architectures for continuous and discrete action spaces, alongside Proximal Policy Optimization and Deep Deterministic Policy Gradients. The framewor
RF-DETR is a Python library for training and deploying object detection, instance segmentation, and keypoint detection models built on a vision transformer architecture. It provides a unified command-line interface and Python API for the full workflow, from fine-tuning pretrained checkpoints on custom datasets to running inference on images, video files, and live camera streams. The project supports training on datasets in COCO or YOLO format, with automatic format detection and configurable augmentation pipelines. Models can be exported to ONNX, TFLite, or TensorRT for deployment across edge
Minigo is a TensorFlow-based reinforcement learning engine designed to master the game of Go. It functions as a comprehensive system for training neural networks to predict board policies and game outcomes, utilizing a model trainer to generate self-play data and optimize weights. The project is distinguished by its ability to perform large-scale game simulations using Kubernetes to distribute worker nodes across CPU, GPU, and TPU hardware. It employs a Monte Carlo Tree Search implementation to identify optimal moves and supports specialized hardware acceleration, including inference on Edge
This project is a comprehensive collection of practical code examples and implementation libraries for machine learning. It provides a wide array of reference materials for building supervised, unsupervised, and reinforcement learning algorithms. The repository serves as a multi-domain resource, featuring specific implementation suites for financial AI, Bayesian statistical modeling, and deep learning architectures. It includes a framework for training intelligent agents using policy gradients and actor-critic models, as well as practical guides for fine-tuning transformers and utilizing larg