# yandexdataschool/practical_rl

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/yandexdataschool-practical-rl).**

6,522 stars · 1,807 forks · Jupyter Notebook · Unlicense

## Links

- GitHub: https://github.com/yandexdataschool/Practical_RL
- awesome-repositories: https://awesome-repositories.com/repository/yandexdataschool-practical-rl.md

## Topics

`course-materials` `deep-learning` `deep-reinforcement-learning` `git-course` `hacktoberfest` `keras` `mooc` `pytorch` `pytorch-tutorials` `reinforcement-learning` `tensorflow`

## Description

Practical_RL is a course for learning to design and implement agents that solve complex decision processes. It provides a structured study program that covers reinforcement learning from basic trial-and-error behavior through advanced deep reinforcement learning.

The project includes specialized guides and frameworks for imitation learning based on expert demonstrations, model-based reinforcement learning using planners, and the training of recurrent neural networks to solve partially observed environments.

The materials cover the implementation of value-based and policy-gradient algorithms, the design of exploration strategies, and the deep learning foundations needed for continuous state spaces.
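To make the value-based and exploration topics concrete, here is a minimal sketch of one tabular Q-learning update combined with epsilon-greedy action selection. It is not taken from the course notebooks; the function names `epsilon_greedy` and `q_learning_update`, the list-of-lists Q-table layout, and the default `alpha` and `gamma` values are all assumptions chosen for illustration.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest estimated action value
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_learning_update(q_table, state, action, reward, next_state, done,
                      alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q_table is a hypothetical list of per-state lists of action values.
    """
    # Bootstrapped target: reward plus discounted best next-state value
    best_next = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]


if __name__ == "__main__":
    q = [[0.0, 0.0], [1.0, 2.0]]
    action = epsilon_greedy(q[0], epsilon=0.1)
    new_value = q_learning_update(q, state=0, action=action, reward=1.0,
                                  next_state=1, done=False)
    print(action, new_value)
```

Setting `epsilon` to 0 makes the policy purely greedy, while larger values trade estimated return for more exploration.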

The content is delivered primarily through Jupyter Notebooks.

## Tags

### Artificial Intelligence & ML

- [Deep Reinforcement Learning Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-q-learning-implementations/deep-reinforcement-learning-implementations.md) — Provides functional implementations of advanced reinforcement learning agents using deep neural networks.
- [Reinforcement Learning Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/reinforcement-learning-implementations.md) — Implements a wide range of reinforcement learning agents, including value-based, model-free, and policy-gradient algorithms. ([source](https://github.com/yandexdataschool/practical_rl#readme))
- [Experience Replay Buffers](https://awesome-repositories.com/f/artificial-intelligence-ml/experience-replay-buffers.md) — Implements memory structures that store agent transitions to break temporal correlations during neural network training.
- [Expert Imitation Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/expert-imitation-learning.md) — Implements frameworks for training agents to mimic expert demonstrations using inverse reinforcement learning and behavior cloning. ([source](https://github.com/yandexdataschool/practical_rl#readme))
- [Policy and Value Function Approximators](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/neural-network-layers/policy-and-value-function-approximators.md) — Constructs neural network architectures specifically to estimate values and action probabilities in reinforcement learning.
- [Policy Gradient Methods](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/machine-learning-training/utilities/gradient-optimization-techniques/policy-gradient-methods.md) — Provides gradient-based architectures for updating policy parameters in discrete and continuous action spaces.
- [Partially Observable MDP Solvers](https://awesome-repositories.com/f/artificial-intelligence-ml/partially-observable-mdp-solvers.md) — Provides architectures using recurrent neural networks to solve decision processes in environments with hidden state information. ([source](https://github.com/yandexdataschool/practical_rl#readme))
- [Behavioral Agent Training Environments](https://awesome-repositories.com/f/artificial-intelligence-ml/agent-training-environment-platforms/multi-agent-training-environments/behavioral-agent-training-environments.md) — Provides a training environment for solving complex tasks through trial and error across robotics and finance domains. ([source](https://github.com/yandexdataschool/Practical_RL/wiki/Practical-RL))
- [Imitation Learning Guides](https://awesome-repositories.com/f/artificial-intelligence-ml/expert-imitation-learning/imitation-learning-guides.md) — Provides materials for developing behavioral policies based on expert demonstrations and inverse reinforcement learning techniques.
- [Exploration Strategies](https://awesome-repositories.com/f/artificial-intelligence-ml/exploration-strategies.md) — Provides tools and implementations for balancing exploration and exploitation using methods like Thompson Sampling and Monte Carlo Tree Search. ([source](https://github.com/yandexdataschool/practical_rl#readme))
- [Model-Based Reinforcement Learning Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/model-based-reinforcement-learning-frameworks.md) — Provides a guide to designing exploration strategies and planners using Monte Carlo Tree Search and Thompson Sampling.
- [Monte Carlo Tree Search](https://awesome-repositories.com/f/artificial-intelligence-ml/monte-carlo-tree-search.md) — Implements a search algorithm that uses random sampling of the game tree to determine optimal moves.
- [Recurrent Neural Network Training](https://awesome-repositories.com/f/artificial-intelligence-ml/recurrent-neural-network-training.md) — Provides frameworks for building and training recurrent networks efficiently with parallelized computation.
- [Memory Architectures for Partially Observable MDPs](https://awesome-repositories.com/f/artificial-intelligence-ml/recurrent-neural-networks/gated-recurrent-units/gated-linear-recurrent-layers/memory-architectures-for-partially-observable-mdps.md) — Integrates hidden state tracking via recurrent layers to handle partially observable environments and temporal dependencies.
- [Target Network Decoupling](https://awesome-repositories.com/f/artificial-intelligence-ml/target-network-decoupling.md) — Uses a separate, slowly updating set of weights to prevent divergence during value function approximation.
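Two of the tags above, experience replay buffers and target network decoupling, can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the repository's code: the `ReplayBuffer` class, the `soft_update` helper, the flat list of weights, and the mixing rate `tau` are assumptions. The course notebooks may instead copy weights to the target network periodically (a hard update) and use framework tensors.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full
        self._storage = deque(maxlen=capacity)

    def __len__(self):
        return len(self._storage)

    def add(self, state, action, reward, next_state, done):
        self._storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement breaks up the temporal
        # correlation between consecutive transitions
        return rng.sample(list(self._storage), batch_size)


def soft_update(target_weights, online_weights, tau):
    """Return target weights moved a fraction tau toward the online weights."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_weights, online_weights)]


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=3)
    for step in range(5):
        buffer.add(step, 0, 1.0, step + 1, False)
    print(len(buffer), buffer.sample(2))
    print(soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1))
```

A small `tau` keeps the target weights changing slowly, which is the property the target network tag describes. Setting `tau` to 1 reduces to a full copy of the online weights.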

### Part of an Awesome List

- [Deep Learning Foundations](https://awesome-repositories.com/f/awesome-lists/ai/deep-learning-foundations.md) — Provides core frameworks and educational resources for machine learning and neural networks. ([source](https://github.com/yandexdataschool/Practical_RL/wiki/Practical-RL))

### Education & Learning Resources

- [Reinforcement Learning Curricula](https://awesome-repositories.com/f/education-learning-resources/deep-learning-curriculum/reinforcement-learning-curricula.md) — Provides structured learning paths specifically for the study of reinforcement learning.
