Mctx

Mctx is a JAX library for running batched Monte Carlo tree search and generating policy targets for neural network training. The search runs as compiled code and can be driven by a learned dynamics model that predicts state transitions and rewards from learned representations.

The project implements a vectorised tree search that runs independent searches in parallel across a batch of inputs. Search results are converted into action weights, which serve as targets for training and refining neural network policies.
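To make the idea of action weights concrete, here is a minimal sketch in plain JAX that normalises root visit counts into a probability vector for each search in a batch. It assumes the common visit-count scheme from MuZero-style search; the function name, the temperature parameter, and the example counts are hypothetical, and the library's own target construction may differ.

```python
import jax
import jax.numpy as jnp


def action_weights_from_visits(visit_counts, temperature=1.0):
    """Normalise per-action root visit counts into a probability vector.

    Hypothetical helper for illustration, not part of the mctx API.
    """
    # Raising counts to 1/temperature sharpens (T < 1) or flattens (T > 1) the target.
    scaled = visit_counts.astype(jnp.float32) ** (1.0 / temperature)
    # Normalise over the action axis so each row sums to 1.
    return scaled / jnp.sum(scaled, axis=-1, keepdims=True)


# Made-up visit counts for a batch of 2 searches over 4 actions.
visit_counts = jnp.array([[10, 30, 40, 20], [0, 5, 5, 0]])
weights = jax.jit(action_weights_from_visits)(visit_counts)
```

Each row of `weights` is a distribution over actions and can be used directly as a policy training target.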

The library supports reinforcement learning workflows that combine a learned environment model with a model-based policy. Search outcomes are distilled into training targets for the policy network.
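One common way to distill search outcomes into a policy network is a cross-entropy loss between the search-derived action weights and the network's predicted distribution. The sketch below shows that loss in plain JAX. The function name and the example arrays are assumptions made for illustration; this is not taken from the library.

```python
import jax
import jax.numpy as jnp


def distillation_loss(policy_logits, action_weights):
    """Cross-entropy between search action weights and the policy's softmax."""
    log_probs = jax.nn.log_softmax(policy_logits, axis=-1)
    # Sum over actions, then average over the batch.
    return -jnp.mean(jnp.sum(action_weights * log_probs, axis=-1))


# Illustrative inputs: uniform policy logits and made-up search targets.
policy_logits = jnp.zeros((2, 4))
action_weights = jnp.array([[0.1, 0.3, 0.4, 0.2], [0.0, 0.5, 0.5, 0.0]])

# The gradient with respect to the logits pushes the policy toward the targets.
loss, grads = jax.value_and_grad(distillation_loss)(policy_logits, action_weights)
```

In a real training loop the gradient would be taken with respect to the network parameters that produce `policy_logits`, with the action weights treated as constants.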

Features

Neural Dynamics Models - Integrates learned representation and transition functions to predict future states and rewards within a search tree.

Reinforcement Learning Policy Improvement - Creates action weights from search results to iteratively refine agent strategies and policies.

Batched State Transitions - Implements batch-parallel state transitions to update multiple simulated environments using a single compiled function call.

Compiled Search Engines - Provides a high-performance engine that runs parallel search operations using compiled functions for efficient batch processing.

Policy Distillation - Generates action weights from search results to perform policy distillation for neural network training.

Monte Carlo Tree Search - Executes high-performance Monte Carlo Tree Search to select promising actions through simulation.

Neural Dynamics Simulators - Predicts state transitions and rewards using learned representations to guide decision making.

Model-Based Policy Implementations - Combines representation, dynamics, and prediction functions to propose actions and generate training targets.

Policy Target Generators - Converts search results into action weights used for training and refining neural network policies.

Reinforcement Learning - Integrates tree search and neural predictions within a reinforcement learning workflow.

Differentiable Observation Mappings - Provides differentiable mapping of raw environment observations into a latent space for dynamics processing.

Tree Search Frameworks - Offers a framework for executing high-performance tree search and state simulations to generate policy targets.

Vectorised Tree Search - Performs multiple independent search simulations in parallel by leveraging tensor operations instead of scalar loops.

Simulated Environments - Predicts future states and rewards using learned dynamics functions to guide decision making in simulated environments.

Batched Environment Simulators - Simulates neural environments using batched predictors for state transitions, rewards, and values.

High-Performance and Parallel Computing - Uses JAX and XLA to compile search algorithms and run them in parallel across input batches.

Neural Network Training - Refines neural network policy performance by using search results as high-quality training targets.
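Several of the features above rely on advancing many simulated environments with one compiled call. The following is a minimal sketch of that pattern in plain JAX, using `jax.vmap` to batch a single-environment step and `jax.jit` to compile it. The toy dynamics function, parameter names, and sizes are all hypothetical; they stand in for whatever learned model a user would supply and are not the library's interface.

```python
import jax
import jax.numpy as jnp


def dynamics_step(params, state, action):
    """Toy learned dynamics for ONE environment: returns (next_state, reward)."""
    num_actions = params["w_action"].shape[0]
    action_one_hot = jax.nn.one_hot(action, num_actions)
    next_state = jnp.tanh(
        state @ params["w_state"] + action_one_hot @ params["w_action"]
    )
    reward = next_state @ params["w_reward"]
    return next_state, reward


# Share params across the batch (None); map over states and actions (axis 0).
batched_step = jax.jit(jax.vmap(dynamics_step, in_axes=(None, 0, 0)))

state_dim, num_actions, batch_size = 8, 4, 16
k1, k2, k3, k4 = jax.random.split(jax.random.PRNGKey(0), 4)
params = {
    "w_state": jax.random.normal(k1, (state_dim, state_dim)) * 0.1,
    "w_action": jax.random.normal(k2, (num_actions, state_dim)) * 0.1,
    "w_reward": jax.random.normal(k3, (state_dim,)) * 0.1,
}
states = jax.random.normal(k4, (batch_size, state_dim))
actions = jnp.arange(batch_size) % num_actions

# One compiled call advances all 16 simulated environments.
next_states, rewards = batched_step(params, states, actions)
```

Writing the step for a single environment and batching it with `vmap` keeps the model code simple while still letting XLA fuse the whole batch into tensor operations.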

google-deepmind/mctx
