rlm

rlm is an LLM code execution engine and orchestration framework designed to coordinate multiple language model calls and recursive sub-tasks through a programmable environment. It provides a sandboxed REPL environment and a recursive context processor to handle inputs that exceed standard token limits by programmatically decomposing prompts.
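
The decomposition idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not rlm's actual API: `recursive_answer`, `fake_llm`, and `max_chars` are hypothetical names, the stub stands in for a real model call, and splitting at the character midpoint is a simplification (a real system would let the model choose the partition).

```python
from typing import Callable


def recursive_answer(prompt: str, llm: Callable[[str], str], max_chars: int = 200) -> str:
    """Answer `prompt`, splitting it in half recursively while it is too long."""
    if len(prompt) <= max_chars:
        return llm(prompt)  # base case: the prompt fits in a single call
    mid = len(prompt) // 2
    left = recursive_answer(prompt[:mid], llm, max_chars)
    right = recursive_answer(prompt[mid:], llm, max_chars)
    # Merge step: one more call over the two partial answers.
    # Assumes partial answers are short enough to fit in one call.
    return llm(left + "\n" + right)


# Hypothetical stub model: records each prompt and returns a short "summary".
calls: list[str] = []


def fake_llm(text: str) -> str:
    calls.append(text)
    return f"summary({len(text)})"


result = recursive_answer("x" * 1000, fake_llm, max_chars=200)
```

With a 1000-character input and a 200-character limit, the stub is called on eight 125-character leaves and seven merge prompts, and no single call ever sees more than 200 characters.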

The project differentiates itself through a reinforcement learning training harness that teaches models to use recursive calls and code execution. It also includes a reasoning visualization system that records and renders execution trajectories, so that users can analyze how models decompose and solve complex tasks.

The system supports secure code execution via pluggable backends, including cloud virtual machines, isolated containers, and local processes. It manages state across multiple turns using a REPL-based environment and allows for the injection of custom tools and external functions into the execution flow.
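
As a rough illustration of REPL state and tool injection, the sketch below keeps one namespace alive across turns and seeds it with a callable. `MiniRepl`, `word_count`, and the `context` variable are hypothetical names chosen for this example, not rlm's interface, and plain `exec` provides no sandboxing, so this corresponds only to the trusted, in-process case.

```python
import contextlib
import io


class MiniRepl:
    """Toy REPL: one shared namespace that survives across run() calls."""

    def __init__(self, tools=None):
        # Injected tools become ordinary names visible to executed code.
        self.namespace = dict(tools or {})

    def run(self, code: str) -> str:
        """Execute `code` in the shared namespace and return captured stdout."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, self.namespace)  # no isolation: trusted code only
        return buffer.getvalue()


def word_count(text: str) -> int:
    """Example tool made available to the executed code."""
    return len(text.split())


repl = MiniRepl(tools={"word_count": word_count})
repl.namespace["context"] = "the quick brown fox"  # prompt stored as a variable
repl.run("n = word_count(context)")                # turn 1: compute and keep n
output = repl.run("print(n * 2)")                  # turn 2: n is still defined
```

The second call succeeds only because `n` persisted from the first, which is the property the multi-turn environment relies on.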

The framework is implemented in Python.

Features

Code Execution Engines - Provides a dedicated engine for managing the lifecycle and execution of code snippets generated by language models.

LLM Orchestrators - Orchestrates the workflow between various language models and external tools, including interactive IPython shells.

Model Backend Registrations - Configures a list of distinct language models, each assigned to specific tasks within a single automated workflow.

Provider-Agnostic Model Interfaces - Standardizes connections to various language model backends using a unified, provider-agnostic API and configuration.

Recursive Language Models - Implements recursive execution to explore and process large input contexts that exceed standard token limits.

Recursive Task Decomposers - Handles inputs far longer than a model's context window by programmatically decomposing prompts and issuing recursive model calls.

Reinforcement Learning Training Pipelines - Provides a reinforcement learning training harness that records execution trajectories to teach models recursive task decomposition.

Code Execution Environments - Runs model-generated code in isolated sandboxes or containers to safely perform computations and data processing.

Sandboxed Code Execution Environments - Executes generated code within a sandboxed REPL environment to manipulate data and maintain state across operations.

Code Sandboxing Environments - Provides isolated environments using containers and virtual machines to securely execute untrusted model-generated code.

Execution Backends - Implements pluggable execution backends that switch between local processes, containers, and cloud virtual machines based on security needs.
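
One way such pluggable backends could be structured is a shared interface plus a name-to-class registry. This is a sketch under assumptions: `ExecutionBackend`, `InProcessBackend`, `SubprocessBackend`, `BACKENDS`, and `make_backend` are hypothetical names, and container or cloud VM backends are omitted (they would implement the same `run` method).

```python
import contextlib
import io
import subprocess
import sys
from typing import Protocol


class ExecutionBackend(Protocol):
    """Hypothetical interface: every backend turns code into captured stdout."""

    def run(self, code: str) -> str: ...


class InProcessBackend:
    """Low overhead, no isolation: suitable only for trusted code."""

    def run(self, code: str) -> str:
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(code, {})
        return buffer.getvalue()


class SubprocessBackend:
    """Separate interpreter process: keeps the host namespace out of reach."""

    def run(self, code: str) -> str:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=10,
        )
        return proc.stdout


# Registry keyed by a configuration string (names are illustrative).
BACKENDS: dict[str, type] = {"local": InProcessBackend, "subprocess": SubprocessBackend}


def make_backend(name: str) -> ExecutionBackend:
    """Look up and instantiate a backend, rejecting unknown names."""
    try:
        return BACKENDS[name]()
    except KeyError:
        raise ValueError(f"unknown backend: {name!r}") from None
```

Callers depend only on `run`, so switching from `make_backend("local")` to `make_backend("subprocess")` changes the isolation level without touching the orchestration code.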

Reasoning Chain Visualizers - Records and renders execution trajectories to visualize how models decompose and solve complex tasks.

Code Execution Environments - Uses a programmable REPL environment to read, transform, and partition large context variables via executed code.

Long Context Processing - Divides massive input contexts into smaller subsets and maps recursive calls across them for processing.

Tool-Use Integrations - Allows injecting external functions and data as callable tools for the language model to utilize during execution.

Execution State Persistence - Stores prompts and intermediate results as variables in a REPL to allow data modification across multiple turns.

Container Execution - Provides a secure, reproducible environment with a private filesystem by running code within isolated containers.

Custom Container Images - Uses custom container images to define specific language versions and pre-installed libraries for reproducible code execution.

Code Execution Sandboxes - Provides low-overhead execution for trusted tasks by running code within the host process using a restricted namespace.

Execution Sandboxes - Runs untrusted code in ephemeral cloud virtual machines to secure the host system.

REPL Workspace Management - Maintains a persistent REPL workspace where variables and intermediate results survive across multiple model interaction turns.

Process Isolation - Ensures full namespace isolation by running generated code in separate subprocess kernels.

Subprocess-Based Isolation - Runs generated code in separate subprocesses to ensure namespace isolation and strictly enforce execution timeouts.
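
A minimal sketch of subprocess isolation with a hard timeout, using only the standard library. `run_isolated` and its result dictionary are hypothetical, and the sketch starts a fresh interpreter per snippet rather than the long-lived kernels a real backend might keep.

```python
import subprocess
import sys


def run_isolated(code: str, timeout_s: float = 5.0) -> dict:
    """Run `code` in a fresh Python process; the child is killed on timeout."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,  # collect stdout and stderr
            text=True,            # decode bytes to str
            timeout=timeout_s,    # raises TimeoutExpired after killing the child
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "error": f"timed out after {timeout_s}s"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "error": proc.stderr}


fast = run_isolated("print('hi')")
stuck = run_isolated("while True: pass", timeout_s=0.5)
leaked = run_isolated("print(sys.argv)")  # host imports are not visible in the child
```

The last call fails with a `NameError` inside the child because the parent's `sys` import does not exist there, which demonstrates the namespace isolation.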

Custom Execution Environment Definitions - Allows the definition of custom sandbox templates via Dockerfiles to support specialized language versions and dependencies.

Agent Trajectory Logs - Records prompts, generated code, and execution results into structured files to analyze model task decomposition.

Trajectory Visualizers - Renders recorded execution trajectories into a visual interface to inspect the flow of code and model interactions.
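
To make the trajectory features concrete, here is a small sketch that appends steps to a JSON Lines file and renders them as indented text. The field names (`depth`, `prompt`, `code`, `result`), the file name, and the functions `log_step`, `load_trajectory`, and `render_trajectory` are assumptions for illustration; the project's actual log schema and visual interface are not described here.

```python
import json
import tempfile
from pathlib import Path


def log_step(path: Path, step: dict) -> None:
    """Append one step as a single JSON line."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(step) + "\n")


def load_trajectory(path: Path) -> list[dict]:
    """Read all recorded steps back in order."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def render_trajectory(steps: list[dict]) -> str:
    """Render steps as text, indenting by recursion depth."""
    lines = []
    for i, step in enumerate(steps):
        indent = "  " * step["depth"]
        lines.append(f"{indent}[{i}] depth={step['depth']} prompt={step['prompt']!r}")
        lines.append(f"{indent}    code:   {step['code']}")
        lines.append(f"{indent}    result: {step['result']}")
    return "\n".join(lines)


with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "trajectory.jsonl"
    log_step(log_path, {"depth": 0, "prompt": "count words", "code": "parts = split(context)", "result": "2 parts"})
    log_step(log_path, {"depth": 1, "prompt": "count part 0", "code": "len(parts[0].split())", "result": "57"})
    steps = load_trajectory(log_path)
    rendered = render_trajectory(steps)
```

An append-only, line-per-step file keeps logging cheap during a run and lets a separate viewer read the trajectory afterwards.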

alexzhang13rlm
