Lmnr is an LLM observability platform and evaluation framework designed for tracing, logging, and monitoring language model executions. It provides the tools necessary to debug agent behavior, analyze performance, and identify failure patterns in AI agents.
The platform differentiates itself through a trace-to-dataset pipeline that converts production logs into labeled test sets for regression testing. It includes a prompt-variant replay engine to compare different prompts or models side-by-side and a state-cached debugging system to replay agent loops without restarting the process.
The system covers a broad range of capabilities, including event analysis via natural language extraction, SQL-based observability storage, and the creation of time-synchronized dashboards. It also manages AI datasets with versioning and annotation, provides real-time alerting through external integrations, and supports PII data redaction for privacy compliance.
The software is available as a self-hosted observability stack that can be deployed using container orchestration and cloud provider images.