1 repo
Utilities for investigating production failures and identifying system bottlenecks.
Distinguishing note: Focuses on the diagnostic process for system outages.
Explore 1 awesome GitHub repository matching system administration & monitoring · Root Cause Analysis Tools. Refine with filters or upvote what's useful.
Loki is a horizontally scalable, highly available log aggregation engine designed to store and query massive volumes of unstructured log data. It functions as a distributed observability platform that correlates logs, metrics, and traces to provide comprehensive visibility into the health and performance of complex infrastructure. The system distinguishes itself through a distributed query execution model that processes large datasets in parallel across cluster nodes. It utilizes label-based stream indexing and a distributed index to map log data to specific chunks, enabling rapid retrieval w
Facilitates investigation of production failures by querying historical system data.