30 open-source projects similar to openzipkin/zipkin, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Zipkin alternative.
Uptrace is an OpenTelemetry-based observability platform designed to collect, store, and analyze distributed traces, metrics, and logs. It functions as a centralized logging backend, a distributed tracing system, and a metrics engine to monitor application performance and system health. The platform is distinguished by AI-powered operational capabilities, allowing users to query telemetry data and manage monitoring dashboards using natural language. It specifically includes specialized monitoring for generative AI pipelines, tracking token usage and response quality for LLM interactions and r
SkyWalking is a comprehensive observability stack and application performance monitoring platform. It functions as a distributed tracing system and an AI application monitor, providing a centralized suite for collecting and analyzing logs, metrics, and traces to maintain the health of containerized architectures. The platform distinguishes itself through a service topology visualizer that renders interactive maps of infrastructure dependencies and communication patterns. It also includes specialized capabilities for generative AI workflow observation to track the execution flow and performanc
SigNoz is a full-stack observability platform designed to collect, store, and visualize metrics, logs, and distributed traces in a unified environment. It leverages OpenTelemetry-based data collection to ingest telemetry from diverse sources using vendor-neutral protocols, ensuring interoperability across complex microservices architectures. The platform utilizes a high-performance columnar storage engine to enable rapid aggregation and filtering, providing a centralized backend for monitoring application health and performance. What distinguishes the platform is its focus on automated instru
OpenObserve is a unified observability data platform designed to ingest, store, and analyze logs, metrics, and traces. It functions as a cloud-native monitoring tool that centralizes telemetry from diverse sources, including standard collectors and cloud service providers, into a single, scalable system. By utilizing a columnar storage engine backed by object storage, the platform enables efficient long-term data retention and high-performance analytical querying. The platform distinguishes itself through deep integration with artificial intelligence, allowing users to query data using natura
VictoriaMetrics is a high-performance, scalable time series database and observability platform designed for long-term storage and analysis of metric, log, and trace data. It functions as a unified backend for monitoring ecosystems, offering full compatibility with industry-standard protocols and query languages. The system is built to handle massive data volumes through a distributed architecture that supports horizontal scaling and efficient data lifecycle management. The platform distinguishes itself through a storage engine that utilizes consistent hashing for data sharding and log-struct
The OpenTelemetry .NET SDK is a set of libraries used to generate and export traces, metrics, and logs from .NET applications. It functions as an application performance monitoring tool and a distributed tracing implementation, providing the necessary infrastructure to capture system metrics and request paths across microservices. The project includes a zero-code instrumentation library that automatically captures telemetry from popular .NET frameworks without requiring manual changes to source code. It uses a provider-based API abstraction to decouple instrumentation from specific backend im
This project is a comprehensive software observability suite and application performance monitoring platform designed to track runtime errors, performance bottlenecks, and system health. It functions as a centralized diagnostic service that aggregates and categorizes exceptions, providing the infrastructure necessary to visualize complex execution paths across distributed systems and microservices. The platform distinguishes itself through a high-throughput distributed event ingestion pipeline and a columnar storage analytics engine that enables rapid aggregation of large-scale performance me
Jaeger is a distributed tracing platform used for collecting, storing, and visualizing request flows across microservices. It identifies performance bottlenecks and errors by tracking requests as they move through multiple service boundaries. The system includes telemetry collectors, a multi-tenant backend, and a trace visualizer. The platform provides a multi-tenant tracing infrastructure that isolates data and queries by tenant to support shared environments. It supports standardized telemetry ingestion via the OpenTelemetry Protocol over gRPC and HTTP. To manage storage costs and overhead,
SkyWalking is an application performance monitoring system and observability platform designed to collect and analyze metrics, traces, and logs from distributed microservices. It functions as a distributed tracing platform and a telemetry data pipeline that ingests and aggregates observability data from various language agents. The project features an AI-powered anomaly detector that uses machine learning to calculate metric baselines and identify irregular URI patterns. It includes an eBPF performance profiler for diagnosing CPU and network bottlenecks at the kernel level and generates inter
Pinpoint is a distributed application performance monitoring and tracing system. It functions as an application performance monitor and topology visualizer designed to analyze the execution behavior of large-scale distributed applications. The system uses bytecode instrumentation to monitor applications without requiring changes to the original source code. It captures call stacks and request flows across interconnected services to visualize system dependencies and generate real-time architectural maps of communication patterns. The platform covers a broad range of observability capabilities
Pinpoint is a distributed application performance management tool designed to trace requests and monitor metrics across large-scale distributed architectures. It functions as a request tracer, topology mapper, and JVM application monitor, providing a backend capable of collecting and visualizing trace data from OpenTelemetry compatible sources. The system distinguishes itself through a combination of bytecode-based instrumentation via a Java agent and topology-based visualization that renders live maps of service interconnections. It captures execution flow across asynchronous boundaries, suc
OpenTelemetry Go is a framework for generating and collecting distributed traces, metrics, and logs from Go applications. It provides a standardized telemetry instrumentation API for adding observability markers to code and a corresponding SDK for processing and emitting these signals. The project utilizes a configurable observability pipeline to sample and export telemetry data to external backends using the OTLP wire protocol. It features a pluggable export system and a separation between the public API and the SDK implementation, allowing telemetry to be routed to third-party platforms wit
FastMCP is a Python framework designed for building servers that expose functions, resources, and prompts to AI models using the Model Context Protocol. It simplifies the development process by automatically deriving tool metadata, input schemas, and documentation directly from Python function signatures and type hints. The framework provides a unified container for managing these components, allowing developers to build modular applications that integrate seamlessly with AI assistants. The project distinguishes itself through its support for interactive, server-defined user interface compone
Perfetto is a platform for system-level performance tracing and analysis on Linux and Android. It combines a high-throughput trace recorder, a SQL-based query engine, and a browser-based visualizer into a single toolchain. The platform covers CPU scheduling and call-stack profiling, native and Java heap memory allocation tracking, GPU and graphics events, and system-wide counters such as CPU frequency and power consumption. The architecture decouples trace recording from offline analysis, using a compact protobuf format for event encoding and columnar storage for efficient SQL queries. The we
Vector is a high-performance observability data pipeline designed to collect, transform, and route logs, metrics, and traces across distributed infrastructure. It functions as a modular engine that decouples data ingestion from processing and transmission, utilizing a component-based architecture to connect diverse sources to multiple destinations. The project distinguishes itself through a focus on reliability and flow control. It implements backpressure-aware data movement to prevent data loss during traffic spikes and utilizes disk-backed event buffering to ensure durability during network
Mastra is an orchestration framework designed for building, deploying, and managing autonomous AI agents and multi-agent systems. It provides a comprehensive suite of primitives for creating resilient AI applications, including durable workflow orchestration, event-driven agent loops, and semantic memory management. By integrating these core components, the platform enables developers to build complex, multi-step processes that can reason about goals and execute tasks without manual intervention. The framework distinguishes itself through its focus on observability and secure, isolated execut
Quarkus is a Kubernetes-native Java framework designed for building high-performance, memory-efficient applications. It utilizes ahead-of-time native compilation to transform Java code into standalone, optimized binaries that eliminate the need for a virtual machine, enabling rapid startup and reduced memory consumption. By performing code augmentation during the build phase, it shifts heavy processing tasks away from runtime, ensuring that applications are optimized for cloud-native environments. The framework distinguishes itself through a unified approach to reactive and imperative program
Kibana is a browser-based data exploration and visualization platform designed for interacting with information stored in distributed search engines. It serves as a centralized interface for analyzing structured and unstructured data, enabling users to build custom dashboards, generate interactive charts, and map complex datasets to uncover trends and actionable insights. Beyond visualization, the platform functions as a comprehensive management console for infrastructure operations. It provides tools for configuring security policies, managing data indices, and monitoring system health. The
This project is a reference implementation of microservices architectures using the Spring Cloud ecosystem. It provides a set of example services that demonstrate the construction of distributed systems through automated service discovery, dynamic routing, and shared configuration. The implementation covers core distributed patterns, including a service discovery system for tracking network components and an API gateway for orchestrating incoming traffic. It features a centralized configuration manager to propagate application settings across multiple instances and a distributed tracing syste
This project is an application performance monitoring tool and JVM metrics library designed to measure workload behavior and export performance data to external monitoring databases. It serves as an instrumentation toolkit for tracking resource usage and internal runtime behavior within a Java execution environment. The system focuses on application performance measurement and JVM application monitoring, specifically tracking system health and runtime resource analysis to identify bottlenecks and stability issues. It provides a mechanism for external metrics export, sending captured data to t
Encore is a distributed systems framework designed to unify backend development, infrastructure provisioning, and observability. It functions as an infrastructure-as-code platform that allows developers to define cloud resources, databases, and messaging topics directly within their application code. By analyzing these declarations at compile-time, the system automatically manages the deployment of cloud resources and security policies, ensuring parity between local development and production environments. The platform distinguishes itself through its integrated development experience, which
This project is an OpenTelemetry reference implementation and distributed microservices environment used to demonstrate the collection and export of traces, metrics, and logs. It serves as a telemetry pipeline showcase and a polyglot instrumentation example, providing a sandbox for practicing distributed tracing and monitoring within a Kubernetes cluster. The system features a polyglot architecture to demonstrate consistent, vendor-neutral telemetry implementation across multiple programming languages. It includes a simulated environment for testing telemetry interoperability and troubleshoot
GreptimeDB is a distributed, open-source time-series database built for unified observability. It stores and queries metrics, logs, and traces together in a single columnar engine, supporting both SQL and PromQL for analysis. The database is designed as a Kubernetes-native operator with a decoupled compute and storage architecture, enabling horizontal scaling and multi-region deployment. What distinguishes GreptimeDB is its role as a multi-protocol ingestion gateway, accepting data through OpenTelemetry, Prometheus Remote Write, InfluxDB, Loki, Elasticsearch, Kafka, and MQTT protocols without
Cat is a distributed application performance monitoring tool and tracing framework designed to track transactions, latency, and health across distributed services. It functions as a Kubernetes-native monitoring stack that utilizes multi-language monitoring clients and a real-time alerting system to maintain system visibility. The system provides monitoring clients for Java, Go, Python, Node.js, and C++ to collect performance metrics and trace data. It distinguishes itself by sampling request flows to record call chains and identify bottlenecks, while using a monitoring engine to trigger immed
Grafana Tempo is a high-scale distributed tracing backend and columnar trace database. It serves as an observability data store that persists and queries spans and traces using OpenTelemetry standards, allowing for the analysis of request flows across microservices. The system distinguishes itself by using an object-store based backend with columnar Parquet storage. This architecture enables efficient attribute searching and large-scale data retrieval through dedicated attribute columnization and block-based data partitioning. It includes a specialized TraceQL query engine for filtering trace
This project is a service mesh platform designed to manage, secure, and observe service-to-service communication within Kubernetes clusters. It functions as a control plane that orchestrates transparent sidecar proxies, which intercept and manage network traffic to provide reliable connectivity for microservices. By automating the injection of these proxies, the platform ensures that infrastructure-level policies are applied consistently across all workloads without requiring manual configuration changes. The platform distinguishes itself through its focus on zero-trust security and cross-clu
Dapr is a distributed application runtime that provides a sidecar-based infrastructure layer for building resilient microservices and event-driven applications. By utilizing a sidecar proxy pattern, it abstracts complex infrastructure tasks into standardized, network-accessible APIs, allowing developers to focus on application logic while the runtime handles service discovery, state management, and secure communication. The platform distinguishes itself through a pluggable component architecture and language-agnostic design, enabling services written in any programming language to interact wi
mcp-agent is a framework for building AI agents that integrate with Model Context Protocol servers to execute tools and access data. It functions as a multi-agent orchestrator and protocol-compliant server, enabling the creation of agents that can discover and invoke tools from connected external servers. The project distinguishes itself through a durable workflow engine that supports long-running tasks capable of pausing, resuming, and surviving restarts. It implements complex orchestration patterns, including iterative evaluator-optimizer loops, hierarchical workflow nesting, and specialist
This project is a structured tracing framework for Rust that serves as an async-aware instrumentation library and telemetry data collector. It provides a structured logging facade and the tools necessary to record, filter, and route event-based diagnostic data from both standard applications and embedded systems. The framework distinguishes itself through a core implementation that supports bare-metal and no-standard-library environments without requiring a dynamic memory allocator. It specifically handles the complexities of asynchronous workflows by propagating diagnostic contexts across fu