30 open-source projects similar to questdb/questdb, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Questdb alternative.
VictoriaMetrics is a high-performance, scalable time series database and observability platform designed for long-term storage and analysis of metric, log, and trace data. It functions as a unified backend for monitoring ecosystems, offering full compatibility with industry-standard protocols and query languages. The system is built to handle massive data volumes through a distributed architecture that supports horizontal scaling and efficient data lifecycle management. The platform distinguishes itself through a storage engine that utilizes consistent hashing for data sharding and log-struct
GreptimeDB is a distributed, open-source time-series database built for unified observability. It stores and queries metrics, logs, and traces together in a single columnar engine, supporting both SQL and PromQL for analysis. The database is designed as a Kubernetes-native operator with a decoupled compute and storage architecture, enabling horizontal scaling and multi-region deployment. What distinguishes GreptimeDB is its role as a multi-protocol ingestion gateway, accepting data through OpenTelemetry, Prometheus Remote Write, InfluxDB, Loki, Elasticsearch, Kafka, and MQTT protocols without
InfluxDB is a high-performance time-series database designed for collecting, storing, and querying time-stamped metrics and event data. It functions as a columnar time-series store and a real-time analytics engine, providing a network-accessible interface for retrieving and analyzing temporal records. The system utilizes a specialized columnar storage format to support high ingestion rates and efficient data retrieval. It incorporates a programmable runtime for executing custom plugins and triggers, including integration for processing and transforming incoming data streams. The platform cov
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
TDengine is a distributed time-series database designed for the high-speed ingestion, compression, and retrieval of timestamped metrics and sensor data. It functions as a SQL-compatible analytics engine, allowing users to perform complex operations on massive volumes of time-ordered information using standard relational syntax. The platform is built to serve as a backend foundation for industrial IoT environments, managing real-time data streams and device metadata through a cluster-based architecture. The system distinguishes itself through a distributed sharding architecture that uses consi
TiDB is a horizontally scalable, distributed SQL database designed to provide consistent transactional storage and high-performance analytical processing within a single unified architecture. It utilizes a decoupled compute-storage design and a distributed key-value storage layer to ensure horizontal scalability and efficient range-based queries. By employing a consensus-based replication algorithm, the system maintains high availability and automatic failover across multiple nodes and geographical regions. The platform distinguishes itself through its hybrid transactional and analytical proc
Redis is a high-performance in-memory key-value store that functions as a distributed cache, message broker, and NoSQL database. It provides sub-millisecond read and write access to data stored in RAM and can operate as a vector database for indexing high-dimensional embeddings. The system supports a wide range of data storage and synchronization primitives, including the management of strings, hashes, lists, sets, and JSON documents. It enables real-time data operations through atomic transactions, hybrid persistence using snapshots and append-only logs, and high-availability configurations
Doris is a distributed SQL data warehouse designed for high-performance analytical workloads and real-time data processing. It functions as a unified platform that integrates traditional relational warehousing with lakehouse query capabilities, allowing users to execute analytical operations directly against external data lakes without requiring data migration. The system distinguishes itself through a shared-nothing, massively parallel processing architecture that utilizes vectorized query execution and columnar storage to maintain sub-second latency. It supports dynamic schema evolution, en
This project is a feature-rich Go client library designed for interacting with Redis. It serves as a comprehensive interface for managing remote data stores, enabling developers to execute standard database commands, handle complex data structures, and perform asynchronous operations within Go applications. The library distinguishes itself through its support for advanced Redis capabilities, including connection pooling, pipelining, and transactional integrity. It provides specialized primitives for managing distributed clusters, including automated topology updates and request routing to sha
InfluxDB is a specialized time series database platform engineered for the high-speed ingestion, compression, and retrieval of timestamped data at scale. It functions as a distributed metrics platform, providing the infrastructure necessary to organize and analyze massive volumes of time-stamped information to identify trends, patterns, and anomalies within complex data streams. The platform distinguishes itself through a functional dataflow engine that utilizes a specialized programming language for complex analytical transformations and automated tasks. This architecture is supported by a p
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
Elasticsearch is a distributed search engine and document store designed for the high-performance indexing and retrieval of massive volumes of unstructured data. It functions as a centralized analytics platform, providing a schema-flexible architecture that organizes information into searchable indices while maintaining global cluster state through a distributed consensus mechanism. The platform distinguishes itself through its integrated approach to observability, security, and advanced analytics. It combines full-text, vector, and hybrid search capabilities with machine learning-driven insi
Apache IoTDB is a time-series database designed for the Internet of Things, purpose-built to ingest high-volume data from millions of low-power devices and store timestamp-value pairs with configurable data types and encoding schemes. It organizes time series data and device metadata in a tree-like hierarchy, enabling efficient management of complex industrial sensor networks. The database supports rich querying capabilities, including time-aligned data retrieval across multiple devices, time-based aggregation like downsampling, and frequency-domain signal analysis. It provides high-throughpu
ClickHouse is a high-performance, columnar analytical database designed for real-time query execution and large-scale data aggregation. It functions as a distributed data warehouse capable of processing petabytes of information, while also providing an embedded engine that integrates directly into applications for native query capabilities without external dependencies. The system is built to handle high-throughput ingestion and complex analytical workloads, delivering millisecond-level latency for interactive dashboards and operational monitoring. The platform distinguishes itself through ad
RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats. The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independen
OpenTSDB is a distributed time series database and metrics engine designed for storing and managing massive volumes of high-cardinality system metrics. It functions as a data store and analytics platform that enables large-scale metric ingestion and infrastructure performance monitoring across a distributed cluster. The system distinguishes itself through a distributed storage abstraction that supports multiple backends such as HBase, Cassandra, and Google Bigtable. It utilizes a hierarchical metric tree to organize time series and employs numeric identifier indexing to reduce storage footpri
Rerun is a multimodal data visualizer and robotics data logger designed for rendering synchronized streams of 3D spatial data, images, and time-series metrics. It functions as a tool for capturing high-frequency sensor data and AI outputs into a queryable columnar format, providing a dedicated interface for viewing MCAP recording files and analyzing physical environments. The project distinguishes itself as a machine learning dataset streamer, capable of feeding logged recordings directly into GPU buffers and PyTorch training pipelines without intermediate exports. It supports a high-performa
OpenObserve is a unified observability data platform designed to ingest, store, and analyze logs, metrics, and traces. It functions as a cloud-native monitoring tool that centralizes telemetry from diverse sources, including standard collectors and cloud service providers, into a single, scalable system. By utilizing a columnar storage engine backed by object storage, the platform enables efficient long-term data retention and high-performance analytical querying. The platform distinguishes itself through deep integration with artificial intelligence, allowing users to query data using natura
Badger is an embeddable key-value store written in Go that provides persistent data storage for byte keys and values. It is a persistent database that utilizes a tiered LSM tree storage model to optimize disk storage and retrieval efficiency. The system features an ACID transaction engine that ensures data integrity through serializable snapshot isolation and multi-version concurrency control. It also provides an encrypted key-value store with data-at-rest encryption and a managed encrypted key registry to secure stored information. The engine covers a broad set of capabilities including hig
This project is a comprehensive Java backend engineering guide and technical reference focused on high-concurrency design, distributed systems, and microservices architecture. It provides detailed strategies for decomposing monolithic applications, managing service discovery, and implementing the architectural patterns required for scalable backend environments. The repository distinguishes itself through an extensive collection of big data algorithmic references and database scaling strategies. It covers memory-efficient techniques for analyzing massive datasets, such as Top-K element extrac
Prometheus is a comprehensive monitoring and alerting platform designed to track infrastructure health and application performance. It functions as a time series database that ingests, indexes, and queries high-frequency numerical data points. By utilizing a pull-based model, the system periodically collects multi-dimensional metrics from monitored targets, storing them in an optimized block storage format that supports high-throughput ingestion and efficient historical analysis. The platform distinguishes itself through a specialized query engine that enables real-time analysis of performanc
immudb is a tamperproof database that maintains an immutable record of entries using cryptographic commit logging. It ensures verifiable database integrity by utilizing Merkle trees to generate membership and consistency proofs that detect unauthorized data alterations. The system employs a multi-model storage engine that unifies key-value, document, and relational data structures within a single immutable backend. It provides compatibility with the PostgreSQL wire protocol, allowing it to integrate with standard SQL clients, ORMs, and database tools. The project covers broad capabilities in
The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for the automated management of complex cloud environments through a centralized, code-driven control plane. The framework distinguishes itself through its ability to model infrastructure as a dependency-aware resource graph, ensuring that components are provisioned and updated in the correct order. It
m3 is a distributed time series database designed for high-resolution metrics and high-cardinality data management. It functions as a scalable storage system and a multi-cluster query engine, providing a distributed metrics aggregator capable of downsampling and summarizing data before it is committed to storage. The project distinguishes itself through a coordinated cluster model using etcd for node membership and shard placement. It supports multiple ingestion protocols, including the Prometheus remote write protocol, InfluxDB line protocol, and Graphite Carbon plaintext protocol, and provi
Druid is a distributed columnar store and online analytical processing database designed for real-time analytics. It functions as a SQL analytics platform and a streaming data ingestion engine, allowing for the analysis of large datasets with low latency to support interactive dashboards and high-concurrency operational workloads. The system integrates a streaming data ingestion engine that loads information via batch or streaming processes to enable immediate analysis of arriving data. It provides high-performance analytical processing to execute slice-and-dice queries on massive data volume
Apache Pulsar is a cloud-native distributed pub-sub messaging system designed for high-performance data ingestion. It functions as a geo-replicated data streamer and a multi-tenant event streaming platform, providing a serverless stream processing engine and a tiered storage messaging broker. The system distinguishes itself by separating serving layers from storage layers to allow independent scaling of compute and data retention. It features native geo-replication to synchronize messages across different geographical regions and employs a multi-layered tenant isolation model using authentica
This project is a reference library of architectural blueprints, study materials, and design patterns for building scalable, high-availability distributed systems. It serves as a technical guide for scalability engineering, providing structural solutions for common engineering challenges. The repository focuses on distributed systems design, covering essential patterns for data replication, consensus algorithms, and transaction management. It distinguishes itself by offering detailed blueprints for specialized domains, including real-time data streaming, large-scale data storage, and high-ava
Thanos is a CNCF cloud native monitoring tool that provides a highly available and scalable extension to the Prometheus ecosystem. It functions as a global query engine, a long-term storage system, and a metric downsampler. The project enables a unified interface to aggregate and query metrics across multiple distributed clusters from a single view. It maintains historical data beyond local retention limits by persisting time-series metrics in object storage and eliminates data gaps by merging metrics from redundant server pairs. The system includes capabilities for reducing the resolution o
ParadeDB is a database extension that integrates full-text search, vector database capabilities, and real-time analytics directly into a relational engine. It functions as a plugin that adds new storage and query execution capabilities to an existing database architecture. The project distinguishes itself by supporting hybrid search workflows that combine lexical keyword matching with dense and sparse vector similarity in a single query. It utilizes reciprocal rank fusion to merge these ranked result sets and employs logical replication to synchronize data from external instances, removing th
Databend is a cloud-native data warehouse and OLAP database designed for large-scale analytics. It functions as a SQL-compliant engine and serverless analytics platform that separates compute from storage to allow for independent scaling. The system integrates vector database capabilities, indexing high-dimensional embeddings to enable semantic, hybrid, and full-text searches across massive datasets. It further distinguishes itself through serverless compute management that automatically scales resources based on demand and shuts them down during idle periods. The platform covers a broad set