7 Repos
Database engines specifically tuned for analytical workloads.
Distinguishing note: Focuses on the analytical workload profile rather than general-purpose storage.
Explore 7 awesome GitHub repositories matching data & databases · Analytical Databases. Refine with filters or upvote what's useful.
DuckDB is an in-process analytical database engine designed to run directly within an application process. As a zero-dependency, embedded system, it provides enterprise-grade SQL data processing capabilities without the overhead of managing a dedicated database server. It is built to handle complex analytical and aggregation tasks by storing and retrieving information in columns, allowing for high-performance relational data manipulation. The engine distinguishes itself through a columnar vectorized execution model that maximizes CPU cache efficiency during query operations. It employs adapti
Designed for rapid analytical querying of large datasets within an application process.
Dokploy is a self-hosted platform-as-a-service designed to simplify the deployment and management of containerized applications and databases. It provides a centralized control plane that decouples administrative management from application workloads, allowing users to oversee infrastructure across multiple server nodes through a unified web interface or a command-line tool. The platform distinguishes itself through an extensive library of pre-configured application templates, enabling the rapid deployment of databases, identity providers, and various productivity or development tools. It sup
Provides a high-performance database for real-time analytical queries.
SigNoz is a full-stack observability platform designed to collect, store, and visualize metrics, logs, and distributed traces in a unified environment. It leverages OpenTelemetry-based data collection to ingest telemetry from diverse sources using vendor-neutral protocols, ensuring interoperability across complex microservices architectures. The platform utilizes a high-performance columnar storage engine to enable rapid aggregation and filtering, providing a centralized backend for monitoring application health and performance. What distinguishes the platform is its focus on automated instru
High-performance analytical databases store telemetry data in columnar format to enable rapid aggregation and filtering across massive datasets.
Cube is a semantic data layer that provides a unified framework for defining business metrics, dimensions, and relationships across diverse data sources. By acting as a headless business intelligence engine, it transforms raw data into a governed model that can be queried via SQL, REST, and GraphQL interfaces. This architecture ensures consistent data definitions and logic across all downstream analytical applications and reporting tools. The platform distinguishes itself through its integrated conversational AI capabilities, which allow users to explore data using natural language. It orches
Integrates with various analytical database engines to enable semantic modeling and querying.
Druid is a distributed columnar store and online analytical processing database designed for real-time analytics. It functions as a SQL analytics platform and a streaming data ingestion engine, allowing for the analysis of large datasets with low latency to support interactive dashboards and high-concurrency operational workloads. The system integrates a streaming data ingestion engine that loads information via batch or streaming processes to enable immediate analysis of arriving data. It provides high-performance analytical processing to execute slice-and-dice queries on massive data volume
Provides an analytical database engine tuned for high-concurrency operational workloads and complex queries.
MySQL Server is a relational database management system designed to organize and store structured information. It functions as a comprehensive SQL server platform that provides reliable transactional integrity and high-performance query execution for enterprise data management. The system distinguishes itself through a pluggable storage engine architecture that decouples logical query processing from physical data storage, allowing for specialized handling of diverse workloads. It maintains data consistency and high concurrency through multi-version concurrency control and write-ahead logging
Integrates machine learning pipelines directly into the storage layer to perform model training and analysis on stored information.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Provides a horizontally scalable system that processes complex SQL queries across large-scale data segments with millisecond response times.