Explore open-source implementations of storage engines, indexing structures, and core database management system architectures.
redb is an embedded key-value store and ACID-compliant storage engine. It functions as a persistent storage system for saving and retrieving data as key-value pairs within a tree structure. The engine is built as an MVCC transactional database, utilizing multi-version concurrency control to manage simultaneous reads and writes without blocking. It employs a single-writer multi-reader model to ensure data consistency while allowing multiple threads to access the store. The system provides persistent state management and atomic transaction management to prevent data corruption during crashes.
This is an embedded, ACID-compliant storage engine that provides a clear, modern implementation of B-tree structures, MVCC, and write-ahead logging, making it an excellent reference for studying database internals.
LevelDB is an embedded database library and persistent storage engine that provides a sorted key-value store. It uses a log-structured merge-tree architecture to map byte arrays to values, running directly within a process to provide storage without the need for a separate server process. The system is distinguished by its use of custom comparison functions to define key ordering, enabling efficient range scans and sequenced lookups. It ensures data reliability through atomic batch execution, consistent snapshot generation, and log-based recovery after failures. The engine covers broad capab
LevelDB is a foundational, industry-standard reference implementation of an LSM-tree storage engine that provides clear, low-level examples of write-ahead logging, atomic batching, and persistent storage mechanics.
PostgreSQL is an object-relational database management system designed for the persistent storage and retrieval of structured information. It functions as an ACID-compliant database server, utilizing standard query language protocols to maintain data consistency and reliability across large-scale application datasets. The system distinguishes itself through an extensible architecture that allows for the definition of custom data types, operators, and indexing methods. It employs multi-version concurrency control to enable simultaneous read and write operations without blocking, supported by a
PostgreSQL is a production-grade, open-source database engine that serves as the definitive reference implementation for core storage mechanics like B-tree indexing, write-ahead logging, and multi-version concurrency control.
Bolt is a single-file embedded key-value store for Go applications. It is an ACID transactional database that organizes data in B+trees on disk to provide efficient sorted key retrieval and range scans. The system uses a memory-mapped model to map the database file directly into the process address space for fast random-access reads. The project distinguishes itself through a multi-version concurrency control architecture that allows multiple simultaneous readers to access a consistent snapshot of data without blocking a writer. It employs a single-writer multi-reader locking model and uses a
Bolt is a production-grade embedded key-value store that serves as an excellent reference implementation for B+tree storage, ACID transactions, and memory-mapped file management.
LMDB is an embedded key-value storage engine that provides ACID-compliant data persistence. It is a memory-mapped database that utilizes B+ trees to store key-value pairs, ensuring atomicity, consistency, isolation, and durability. The engine maps files directly into the virtual address space to minimize data copying and system calls. This approach enables high-performance local caching and low-latency data access, specifically optimizing for read-heavy database workflows. The system implements a transactional model with copy-on-write versioning and single-writer multi-reader locking. These
LMDB is a high-performance, production-grade embedded storage engine that provides a concrete, low-level implementation of B+ trees, ACID transactions, and concurrency control, making it an excellent reference for studying database internals.
ToyDB is a distributed SQL database that provides a system for storing and querying data across multiple nodes. It focuses on maintaining strong consistency and fault tolerance through the implementation of a distributed consensus algorithm. The project distinguishes itself by supporting historical data versioning, enabling time-travel queries to retrieve the state of the database from a specific point in the past. It utilizes multi-version concurrency control to manage ACID transactions and ensure data integrity during concurrent operations. The system covers relational data modeling with t
ToyDB is a purpose-built educational database that implements core storage engine mechanics like LSM trees, write-ahead logging, and ACID-compliant transactions, making it an ideal reference for studying database internals.
LiteDB is a serverless, embedded NoSQL document database for .NET applications. It persists data into a single portable file, functioning as a BSON data store that resides within the application process rather than running as a separate server. The system is ACID compliant, utilizing write-ahead logging to ensure atomic, consistent, isolated, and durable transactions. It includes built-in encryption to provide secure local data storage and protect files on disk from unauthorized access. The project covers object-document mapping to convert classes into document formats, indexed search capabi
LiteDB is a fully functional embedded NoSQL database that provides a practical reference for B-tree indexing, ACID compliance, and write-ahead logging, making it a useful resource for studying the mechanics of a production-grade storage engine.
RocksDB is a high-performance, embeddable persistent key-value library and storage engine based on Log-Structured Merge-trees. It is designed to provide durable storage for large-scale datasets, integrating directly into applications to manage data on flash and RAM-based hardware. The engine is distinguished by its focus on minimizing read and write amplification through multi-threaded compaction and custom memory allocators. It features specialized optimizations for flash storage, including support for zoned block devices, and provides the ability to extend store behavior via external plugin
RocksDB is a production-grade, embeddable storage engine that serves as a definitive reference implementation for LSM-tree architecture, write-ahead logging, and ACID-compliant transactional operations.
MiniOB is an open-source educational relational database kernel designed for learning the internals of database systems. It implements a dual-engine storage architecture combining B+ Tree and LSM-Tree, supports SQL parsing and query execution, and provides transactional processing with multi-version concurrency control. The system communicates with clients using the MySQL wire protocol and includes a vector database extension for storing and querying high-dimensional vectors. The project distinguishes itself through its comprehensive coverage of core database concepts in a single, learnable c
MiniOB is a purpose-built educational database kernel that provides a comprehensive reference implementation of B+ trees, LSM trees, write-ahead logging, and ACID-compliant transaction management within a single, accessible codebase.
Sled is an embedded key-value store and ACID-compliant database designed for high-performance data persistence. It functions as a log-structured storage engine that organizes data using B+ trees to support efficient range queries and prefix scans. The engine implements a zero-copy data store model, utilizing epoch-based reclamation to provide direct references to cached values without memory allocations. It distinguishes itself through a combination of write-ahead logging, page cache optimizations to reduce write amplification on flash storage, and serializable transactions for atomic multi-k
Sled is a high-performance, ACID-compliant embedded storage engine that provides a concrete, production-grade implementation of B+ trees, write-ahead logging, and concurrency control mechanisms useful for studying database internals.
Pebble is an embedded key-value storage engine written in Go, designed as a library that provides durable, write-optimized data persistence directly within applications. It organizes data using a log-structured merge-tree (LSM-tree) structure, where writes are first buffered in an in-memory skiplist memtable and persisted to a write-ahead log before being flushed to block-based SSTable files on disk. The engine supports atomic batch commits, configurable write synchronization, and automatic background compaction that merges and rewrites sorted runs to reclaim space and maintain read performanc
Pebble is a production-grade LSM-tree storage engine that provides a clear, well-documented reference for low-level database mechanics like write-ahead logging, SSTable management, and compaction strategies.
SlateDB is a cloud-native key-value store and distributed database engine that utilizes a log-structured merge-tree architecture. It serves as a transactional storage layer designed to persist data directly to cloud object storage. The engine differentiates itself by optimizing read performance for remote storage through the use of bloom filters and multi-level block caching. It employs a single-writer multi-reader model and provides the ability to create zero-copy clones via copy-on-write checkpointing. The system supports atomic transactions, range queries, and snapshot-based concurrency c
This is a functional LSM-tree storage engine written in Rust that provides a clear, modern implementation of transactional storage mechanics, making it a valuable reference for studying cloud-native database internals.
Badger is an embeddable key-value store written in Go that provides persistent data storage for byte keys and values. It is a persistent database that utilizes a tiered LSM tree storage model to optimize disk storage and retrieval efficiency. The system features an ACID transaction engine that ensures data integrity through serializable snapshot isolation and multi-version concurrency control. It also provides an encrypted key-value store with data-at-rest encryption and a managed encrypted key registry to secure stored information. The engine covers a broad set of capabilities including hig
Badger is a production-grade, embeddable LSM-tree storage engine that provides a clear, high-performance reference for implementing write-ahead logging, ACID transactions, and concurrency control in Go.
This project is an open source relational database management system and SQL database designed for storing and managing structured data. It functions as a relational database for ensuring consistency and reliability, while also operating as a vector database for storing and querying high-dimensional vector embeddings. The system incorporates a columnar storage engine to optimize analytical query processing and large-scale data aggregation. It further enables vector similarity search, allowing users to find similar items by querying vector embeddings. The software covers a broad capability su
This is a full-scale production relational database management system that includes complex storage engine implementations like InnoDB and columnar engines, providing a comprehensive, albeit highly advanced, reference for database internals and storage mechanics.
etcd is a distributed, strongly consistent key-value store designed to provide reliable storage for critical system metadata and coordination primitives. It functions as a distributed consensus engine, utilizing a replicated log and leader-based state machine to ensure that all nodes in a cluster maintain a synchronized view of data. By providing atomic operations and linearizable reads and writes, it serves as a foundational component for distributed systems requiring high availability and fault tolerance. The system distinguishes itself through its multi-version concurrency control, which e
While this is a production-grade distributed key-value store rather than a pedagogical project, it serves as a high-quality reference implementation for B-tree storage, write-ahead logging, and concurrency control in a distributed context.
bbolt is an ACID-compliant embedded key-value store for Go applications. It persists all data in a single memory-mapped file on disk, organizing information using B+ trees to facilitate sorted key iteration and efficient range queries. The project distinguishes itself through a hierarchical data organization model, allowing buckets to be nested within other buckets to create a tree-like structure. It employs a single-writer, multi-reader locking mechanism and copy-on-write transactions to ensure serializable isolation and data integrity. The system includes comprehensive data management capa
This is a production-grade embedded key-value store that serves as an excellent reference implementation for B+ tree architecture, ACID-compliant transactions, and low-level disk-based storage mechanics.
Helix DB is a distributed graph database and knowledge graph platform that persists nodes and edges on object storage for durable and unlimited scaling. It operates as an ACID-compliant system, ensuring data consistency through serializable snapshot isolation during concurrent operations. The project distinguishes itself by combining a vector search engine and a property graph, utilizing hybrid vector and full-text search to locate entry points for graph traversals. It enables dynamic graph querying through a domain-specific language, allowing complex logic and recursive queries to be execute
Helix DB is a distributed graph database that implements core storage engine concepts like ACID compliance and query execution, serving as a practical example of a modern, cloud-native database architecture.
RethinkDB is a distributed, document-oriented database designed to store and manage JSON-formatted data across scalable clusters. It utilizes a custom log-structured storage engine with B-Tree indexing to ensure high-performance disk I/O and data persistence. The system maintains high availability through automatic sharding and replication, employing a primary-replica voting consensus mechanism to handle node failures and ensure consistent cluster operations. A defining characteristic of the platform is its reactive changefeed engine, which allows applications to subscribe to live data update
RethinkDB is a full-scale distributed document database that provides a practical, production-grade implementation of B-tree indexing and log-structured storage, serving as a robust reference for how these components function in a real-world system.
Redis is an in-memory, key-value database designed to provide sub-millisecond latency for read and write operations. It functions as a versatile data platform, serving as a distributed cache, a message broker, a NoSQL document store, and a vector database. The system utilizes an event-driven, single-threaded loop to process requests efficiently, while maintaining data durability through append-only persistence logs and asynchronous snapshotting mechanisms. What distinguishes Redis is its ability to handle complex data structures—including strings, hashes, lists, sets, and sorted sets—alongsid
Redis is a production-grade, in-memory key-value store that serves as a valuable reference for understanding event-driven architectures, append-only persistence logs, and efficient data structure implementation in C.
ClickHouse is a high-performance, columnar analytical database designed for real-time query execution and large-scale data aggregation. It functions as a distributed data warehouse capable of processing petabytes of information, while also providing an embedded engine that integrates directly into applications for native query capabilities without external dependencies. The system is built to handle high-throughput ingestion and complex analytical workloads, delivering millisecond-level latency for interactive dashboards and operational monitoring. The platform distinguishes itself through ad
ClickHouse is a production-grade, high-performance analytical database that serves as a sophisticated reference for advanced storage techniques like merge trees and vectorized execution, though it is a complex industrial system rather than a simplified pedagogical implementation.
FoundationDB is an ACID-compliant distributed transactional key-value store. It functions as a scalable database engine that ensures strict serializability and data consistency across a cluster of servers using a shared-nothing architecture. The system is distinguished by its multi-region replication capabilities, allowing data to be synchronized across different datacenters for high availability and disaster recovery. It utilizes optimistic concurrency control to manage distributed transactions and employs a majority-based coordination system to maintain cluster state. The platform provides
FoundationDB is a production-grade distributed transactional key-value store that serves as a sophisticated reference for understanding distributed ACID compliance, optimistic concurrency control, and cluster-wide coordination.
SurrealDB is a multi-model database engine designed to store and query document, graph, relational, and vector data within a single ACID-compliant platform. It functions as an AI-native data store, integrating vector search, graph traversal, and machine learning model execution directly into its query layer. By providing a unified declarative query language, the platform eliminates the need for external middleware to synchronize data across different storage models. The platform distinguishes itself through its ability to manage agent memory and complex workflows natively. It allows developer
SurrealDB is a full-featured, production-ready multi-model database rather than a pedagogical reference implementation, but its extensive documentation and architectural transparency make it a valuable resource for studying modern database internals.
Cockroach is a distributed SQL database designed to scale horizontally across multiple nodes while maintaining strict ACID compliance and global data consistency. It functions as a relational database engine that automatically partitions data into ranges, rebalancing them across a cluster to accommodate growing storage and throughput requirements. By utilizing a distributed consensus protocol, the system ensures that all nodes agree on the order of operations, providing fault tolerance and continuous availability even in the event of hardware failures. The system distinguishes itself through
This is a production-grade distributed SQL database that serves as a sophisticated reference for complex storage engine concepts like multi-version concurrency control, distributed consensus, and layered storage architectures, though it is a full-scale system rather than a simplified pedagogical implementation.
TiDB is a horizontally scalable, distributed SQL database designed to provide consistent transactional storage and high-performance analytical processing within a single unified architecture. It utilizes a decoupled compute-storage design and a distributed key-value storage layer to ensure horizontal scalability and efficient range-based queries. By employing a consensus-based replication algorithm, the system maintains high availability and automatic failover across multiple nodes and geographical regions. The platform distinguishes itself through its hybrid transactional and analytical proc
TiDB is a production-grade distributed SQL database that provides a complex, real-world implementation of storage engine concepts like distributed key-value layers and transactional consistency, though it is a full-scale infrastructure platform rather than a simplified pedagogical reference.
TiKV is a distributed transactional key-value store designed for horizontal scalability and high availability. It functions as a storage engine that maintains massive datasets across a cluster of physical nodes, ensuring that information remains accessible and consistent even when individual hardware components fail. The system utilizes a consensus-based replication model to synchronize data across nodes, ensuring that all replicas agree on the order of operations. It manages data distribution through a sharding mechanism that partitions large datasets into smaller groups, each governed by in
TiKV is a production-grade distributed storage engine that provides a deep, real-world implementation of complex database internals like Raft consensus, multi-version concurrency control, and transactional storage, making it an excellent reference for advanced database architecture.
DuckDB is an in-process analytical database engine designed to run directly within an application process. As a zero-dependency, embedded system, it provides enterprise-grade SQL data processing capabilities without the overhead of managing a dedicated database server. It is built to handle complex analytical and aggregation tasks by storing and retrieving information in columns, allowing for high-performance relational data manipulation. The engine distinguishes itself through a columnar vectorized execution model that maximizes CPU cache efficiency during query operations. It employs adapti
DuckDB is a high-performance, in-process analytical database engine that serves as an excellent reference for modern columnar storage, vectorized execution, and query optimization techniques, even though it focuses on OLAP rather than traditional B-tree or LSM-based transactional storage.
Valkey is an in-memory, NoSQL database server designed for high-performance data storage and real-time state management. It operates as a distributed key-value store, maintaining datasets entirely within system memory to facilitate sub-millisecond response times for read and write operations. The system distinguishes itself through a single-threaded event loop that utilizes asynchronous I/O multiplexing to ensure high throughput. It supports high availability via master-replica replication and provides a decoupled communication model through a built-in publish-subscribe messaging pattern. To
Valkey is a high-performance, in-memory key-value store that provides a practical, production-grade reference for understanding event-loop architectures, asynchronous I/O, and snapshot-based persistence mechanisms.
Neon is a serverless PostgreSQL database platform designed with a decoupled storage and compute architecture. It functions as a multi-tenant system that isolates data and compute resources for independent users on shared cloud infrastructure, utilizing a specialized PostgreSQL storage engine. The platform features a database branching system that allows for the creation of isolated, instant copies of a database for testing and development. It further distinguishes itself with an HTTP-based SQL gateway, enabling the execution of queries via HTTP requests and JSON responses without the need for
Neon is a sophisticated, production-grade storage engine for PostgreSQL that demonstrates advanced concepts like log-structured storage and decoupled compute-storage architecture, serving as a high-level reference for modern cloud-native database internals.
rqlite is a distributed relational database that replicates SQLite data across a cluster using the Raft consensus algorithm. It functions as a fault-tolerant storage system that provides high availability and a web API for executing SQL queries and managing relational data without requiring native database drivers. The system distinguishes itself by using an HTTP SQL interface to expose database operations and cluster management. It features a real-time change data capture stream that pushes database mutations to external HTTP endpoints via webhooks and supports the scaling of read throughput
This project serves as a practical reference for implementing distributed consensus and replication on top of an existing storage engine, though it focuses more on cluster coordination than the low-level mechanics of building a storage engine from scratch.
TDengine is a distributed time-series database designed for the high-speed ingestion, compression, and retrieval of timestamped metrics and sensor data. It functions as a SQL-compatible analytics engine, allowing users to perform complex operations on massive volumes of time-ordered information using standard relational syntax. The platform is built to serve as a backend foundation for industrial IoT environments, managing real-time data streams and device metadata through a cluster-based architecture. The system distinguishes itself through a distributed sharding architecture that uses consi
This is a production-grade distributed time-series database rather than a pedagogical reference implementation, but its architecture provides a concrete, high-performance example of log-structured storage and distributed data management for those studying database internals.
Dgraph is a distributed graph database designed to store and query highly connected data. It organizes information as nodes and edges to represent complex relationships between entities, providing a platform for managing and analyzing deeply linked datasets. The system functions as a horizontally scalable cluster that partitions data across multiple nodes to maintain performance and availability as information volume increases. It utilizes a specialized query language built for low-latency navigation of interconnected data points, allowing for the execution of complex queries across large-sca
Dgraph is a production-grade distributed graph database that provides a complex, real-world implementation of distributed storage, consensus, and transaction processing, serving as a sophisticated reference for how large-scale database engines manage state and concurrency.