Database Internals and Storage Engines

Explore open-source implementations of storage engines, indexing structures, and core database management system architectures.

cberner/redb
4,248 stars
redb is an embedded key-value store and ACID-compliant storage engine. It functions as a persistent storage system for saving and retrieving data as key-value pairs within a tree structure. The engine is built as an MVCC transactional database, utilizing multi-version concurrency control to manage simultaneous reads and writes without blocking. It employs a single-writer multi-reader model to ensure data consistency while allowing multiple threads to access the store. The system provides persistent state management and atomic transaction management to prevent data corruption during crashes.
redb is an embedded, ACID-compliant storage engine with a clear, modern implementation of copy-on-write B-trees and MVCC, making it a good reference for studying database internals.
Rust | ACID Transactional Cores | ACID-Compliant | Multi-Version Concurrency Control

google/leveldb
39,152 stars
LevelDB is an embedded database library and persistent storage engine that provides a sorted key-value store. It uses a log-structured merge-tree architecture to map byte arrays to values, running directly within a process to provide storage without the need for a separate server process. The system is distinguished by its use of custom comparison functions to define key ordering, enabling efficient range scans and sequenced lookups. It ensures data reliability through atomic batch execution, consistent snapshot generation, and log-based recovery after failures. The engine covers broad capab
LevelDB is a foundational, industry-standard reference implementation of an LSM-tree storage engine that provides clear, low-level examples of write-ahead logging, atomic batching, and persistent storage mechanics.
C++ | Log-Structured Merge-Trees | Write-Ahead Logging

postgres/postgres
20,076 stars
PostgreSQL is an object-relational database management system designed for the persistent storage and retrieval of structured information. It functions as an ACID-compliant database server, utilizing standard query language protocols to maintain data consistency and reliability across large-scale application datasets. The system distinguishes itself through an extensible architecture that allows for the definition of custom data types, operators, and indexing methods. It employs multi-version concurrency control to enable simultaneous read and write operations without blocking, supported by a
PostgreSQL is a production-grade, open-source database engine that serves as the definitive reference implementation for core storage mechanics like B-tree indexing, write-ahead logging, and multi-version concurrency control.
C | ACID Transactional Cores | Write-Ahead Logging

boltdb/bolt
14,642 stars
Bolt is a single-file embedded key-value store for Go applications. It is an ACID transactional database that organizes data in B+trees on disk to provide efficient sorted key retrieval and range scans. The system uses a memory-mapped model to map the database file directly into the process address space for fast random-access reads. The project distinguishes itself through a multi-version concurrency control architecture that allows multiple simultaneous readers to access a consistent snapshot of data without blocking a writer. It employs a single-writer multi-reader locking model and uses a
Bolt is a production-grade embedded key-value store that serves as an excellent reference implementation for B+tree storage, ACID transactions, and memory-mapped file management.
Go | ACID Transactional Cores | B-Tree | Multi-Version Concurrency Control

LMDB/lmdb
2,907 stars
LMDB is an embedded key-value storage engine that provides ACID-compliant data persistence. It is a memory-mapped database that utilizes B+ trees to store key-value pairs, ensuring atomicity, consistency, isolation, and durability. The engine maps files directly into the virtual address space to minimize data copying and system calls. This approach enables high-performance local caching and low-latency data access, specifically optimizing for read-heavy database workflows. The system implements a transactional model with copy-on-write versioning and single-writer multi-reader locking. These
LMDB is a high-performance, production-grade embedded storage engine that provides a concrete, low-level implementation of B+ trees, ACID transactions, and concurrency control, making it an excellent reference for studying database internals.
C | ACID-Compliant | B-Tree | B+ Tree Indexing
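
Several of the engines above (redb, Bolt, LMDB) are described as combining copy-on-write versioning with a single-writer, multi-reader model. The following is a minimal Python sketch of that idea only; the `CowStore` class and its methods are hypothetical names for illustration and do not correspond to any of these libraries' APIs. It copies a whole dictionary on each commit, whereas a real engine would copy only the B+ tree pages on the path from the root to the modified leaf.

```python
import threading


class CowStore:
    """Toy copy-on-write store: one writer at a time, readers never block."""

    def __init__(self):
        self._root = {}                       # current committed version, never mutated in place
        self._writer_lock = threading.Lock()  # enforces the single-writer rule

    def snapshot(self):
        # A reader simply keeps a reference to the current root.
        # Later commits build a new root, so this view stays consistent.
        return self._root

    def write(self, updates):
        with self._writer_lock:
            new_root = dict(self._root)  # copy instead of modifying in place
            new_root.update(updates)
            self._root = new_root        # swapping the reference publishes the commit


store = CowStore()
store.write({"a": 1})
old = store.snapshot()           # reader pins the version containing a=1
store.write({"a": 2, "b": 3})    # writer commits a new version
print(old)                       # {'a': 1}
print(store.snapshot())          # {'a': 2, 'b': 3}
```

The reader that took `old` keeps seeing its original version after the second commit, which is the property the descriptions refer to as reading a consistent snapshot without blocking the writer.
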

erikgrinaker/toydb
7,251 stars
ToyDB is a distributed SQL database that provides a system for storing and querying data across multiple nodes. It focuses on maintaining strong consistency and fault tolerance through the implementation of a distributed consensus algorithm. The project distinguishes itself by supporting historical data versioning, enabling time-travel queries to retrieve the state of the database from a specific point in the past. It utilizes multi-version concurrency control to manage ACID transactions and ensure data integrity during concurrent operations. The system covers relational data modeling with t
ToyDB is a purpose-built educational database that implements core mechanics such as log-structured key-value storage, a replicated Raft log, and MVCC-based ACID transactions, making it a good reference for studying database internals.
Rust | ACID Transactional Cores | Multi-Version Concurrency Control

mbdavid/LiteDB
9,410 stars
LiteDB is a serverless, embedded NoSQL document database for .NET applications. It persists data into a single portable file, functioning as a BSON data store that resides within the application process rather than running as a separate server. The system is ACID compliant, utilizing write-ahead logging to ensure atomic, consistent, isolated, and durable transactions. It includes built-in encryption to provide secure local data storage and protect files on disk from unauthorized access. The project covers object-document mapping to convert classes into document formats, indexed search capabi
LiteDB is a fully functional embedded NoSQL database that provides a practical reference for B-tree indexing, ACID compliance, and write-ahead logging, making it a useful resource for studying the mechanics of a production-grade storage engine.
C# | ACID Transactional Cores | B-Tree | Query Execution Engines

facebook/rocksdb
31,767 stars
RocksDB is a high-performance, embeddable persistent key-value library and storage engine based on Log-Structured Merge-trees. It is designed to provide durable storage for large-scale datasets, integrating directly into applications to manage data on flash and RAM-based hardware. The engine is distinguished by its focus on minimizing read and write amplification through multi-threaded compaction and custom memory allocators. It features specialized optimizations for flash storage, including support for zoned block devices, and provides the ability to extend store behavior via external plugin
RocksDB is a production-grade, embeddable storage engine that serves as a definitive reference implementation for LSM-tree architecture, write-ahead logging, and ACID-compliant transactional operations.
C++ | Log-Structured Merge-Trees | Write-Ahead Logging

oceanbase/miniob
4,318 stars
MiniOB is an open-source educational relational database kernel designed for learning the internals of database systems. It implements a dual-engine storage architecture combining B+ Tree and LSM-Tree, supports SQL parsing and query execution, and provides transactional processing with multi-version concurrency control. The system communicates with clients using the MySQL wire protocol and includes a vector database extension for storing and querying high-dimensional vectors. The project distinguishes itself through its comprehensive coverage of core database concepts in a single, learnable c
MiniOB is a purpose-built educational database kernel that provides a comprehensive reference implementation of B+ trees, LSM trees, write-ahead logging, and ACID-compliant transaction management within a single, accessible codebase.
C++ | LSM-Tree Key-Value Stores | Multi-Version Concurrency Control | Write-Ahead Logging

spacejam/sled
8,928 stars
Sled is an embedded key-value store and ACID-compliant database designed for high-performance data persistence. It functions as a log-structured storage engine that organizes data using B+ trees to support efficient range queries and prefix scans. The engine implements a zero-copy data store model, utilizing epoch-based reclamation to provide direct references to cached values without memory allocations. It distinguishes itself through a combination of write-ahead logging, page cache optimizations to reduce write amplification on flash storage, and serializable transactions for atomic multi-k
Sled is a high-performance, ACID-compliant embedded storage engine that provides a concrete, production-grade implementation of B+ trees, write-ahead logging, and concurrency control mechanisms useful for studying database internals.
Rust | ACID Transactional Cores | B-Tree | Log-Structured Merge-Trees

cockroachdb/pebble
5,777 stars
Pebble is an embedded key-value storage engine written in Go, designed as a library that provides durable, write-optimized data persistence directly within applications. It organizes data using a log-structured merge-tree (LSM-tree) structure, where writes are first buffered in an in-memory skiplist memtable and persisted to a write-ahead log before being flushed to block-based SSTable files on disk. The engine supports atomic batch commits, configurable write synchronization, and automatic background compaction that merges and rewrites sorted runs to reclaim space and maintain read performanc
Pebble is a production-grade LSM-tree storage engine that provides a clear, well-documented reference for low-level database mechanics like write-ahead logging, SSTable management, and compaction strategies.
Go | Log-Structured Merge-Trees | LSM-Tree Key-Value Stores | Write-Ahead Logging
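
The Pebble description outlines the usual LSM-tree write path: append to a write-ahead log, buffer in a memtable, flush to sorted files, and compact in the background. The Python sketch below illustrates that flow under heavy simplification. `TinyLSM` and all of its members are hypothetical; plain lists and dictionaries stand in for the on-disk log, the skiplist memtable, and SSTable files, and none of this reflects Pebble's actual API or file formats.

```python
class TinyLSM:
    """Toy LSM tree: WAL -> memtable -> sorted runs -> compaction."""

    def __init__(self, memtable_limit=2):
        self.wal = []        # stand-in for the on-disk write-ahead log
        self.memtable = {}   # stand-in for the in-memory skiplist
        self.sstables = []   # newest first; each run is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.wal.append((key, value))  # record the write before applying it
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as one immutable sorted run.
        self.sstables.insert(0, sorted(self.memtable.items()))
        self.memtable = {}
        self.wal = []  # flushed entries no longer need the log

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.sstables:  # newest run wins
            for k, v in run:
                if k == key:
                    return v
        return None

    def compact(self):
        # Merge all runs, oldest first, so newer values overwrite older ones.
        merged = {}
        for run in reversed(self.sstables):
            merged.update(run)
        self.sstables = [sorted(merged.items())]


db = TinyLSM(memtable_limit=2)
db.put("b", 1)
db.put("a", 2)   # second write fills the memtable and triggers a flush
db.put("b", 3)   # newer value for "b" lives in the memtable
print(db.get("b"), db.get("a"))  # 3 2
db.put("c", 4)   # triggers a second flush
db.compact()
print(db.sstables)  # [[('a', 2), ('b', 3), ('c', 4)]]
```

Compaction here rewrites everything into a single run; production engines instead pick subsets of files per level, which is where the compaction strategies mentioned above differ.
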

slatedb/slatedb
2,730 stars
SlateDB is a cloud-native key-value store and distributed database engine that utilizes a log-structured merge-tree architecture. It serves as a transactional storage layer designed to persist data directly to cloud object storage. The engine differentiates itself by optimizing read performance for remote storage through the use of bloom filters and multi-level block caching. It employs a single-writer multi-reader model and provides the ability to create zero-copy clones via copy-on-write checkpointing. The system supports atomic transactions, range queries, and snapshot-based concurrency c
This is a functional LSM-tree storage engine written in Rust that provides a clear, modern implementation of transactional storage mechanics, making it a valuable reference for studying cloud-native database internals.
Rust | Log-Structured Merge-Trees | LSM-Tree Key-Value Stores

dgraph-io/badger
15,666 stars
Badger is an embeddable key-value store written in Go that provides persistent data storage for byte keys and values. It is a persistent database that utilizes a tiered LSM tree storage model to optimize disk storage and retrieval efficiency. The system features an ACID transaction engine that ensures data integrity through serializable snapshot isolation and multi-version concurrency control. It also provides an encrypted key-value store with data-at-rest encryption and a managed encrypted key registry to secure stored information. The engine covers a broad set of capabilities including hig
Badger is a production-grade, embeddable LSM-tree storage engine that provides a clear, high-performance reference for implementing write-ahead logging, ACID transactions, and concurrency control in Go.
Go | ACID Transactional Cores | Log-Structured Merge-Trees | Multi-Version Concurrency Control

MariaDB/server
7,196 stars
This project is an open source relational database management system and SQL database designed for storing and managing structured data. It functions as a relational database for ensuring consistency and reliability, while also operating as a vector database for storing and querying high-dimensional vector embeddings. The system incorporates a columnar storage engine to optimize analytical query processing and large-scale data aggregation. It further enables vector similarity search, allowing users to find similar items by querying vector embeddings. The software covers a broad capability su
This is a full-scale production relational database management system that includes complex storage engine implementations like InnoDB and columnar engines, providing a comprehensive, albeit highly advanced, reference for database internals and storage mechanics.
C++ | B-Tree | Multi-Version Concurrency Control | Write-Ahead Logging

etcd-io/etcd
51,838 stars
etcd is a distributed, strongly consistent key-value store designed to provide reliable storage for critical system metadata and coordination primitives. It functions as a distributed consensus engine, utilizing a replicated log and leader-based state machine to ensure that all nodes in a cluster maintain a synchronized view of data. By providing atomic operations and linearizable reads and writes, it serves as a foundational component for distributed systems requiring high availability and fault tolerance. The system distinguishes itself through its multi-version concurrency control, which e
While this is a production-grade distributed key-value store rather than a pedagogical project, it serves as a high-quality reference implementation for B-tree storage, write-ahead logging, and concurrency control in a distributed context.
Go | B-Tree | Write-Ahead Logs

etcd-io/bbolt
9,573 stars
bbolt is an ACID-compliant embedded key-value store for Go applications. It persists all data in a single memory-mapped file on disk, organizing information using B+ trees to facilitate sorted key iteration and efficient range queries. The project distinguishes itself through a hierarchical data organization model, allowing buckets to be nested within other buckets to create a tree-like structure. It employs a single-writer, multi-reader locking mechanism and copy-on-write transactions to ensure serializable isolation and data integrity. The system includes comprehensive data management capa
This is a production-grade embedded key-value store that serves as an excellent reference implementation for B+ tree architecture, ACID-compliant transactions, and low-level disk-based storage mechanics.
Go | ACID Transactional Cores | B-Tree

HelixDB/helix-db
3,830 stars
Helix DB is a distributed graph database and knowledge graph platform that persists nodes and edges on object storage for durable and unlimited scaling. It operates as an ACID-compliant system, ensuring data consistency through serializable snapshot isolation during concurrent operations. The project distinguishes itself by combining a vector search engine and a property graph, utilizing hybrid vector and full-text search to locate entry points for graph traversals. It enables dynamic graph querying through a domain-specific language, allowing complex logic and recursive queries to be execute
Helix DB is a distributed graph database that implements core storage engine concepts like ACID compliance and query execution, serving as a practical example of a modern, cloud-native database architecture.
Rust | ACID Transactional Cores | Query Execution Engines

rethinkdb/rethinkdb
26,996 stars
RethinkDB is a distributed, document-oriented database designed to store and manage JSON-formatted data across scalable clusters. It utilizes a custom log-structured storage engine with B-Tree indexing to ensure high-performance disk I/O and data persistence. The system maintains high availability through automatic sharding and replication, employing a primary-replica voting consensus mechanism to handle node failures and ensure consistent cluster operations. A defining characteristic of the platform is its reactive changefeed engine, which allows applications to subscribe to live data update
RethinkDB is a full-scale distributed document database that provides a practical, production-grade implementation of B-tree indexing and log-structured storage, serving as a robust reference for how these components function in a real-world system.
C++ | Concurrency Control Mechanisms | Query Execution Engines

redis/redis
74,906 stars
Redis is an in-memory, key-value database designed to provide sub-millisecond latency for read and write operations. It functions as a versatile data platform, serving as a distributed cache, a message broker, a NoSQL document store, and a vector database. The system utilizes an event-driven, single-threaded loop to process requests efficiently, while maintaining data durability through append-only persistence logs and asynchronous snapshotting mechanisms. What distinguishes Redis is its ability to handle complex data structures—including strings, hashes, lists, sets, and sorted sets—alongsid
Redis is a production-grade, in-memory key-value store that serves as a valuable reference for understanding event-driven architectures, append-only persistence logs, and efficient data structure implementation in C.
C | Write-Ahead Logs
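
The Redis description mentions durability through append-only persistence logs. As a rough illustration of the general idea, the Python sketch below appends each command to a file and rebuilds the in-memory state by replaying that file on startup. `AppendOnlyStore`, the JSON-lines format, and the `set`/`del` command shapes are all assumptions made for this example; they are not Redis's actual log format, and details such as fsync policy and log rewriting are omitted.

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """Toy in-memory store that recovers its state by replaying a command log."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        if os.path.exists(path):
            # Recovery: replay every logged command in order.
            with open(path) as f:
                for line in f:
                    self._apply(json.loads(line))
        self.log = open(path, "a")

    def _apply(self, cmd):
        if cmd["op"] == "set":
            self.data[cmd["key"]] = cmd["value"]
        elif cmd["op"] == "del":
            self.data.pop(cmd["key"], None)

    def execute(self, cmd):
        self._apply(cmd)
        self.log.write(json.dumps(cmd) + "\n")  # one command per line
        self.log.flush()

    def close(self):
        self.log.close()


path = os.path.join(tempfile.mkdtemp(), "appendonly.log")
store = AppendOnlyStore(path)
store.execute({"op": "set", "key": "a", "value": 1})
store.execute({"op": "set", "key": "b", "value": 2})
store.execute({"op": "del", "key": "a"})
store.close()

recovered = AppendOnlyStore(path)  # simulates a restart
print(recovered.data)              # {'b': 2}
recovered.close()
```

Because the log only grows, real systems pair this approach with periodic snapshots or log rewriting so that recovery time stays bounded.
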

ClickHouse/ClickHouse
48,229 stars
ClickHouse is a high-performance, columnar analytical database designed for real-time query execution and large-scale data aggregation. It functions as a distributed data warehouse capable of processing petabytes of information, while also providing an embedded engine that integrates directly into applications for native query capabilities without external dependencies. The system is built to handle high-throughput ingestion and complex analytical workloads, delivering millisecond-level latency for interactive dashboards and operational monitoring. The platform distinguishes itself through ad
ClickHouse is a production-grade, high-performance analytical database that serves as a sophisticated reference for advanced storage techniques like merge trees and vectorized execution, though it is a complex industrial system rather than a simplified pedagogical implementation.
C++ | Query Execution Engines

apple/foundationdb
16,446 stars
FoundationDB is an ACID-compliant distributed transactional key-value store. It functions as a scalable database engine that ensures strict serializability and data consistency across a cluster of servers using a shared-nothing architecture. The system is distinguished by its multi-region replication capabilities, allowing data to be synchronized across different datacenters for high availability and disaster recovery. It utilizes optimistic concurrency control to manage distributed transactions and employs a majority-based coordination system to maintain cluster state. The platform provides
FoundationDB is a production-grade distributed transactional key-value store that serves as a sophisticated reference for understanding distributed ACID compliance, optimistic concurrency control, and cluster-wide coordination.
C++ | ACID Transactional Cores | ACID-Compliant

surrealdb/surrealdb
32,397 stars
SurrealDB is a multi-model database engine designed to store and query document, graph, relational, and vector data within a single ACID-compliant platform. It functions as an AI-native data store, integrating vector search, graph traversal, and machine learning model execution directly into its query layer. By providing a unified declarative query language, the platform eliminates the need for external middleware to synchronize data across different storage models. The platform distinguishes itself through its ability to manage agent memory and complex workflows natively. It allows developer
SurrealDB is a full-featured, production-ready multi-model database rather than a pedagogical reference implementation, but its extensive documentation and architectural transparency make it a valuable resource for studying modern database internals.
Rust | ACID Transactional Cores

cockroachdb/cockroach
32,207 stars
Cockroach is a distributed SQL database designed to scale horizontally across multiple nodes while maintaining strict ACID compliance and global data consistency. It functions as a relational database engine that automatically partitions data into ranges, rebalancing them across a cluster to accommodate growing storage and throughput requirements. By utilizing a distributed consensus protocol, the system ensures that all nodes agree on the order of operations, providing fault tolerance and continuous availability even in the event of hardware failures. The system distinguishes itself through
This is a production-grade distributed SQL database that serves as a sophisticated reference for complex storage engine concepts like multi-version concurrency control, distributed consensus, and layered storage architectures, though it is a full-scale system rather than a simplified pedagogical implementation.
Go | Distributed Relational Databases | Distributed SQL Databases | Distributed SQL Engines

pingcap/tidb
40,166 stars
TiDB is a horizontally scalable, distributed SQL database designed to provide consistent transactional storage and high-performance analytical processing within a single unified architecture. It utilizes a decoupled compute-storage design and a distributed key-value storage layer to ensure horizontal scalability and efficient range-based queries. By employing a consensus-based replication algorithm, the system maintains high availability and automatic failover across multiple nodes and geographical regions. The platform distinguishes itself through its hybrid transactional and analytical proc
TiDB is a production-grade distributed SQL database that provides a complex, real-world implementation of storage engine concepts like distributed key-value layers and transactional consistency, though it is a full-scale infrastructure platform rather than a simplified pedagogical reference.
Go | Analytical Query Engines | Data Manipulation Interfaces | Database Lifecycle Management

tikv/tikv
16,535 stars
TiKV is a distributed transactional key-value store designed for horizontal scalability and high availability. It functions as a storage engine that maintains massive datasets across a cluster of physical nodes, ensuring that information remains accessible and consistent even when individual hardware components fail. The system utilizes a consensus-based replication model to synchronize data across nodes, ensuring that all replicas agree on the order of operations. It manages data distribution through a sharding mechanism that partitions large datasets into smaller groups, each governed by in
TiKV is a production-grade distributed storage engine that provides a deep, real-world implementation of complex database internals like Raft consensus, multi-version concurrency control, and transactional storage, making it an excellent reference for advanced database architecture.
Rust | Distributed Key-Value Stores | Distributed Databases | Key-Value

duckdb/duckdb
38,805 stars
DuckDB is an in-process analytical database engine designed to run directly within an application process. As a zero-dependency, embedded system, it provides enterprise-grade SQL data processing capabilities without the overhead of managing a dedicated database server. It is built to handle complex analytical and aggregation tasks by storing and retrieving information in columns, allowing for high-performance relational data manipulation. The engine distinguishes itself through a columnar vectorized execution model that maximizes CPU cache efficiency during query operations. It employs adapti
DuckDB is a high-performance, in-process analytical database engine that serves as an excellent reference for modern columnar storage, vectorized execution, and query optimization techniques, even though it focuses on OLAP rather than traditional B-tree or LSM-based transactional storage.
C++ | Analytical Databases | Columnar Engines | Embedded Databases

valkey-io/valkey
24,875 stars
Valkey is an in-memory, NoSQL database server designed for high-performance data storage and real-time state management. It operates as a distributed key-value store, maintaining datasets entirely within system memory to facilitate sub-millisecond response times for read and write operations. The system distinguishes itself through a single-threaded event loop that utilizes asynchronous I/O multiplexing to ensure high throughput. It supports high availability via master-replica replication and provides a decoupled communication model through a built-in publish-subscribe messaging pattern. To
Valkey is a high-performance, in-memory key-value store that provides a practical, production-grade reference for understanding event-loop architectures, asynchronous I/O, and snapshot-based persistence mechanisms.
C | In-Memory Data Stores | In-Memory Databases | Key-Value Stores

neondatabase/neon
22,251 stars
Neon is a serverless PostgreSQL database platform designed with a decoupled storage and compute architecture. It functions as a multi-tenant system that isolates data and compute resources for independent users on shared cloud infrastructure, utilizing a specialized PostgreSQL storage engine. The platform features a database branching system that allows for the creation of isolated, instant copies of a database for testing and development. It further distinguishes itself with an HTTP-based SQL gateway, enabling the execution of queries via HTTP requests and JSON responses without the need for
Neon is a sophisticated, production-grade storage engine for PostgreSQL that demonstrates advanced concepts like log-structured storage and decoupled compute-storage architecture, serving as a high-level reference for modern cloud-native database internals.
Rust | Serverless Databases | Storage-Compute Architectures | Autoscaling Systems

rqlite/rqlite
17,586 stars
rqlite is a distributed relational database that replicates SQLite data across a cluster using the Raft consensus algorithm. It functions as a fault-tolerant storage system that provides high availability and a web API for executing SQL queries and managing relational data without requiring native database drivers. The system distinguishes itself by using an HTTP SQL interface to expose database operations and cluster management. It features a real-time change data capture stream that pushes database mutations to external HTTP endpoints via webhooks and supports the scaling of read throughput
This project serves as a practical reference for implementing distributed consensus and replication on top of an existing storage engine, though it focuses more on cluster coordination than the low-level mechanics of building a storage engine from scratch.
Go | Distributed Relational Databases | Raft Consensus Implementations | Change Data Capture Streams

taosdata/TDengine
24,734 stars
TDengine is a distributed time-series database designed for the high-speed ingestion, compression, and retrieval of timestamped metrics and sensor data. It functions as a SQL-compatible analytics engine, allowing users to perform complex operations on massive volumes of time-ordered information using standard relational syntax. The platform is built to serve as a backend foundation for industrial IoT environments, managing real-time data streams and device metadata through a cluster-based architecture. The system distinguishes itself through a distributed sharding architecture that uses consi
This is a production-grade distributed time-series database rather than a pedagogical reference implementation, but its architecture provides a concrete, high-performance example of log-structured storage and distributed data management for those studying database internals.
C | Analytics Engines | Columnar Storage Engines | Time Series Databases

dgraph-io/dgraph
21,700 stars
Dgraph is a distributed graph database designed to store and query highly connected data. It organizes information as nodes and edges to represent complex relationships between entities, providing a platform for managing and analyzing deeply linked datasets. The system functions as a horizontally scalable cluster that partitions data across multiple nodes to maintain performance and availability as information volume increases. It utilizes a specialized query language built for low-latency navigation of interconnected data points, allowing for the execution of complex queries across large-sca
Dgraph is a production-grade distributed graph database that provides a complex, real-world implementation of distributed storage, consensus, and transaction processing, serving as a sophisticated reference for how large-scale database engines manage state and concurrency.
Go | Graph Databases | Distributed Databases
