1 repo
Strategies for partitioning and distributing data across multiple nodes to improve scalability and performance.
Distinguishing note: Specifically addresses range-based partitioning of data based on primary keys.
Explore 1 awesome GitHub repository matching data & databases · Data Sharding. Refine with filters or upvote what's useful.
RethinkDB is a distributed, document-oriented database designed to store and manage JSON-formatted data across scalable clusters. It utilizes a custom log-structured storage engine with B-Tree indexing to ensure high-performance disk I/O and data persistence. The system maintains high availability through automatic sharding and replication, employing a primary-replica voting consensus mechanism to handle node failures and ensure consistent cluster operations. A defining characteristic of the platform is its reactive changefeed engine, which allows applications to subscribe to live data update
RethinkDB partitions data across cluster nodes by distributing document ranges based on primary keys to ensure balanced storage and parallelized query execution.