9 Repos
Mechanisms for synchronizing data across distributed nodes to ensure high availability and fault tolerance.
Distinguishing note: Focuses on the replication mechanism itself rather than general database management.
Explore 9 awesome GitHub repositories matching data & databases · Replication Protocols.
Kafka is a distributed event streaming platform designed for capturing, storing, and processing real-time data streams across interconnected nodes. It functions as a distributed commit log, providing a fault-tolerant storage mechanism that records state changes sequentially to ensure data consistency and durability across distributed environments. The platform distinguishes itself through a partitioned commit log architecture that enables horizontal scaling and parallel processing of data streams. It integrates a stream processing engine for continuous transformations and aggregations, while
Maintains high availability by allowing follower nodes to pull data from the leader in the background.
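As a rough illustration of the pull-based replication described above, the following minimal sketch has a follower repeatedly fetch records from a leader's append-only log, starting at its own next offset. The `Leader`, `Follower`, and `catch_up` names are hypothetical and do not correspond to Kafka's actual client or broker API; real systems add networking, acknowledgements, and leader election.

```python
from dataclasses import dataclass, field


@dataclass
class Leader:
    """Hypothetical leader holding an append-only log of records."""
    log: list = field(default_factory=list)

    def append(self, record):
        self.log.append(record)
        return len(self.log) - 1  # offset of the new record

    def fetch(self, offset, max_records=100):
        # Return the records at and after `offset`, up to `max_records`.
        return self.log[offset:offset + max_records]


@dataclass
class Follower:
    """Hypothetical follower that copies the leader's log by pulling."""
    log: list = field(default_factory=list)

    def pull_once(self, leader, max_records=100):
        # The follower's next offset is simply its current log length.
        batch = leader.fetch(len(self.log), max_records)
        self.log.extend(batch)
        return len(batch)


def catch_up(follower, leader, max_records=100):
    """Pull repeatedly until a fetch returns no new records."""
    while follower.pull_once(leader, max_records):
        pass
```

Because the follower tracks its own offset, it can fall behind and resume later without the leader keeping per-follower state in this sketch.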
This project is a distributed, document-oriented database system designed to store information in flexible, hierarchical structures. It supports horizontal scaling through automated sharding and maintains high availability across global clusters using a multi-node replication protocol. By executing multi-document operations as atomic units, the system ensures data integrity and consistency across distributed environments. The platform distinguishes itself by integrating advanced vector-based indexing, which enables semantic similarity searches alongside traditional geospatial and lexical quer
Maintains high availability and data durability by synchronizing information across multiple server instances with automatic failover.
This project is a reactive, offline-first NoSQL database engine designed for JavaScript applications. It provides a robust framework for managing application state by synchronizing data across browsers, mobile devices, and server-side runtimes. By treating local storage as the primary source of truth, it enables applications to remain functional without network connectivity, automatically reconciling changes with remote backends once a connection is restored. The database distinguishes itself through a modular architecture that supports cross-environment synchronization and high-performance d
Implements a bidirectional replication protocol with conflict resolution and offline-first state convergence.
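To make the idea of bidirectional replication with conflict resolution concrete, here is a minimal sketch that merges two replicas in both directions. It assumes a last-write-wins rule keyed on a hypothetical `updated_at` field with a `replica` tie-breaker; this is an illustrative choice, not necessarily the conflict strategy this project uses.

```python
def resolve(a, b):
    """Pick a winner deterministically: newer timestamp, then larger replica id."""
    return max(a, b, key=lambda doc: (doc["updated_at"], doc["replica"]))


def sync(left, right):
    """Merge two replicas (dicts of doc_id -> document) so both converge."""
    for doc_id in left.keys() | right.keys():
        a, b = left.get(doc_id), right.get(doc_id)
        if a is None:
            winner = b          # only the right replica has the document
        elif b is None:
            winner = a          # only the left replica has the document
        else:
            winner = resolve(a, b)
        # Store independent copies so later local edits do not leak across.
        left[doc_id] = dict(winner)
        right[doc_id] = dict(winner)
```

Since `resolve` is deterministic and symmetric, both sides reach the same state regardless of which one initiates the sync.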
NATS Server is a high-performance, lightweight messaging system designed for cloud-native applications, edge computing, and distributed microservices. It functions as a distributed publish-subscribe broker that routes messages using hierarchical, dot-separated subject strings, enabling decoupled communication between services without requiring centralized broker lookups. The system supports core messaging patterns including asynchronous publish-subscribe, request-reply, and load-balanced queue processing. The platform distinguishes itself through a decentralized architecture that eliminates t
Distributes stream data across multiple cluster nodes using consensus protocols to ensure high availability and fault tolerance.
Dat is a peer-to-peer file synchronization tool that combines an append-only, hash-addressed log with Merkle tree verification, cryptographic access keys, live streaming replication, and swarm networking for sparse, versioned file sharing. It stores file data and metadata in a cryptographically signed, versioned append-only log where each entry is identified by its hash, and uses public-key cryptography to secure archives with separate read and write keys. The tool enables live streaming replication of data between peers as entries are appended, with Merkle tree integrity verification that su
Downloads only specific file ranges needed rather than entire archives, enabling efficient partial replication.
Materialize is a streaming SQL database that continuously ingests live data from sources such as Kafka, Redpanda, PostgreSQL, and MySQL, and incrementally maintains materialized views. It provides a PostgreSQL-compatible query engine that accepts standard SQL over the PostgreSQL wire protocol, enabling any existing SQL client or BI tool to query real-time data. The system also includes a Model Context Protocol (MCP) server that exposes live materialized view data to AI agents, providing fresh context without polling. Materialize distinguishes itself through its ability to offer configurable c
Continuously reads upstream database replication logs and maps them to differential dataflow updates.
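The mapping from a replication log to differential updates can be sketched as follows. This assumes change events shaped as dicts with hypothetical `op`, `old`, `new`, and `lsn` fields, and represents each update as a `(row, time, diff)` triple; it is a simplified illustration, not Materialize's actual ingestion code.

```python
from collections import Counter


def to_updates(event):
    """Translate one change event into (row, time, diff) triples."""
    op, time = event["op"], event["lsn"]
    if op == "insert":
        return [(event["new"], time, +1)]
    if op == "delete":
        return [(event["old"], time, -1)]
    if op == "update":
        # An update retracts the old row and asserts the new one.
        return [(event["old"], time, -1), (event["new"], time, +1)]
    raise ValueError(f"unknown op: {op!r}")


def apply_updates(view, updates):
    """Incrementally maintain a multiset of rows from the update triples."""
    for row, _time, diff in updates:
        view[row] += diff
        if view[row] == 0:
            del view[row]  # drop rows whose multiplicity reaches zero
```

Rows are tuples here so they can serve as dictionary keys; the view is a `Counter` mapping each row to its multiplicity.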
AliSQL is a fork of MySQL by Alibaba that extends the relational database management system with enhancements for high performance, scalability, and enterprise-grade availability. It retains the core MySQL identity as a SQL-based database for storing, organizing, and retrieving structured data, while adding optimizations for large-scale transactional and analytical workloads. The project differentiates itself through a set of Alibaba-specific improvements, including a columnar engine for accelerating analytical queries directly on MySQL tables, and a distributed, shared-nothing NDB Cluster en
Replicates data synchronously across multiple MySQL servers for consistency and automatic failover.
Kanidm is a centralized identity management server designed to handle authentication, authorization, and directory services across distributed infrastructure. It provides a comprehensive framework for managing human and service accounts, utilizing a schema-driven database to store identity records, group memberships, and system attributes. The platform supports a wide range of authentication methods, including passkeys, passwords, and standard protocols like OAuth2, OIDC, LDAP, and RADIUS. The system distinguishes itself through a granular access control engine that enforces security policies
Coordinates state and configuration updates across distributed nodes using a central authority.
Hypercore is a distributed append-only logging system designed for maintaining cryptographically signed data streams that are replicated and verified across a network of peers. It provides verifiable data storage using a Merkle tree structure to ensure the integrity and authenticity of information through cryptographic proofs. The project is distinguished by its support for sparse data replication, which allows peers to download only the specific ranges or blocks of a log required for their current needs to reduce bandwidth. It also implements encrypted peer-to-peer messaging and the ability
Reduces bandwidth usage by downloading only specific ranges or individual blocks of data instead of the entire log.
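Sparse replication depends on being able to verify a single block without holding the whole log. The sketch below shows the general Merkle proof technique using SHA-256 and a simple binary tree over a power-of-two number of blocks. These choices, and all function names, are assumptions for illustration; they are not Hypercore's actual tree layout or hash function.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(block: bytes) -> bytes:
    return _h(b"\x00" + block)  # prefix separates leaves from inner nodes


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def build_levels(blocks):
    """Return every tree level, leaves first; the last level holds the root."""
    n = len(blocks)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch requires a power-of-two block count")
    levels = [[leaf_hash(b) for b in blocks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([node_hash(prev[i], prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return levels


def make_proof(levels, index):
    """Collect the sibling hash at each level on the path to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify_block(block, index, proof, root):
    """Recompute the root from one block and its sibling hashes."""
    digest = leaf_hash(block)
    for sibling in proof:
        if index % 2 == 0:
            digest = node_hash(digest, sibling)
        else:
            digest = node_hash(sibling, digest)
        index //= 2
    return digest == root
```

A peer that trusts only the root can request one block plus its proof, which grows with the logarithm of the log length rather than with the log itself.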