25 repositories
Systems that store and manage data across multiple networked nodes to provide scalability and fault tolerance.
Explore 25 awesome GitHub repositories matching data & databases · Distributed Databases. Refine with filters or upvote what's useful.
TiDB is a horizontally scalable, distributed SQL database designed to provide consistent transactional storage and high-performance analytical processing within a single unified architecture. It utilizes a decoupled compute-storage design and a distributed key-value storage layer to ensure horizontal scalability and efficient range-based queries. By employing a consensus-based replication algorithm, the system maintains high availability and automatic failover across multiple nodes and geographical regions. The platform distinguishes itself through its hybrid transactional and analytical proc
Compared with traditional standalone databases, TiDB has the following advantages: - Has a distributed architecture with flexible and elasti
SurrealDB is a multi-model database engine designed to store and query document, graph, relational, and vector data within a single ACID-compliant platform. It functions as an AI-native data store, integrating vector search, graph traversal, and machine learning model execution directly into its query layer. By providing a unified declarative query language, the platform eliminates the need for external middleware to synchronize data across different storage models. The platform distinguishes itself through its ability to manage agent memory and complex workflows natively. It allows developer
Operates across embedded, edge, and cloud environments using a consistent binary and API.
This project is a distributed, document-oriented database system designed to store information in flexible, hierarchical structures. It supports horizontal scaling through automated sharding and maintains high availability across global clusters using a multi-node replication protocol. By executing multi-document operations as atomic units, the system ensures data integrity and consistency across distributed environments. The platform distinguishes itself by integrating advanced vector-based indexing, which enables semantic similarity searches alongside traditional geospatial and lexical quer
Distributes data across multiple nodes and regions to ensure horizontal scalability and high availability.
Valkey is an in-memory, NoSQL database server designed for high-performance data storage and real-time state management. It operates as a distributed key-value store, maintaining datasets entirely within system memory to facilitate sub-millisecond response times for read and write operations. The system distinguishes itself through a single-threaded event loop that utilizes asynchronous I/O multiplexing to ensure high throughput. It supports high availability via master-replica replication and provides a decoupled communication model through a built-in publish-subscribe messaging pattern. To
Provides distributed key-value storage for horizontal scalability.
TDengine is a distributed time-series database designed for the high-speed ingestion, compression, and retrieval of timestamped metrics and sensor data. It functions as a SQL-compatible analytics engine, allowing users to perform complex operations on massive volumes of time-ordered information using standard relational syntax. The platform is built to serve as a backend foundation for industrial IoT environments, managing real-time data streams and device metadata through a cluster-based architecture. The system distinguishes itself through a distributed sharding architecture that uses consi
Provides a cluster-based architecture for high availability and fault tolerance.
Dolt is a relational database engine that integrates version control directly into the database management layer. It functions as a version-controlled SQL database that tracks every row and schema change using a commit-based history, allowing users to branch, merge, and audit data modifications. By implementing a wire-protocol-compatible server, the system enables standard SQL clients and tools to interact with versioned data as if they were connecting to a traditional relational database. The platform distinguishes itself by applying repository-style workflows to data management, including s
Facilitates distributed collaboration and data replication by pushing and pulling database states between remote instances.
Dgraph is a distributed graph database designed to store and query highly connected data. It organizes information as nodes and edges to represent complex relationships between entities, providing a platform for managing and analyzing deeply linked datasets. The system functions as a horizontally scalable cluster that partitions data across multiple nodes to maintain performance and availability as information volume increases. It utilizes a specialized query language built for low-latency navigation of interconnected data points, allowing for the execution of complex queries across large-sca
Distributes data across a cluster to maintain performance and availability as information volume and query load grow.
This project is a pure JavaScript database driver for Node.js that implements the native MySQL binary protocol. It serves as a comprehensive connector for managing persistent network links to MySQL servers, enabling applications to execute queries, manage transactions, and handle complex data operations without requiring external middleware. The driver distinguishes itself through its integrated support for connection pooling and distributed database routing. It maintains managed sets of reusable network sockets to optimize resource usage under high request volumes, while simultaneously provi
Distributes database requests across multiple server nodes to improve performance and ensure high availability through automated connection clustering.
LibSQL is a high-performance, distributed SQL database engine that extends SQLite to support remote network access, edge computing, and real-time synchronization. It functions as an embedded database library that integrates directly into application processes while providing the infrastructure to maintain consistency across multiple geographic regions. The platform distinguishes itself by enabling database interaction over standard HTTP protocols, allowing applications to query remote data sources in serverless and edge environments without requiring local filesystem access. It includes nativ
Distributes database content across multiple geographic regions to reduce latency and ensure high availability.
ScyllaDB is a distributed NoSQL database engine designed for high-throughput data storage and low-latency performance at scale. It functions as a shard-aware platform that manages large-scale datasets across distributed clusters, providing a foundation for real-time applications that require consistent availability and operational stability. The system distinguishes itself through a shared-nothing architecture that distributes data across independent CPU cores to eliminate lock contention. It incorporates a user-space networking stack and an asynchronous event-driven engine to maximize hardwa
Manages data across multiple networked nodes to provide scalability and fault tolerance.
Citus is a PostgreSQL extension that transforms a standard database into a distributed system. It functions as a sharding framework and distributed SQL engine, enabling horizontal scaling by partitioning tables across a cluster of nodes. By utilizing a coordinator-worker topology, the system manages metadata and routes queries to the appropriate nodes, allowing for parallel execution of complex operations across distributed data shards. The platform distinguishes itself through its specialized support for multi-tenant architectures and real-time analytical processing. It enables tenant-based
Converts existing local tables into distributed ones without blocking application reads or writes during the migration.
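The Citus entry above describes a coordinator that routes queries to the nodes holding the relevant shards. The following is a minimal sketch of hash-based shard routing in general, not Citus's actual implementation: `SHARD_COUNT`, `WORKERS`, `shard_for`, `worker_for`, and `route` are hypothetical names chosen for illustration, and the round-robin shard placement is an assumption.

```python
import hashlib
from typing import Tuple

SHARD_COUNT = 4                      # hypothetical number of shards
WORKERS = ["worker-1", "worker-2"]   # hypothetical worker node names


def shard_for(key: str) -> int:
    """Map a distribution key to a shard using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % SHARD_COUNT


def worker_for(shard: int) -> str:
    """Assumed placement policy: shards are spread round-robin over workers."""
    return WORKERS[shard % len(WORKERS)]


def route(tenant_id: str) -> Tuple[int, str]:
    """Return the (shard, worker) pair a coordinator would send the query to."""
    shard = shard_for(tenant_id)
    return shard, worker_for(shard)


shard, worker = route("tenant-42")
print(f"tenant-42 -> shard {shard} on {worker}")
```

Because the hash is deterministic, every row for a given tenant lands on the same shard, which is what lets a tenant-scoped query be served by a single worker.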
Talent Plan provides guided training programs and curricula focused on distributed database design, systems programming, and open source contribution workflows. The project offers a distributed systems education program made up of coordinated courses and labs that focus on database internals. The curriculum emphasizes using Rust to build high-performance networked applications and implement distributed algorithms. It also incorporates learning materials on version control, community governance, and the specific processes required to contribute to public software projects. The project covers a broad range of technical and organizational areas, including distributed database engineering, open source community management, and technical mentorship coordination. It includes hands-on practice through building fault-tolerant key-value stores and studying the architectures of professional distributed databases. Additional materials cover open source fundamentals, including project governance, software licensing, and the use of collaborative platforms such as Git and GitHub.
Analyzes the architecture and implementation details of professional distributed database systems.
Cassandra is a distributed NoSQL database and wide-column store designed for high availability and linear scalability. It functions as a fault-tolerant distributed system that utilizes an LSM-tree storage engine to optimize write throughput and manage massive datasets. The system is a CQL-compliant database, using a structured query language to manage and retrieve tabular data stored across multiple nodes. It organizes information into rows and columns based on a flexible schema and primary keys. The project provides capabilities for horizontal database scaling, distributed data partitioning
Uses a structured query language (CQL) to manage and retrieve data from distributed tables.
Orbit DB is a decentralized NoSQL database that utilizes conflict-free replicated data types to ensure eventual consistency across a network of nodes. It functions as a peer-to-peer data store that uses IPFS for content-addressing and synchronization, allowing for the maintenance of application state without a central server or authority. The system is built upon a cryptographically verifiable, immutable operation log, which serves as the foundation for custom decentralized data models. This architecture enables the implementation of various data storage patterns, including JSON document stor
Implements a decentralized NoSQL database utilizing CRDTs to ensure eventual consistency across nodes.
OrbitDB is a decentralized data storage system that enables the creation of serverless databases residing across a network of peers. It functions as a peer-to-peer database that integrates with a content-addressed storage layer to distribute and replicate data without a central server. The system utilizes conflict-free replicated data types to ensure eventual consistency and state convergence across distributed nodes. It maintains an immutable record of updates using a directed acyclic graph to preserve causal ordering and cryptographic integrity. Access is managed through a decentralized ide
Provides a decentralized database using conflict-free replicated data types to ensure eventual consistency across all nodes.
Iroh is a peer-to-peer networking stack and distributed system designed for secure direct connections, content-addressed storage, and synchronized data sharing. It provides a foundation for decentralized applications by combining a QUIC-based networking layer with primitives for distributed state and data transfer. The project distinguishes itself through a comprehensive suite of decentralized capabilities, including a distributed data store using conflict-free replicated data types for collaborative synchronization and a content-addressed storage system for verifiable, resumable transfers of
Implements an eventually consistent distributed store using conflict-free replicated data types for synchronized state.
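The OrbitDB and Iroh entries above rely on conflict-free replicated data types (CRDTs) for eventual consistency. As a minimal sketch of the idea, here is a grow-only counter, one of the simplest CRDTs. It is illustrative only and is not taken from any of the listed projects; the class name `GCounter` and the replica identifiers are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Grow-only counter: one monotonically increasing slot per replica."""
    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a grow-only counter cannot decrease")
        current = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = current + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of delivery order or duplicates.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    @property
    def value(self) -> int:
        return sum(self.counts.values())


a, b = GCounter("a"), GCounter("b")
a.increment(2)   # update applied on replica a
b.increment(3)   # concurrent update applied on replica b
a.merge(b)
b.merge(a)
b.merge(a)       # a repeated merge changes nothing
print(a.value, b.value)
```

Both replicas end with the same state without coordinating, which is the convergence property the entries describe.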
pgloader is a command-line tool that automates the migration of data and schema from various source databases and file formats into PostgreSQL. It combines schema discovery, parallel data pipelines, and type casting into a single, declarative workflow, using PostgreSQL's COPY protocol for high-throughput bulk loading. The tool distinguishes itself by compiling a dedicated command language into concurrent reader-writer pipelines that handle schema introspection, data transformation, and error-resilient batch processing. It supports migrating entire databases from MySQL, MS SQL, SQLite, and Pos
Migrates data into Citus distributed PostgreSQL clusters with automatic shard distribution.
LiteFS is a FUSE-based distributed file system designed to replicate SQLite databases across a cluster of machines. It acts as a high-availability layer that synchronizes data by intercepting write operations to ensure consistency across multiple server nodes. The system manages distributed database storage by mapping file operations to network requests through a user-space driver. This allows data to be synchronized across multiple regions and database content to be distributed to edge nodes, enabling local reads with globally synchronized writes. Replication uses write-ahead log (WAL) shipping and transaction-aware interception to stream committed changes from the primary node to the replicas. New replicas are bootstrapped through snapshot-based initialization before switching to incremental log replication.
Manages database storage across multiple networked nodes to provide scalability and fault tolerance.
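The LiteFS entry describes bootstrapping a new replica from a snapshot and then switching to incremental log replication. A minimal in-memory sketch of that two-phase pattern follows. It does not reflect LiteFS's actual file format or API; the `Primary` and `Replica` classes and their methods are hypothetical.

```python
from typing import Dict, List, Tuple

LogEntry = Tuple[int, str, str]  # (log position, key, value)


class Primary:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.log: List[LogEntry] = []

    def write(self, key: str, value: str) -> None:
        """Apply a committed change and append it to the replication log."""
        self.data[key] = value
        self.log.append((len(self.log) + 1, key, value))

    def snapshot(self) -> Tuple[int, Dict[str, str]]:
        """Return a copy of the full state and the log position it covers."""
        return len(self.log), dict(self.data)

    def entries_after(self, position: int) -> List[LogEntry]:
        """Return log entries committed after the given position."""
        return self.log[position:]


class Replica:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.position = 0

    def bootstrap(self, primary: Primary) -> None:
        # Phase 1: initialize from a full snapshot.
        self.position, self.data = primary.snapshot()

    def catch_up(self, primary: Primary) -> None:
        # Phase 2: apply only the changes made since the last known position.
        for position, key, value in primary.entries_after(self.position):
            self.data[key] = value
            self.position = position


primary = Primary()
primary.write("a", "1")
primary.write("b", "2")

replica = Replica()
replica.bootstrap(primary)   # copies state as of position 2
primary.write("a", "3")      # committed after the snapshot
replica.catch_up(primary)    # ships only the one new entry
```

Tracking a log position is what lets the replica avoid re-copying the whole database after the initial snapshot.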
Synapse is a Matrix homeserver implementation that provides infrastructure for decentralized, real-time communication and messaging. It functions as a federated chat server that synchronizes room data and event streams across independent server instances to enable cross-domain interoperability. The server uses a hybrid core that combines performance-critical logic in Rust with a Python orchestration layer. It uses a relational PostgreSQL database to persist user accounts and conversation history, and a Redis-based messaging system to distribute tasks across horizontal workers. The project covers a wide range of capabilities, including secure identity management with SAML and OpenID Connect integration, comprehensive administrative tools for content moderation and room management, and automated media processing. It also includes systems for decentralized federation, asynchronous database schema migration, and telemetry export for performance monitoring.
Splits the datastore across multiple physical database nodes to improve horizontal scalability and performance.
InternetArchitect is an educational collection of documents and source code designed as a high concurrency architecture course. It serves as a distributed systems implementation guide, providing technical patterns and practical examples for designing scalable internet architectures that maintain stability under heavy traffic loads. The project focuses on high-performance database optimization and microservices design patterns. It covers strategies for reducing latency and increasing throughput via database sharding and proxy layers, as well as coordinating global state across distributed clus
Implements systems that store and manage data across multiple networked nodes for scalability and fault tolerance.