17 repositories
Configurations for running database systems across multiple nodes for high availability and scale.
Distinguishing note: Focuses on the operational deployment mode for distributed systems.
Explore 17 GitHub repositories matching DevOps & Infrastructure · Distributed Database Clusters.
Milvus is a specialized vector database engine designed for the indexing, management, and high-speed similarity retrieval of high-dimensional vector embeddings. It functions as a similarity search engine capable of identifying nearest neighbors within large-scale vector spaces, supporting the storage and retrieval of billions of data points while maintaining consistent performance. The system utilizes a distributed architecture that decouples storage, query, and coordination into independent services, allowing for horizontal scaling across clusters. It employs a global indexing mechanism that
Supports distributed architecture to handle horizontal scaling across clusters for large-scale production needs.
This project is a feature-rich Go client library designed for interacting with Redis. It serves as a comprehensive interface for managing remote data stores, enabling developers to execute standard database commands, handle complex data structures, and perform asynchronous operations within Go applications. The library distinguishes itself through its support for advanced Redis capabilities, including connection pooling, pipelining, and transactional integrity. It provides specialized primitives for managing distributed clusters, including automated topology updates and request routing to sha
Manages data across distributed cluster deployments to support horizontal scaling and automated request routing.
Dgraph is a distributed graph database designed to store and query highly connected data. It organizes information as nodes and edges to represent complex relationships between entities, providing a platform for managing and analyzing deeply linked datasets. The system functions as a horizontally scalable cluster that partitions data across multiple nodes to maintain performance and availability as information volume increases. It utilizes a specialized query language built for low-latency navigation of interconnected data points, allowing for the execution of complex queries across large-sca
Operates as a distributed storage platform that maintains performance and availability through cluster-based partitioning.
TigerBeetle is a distributed financial accounting database designed for high-volume transaction processing. It functions as a specialized transaction engine that enforces strict double-entry bookkeeping invariants, ensuring that every debit and credit is balanced and accounted for with absolute consistency. By utilizing a consensus-based replication model, the system provides high availability and data durability across geographically distributed clusters, making it suitable for mission-critical financial infrastructure. The system distinguishes itself through a performance-oriented architect
Deploys cluster nodes across multiple sites to maintain transaction processing capabilities during site failures.
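The double-entry invariant mentioned in the TigerBeetle description can be sketched in a few lines. This is an illustrative toy ledger, not TigerBeetle's API: the `Ledger` class, its methods, and the field names are hypothetical, and replication and durability are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Account:
    # Hypothetical running totals, in integer minor units (e.g. cents).
    debits_posted: int = 0
    credits_posted: int = 0


class Ledger:
    """Toy in-memory ledger; every transfer touches exactly two accounts."""

    def __init__(self) -> None:
        self.accounts: dict[str, Account] = {}

    def open_account(self, account_id: str) -> None:
        self.accounts[account_id] = Account()

    def transfer(self, debit_id: str, credit_id: str, amount: int) -> None:
        # Validate everything before mutating, so a rejected transfer
        # never leaves one side of the entry applied without the other.
        if amount <= 0:
            raise ValueError("amount must be positive")
        if debit_id == credit_id:
            raise ValueError("debit and credit accounts must differ")
        debit = self.accounts[debit_id]    # KeyError if unknown
        credit = self.accounts[credit_id]  # KeyError if unknown
        debit.debits_posted += amount
        credit.credits_posted += amount

    def is_balanced(self) -> bool:
        # Double-entry invariant: total debits equal total credits.
        total_debits = sum(a.debits_posted for a in self.accounts.values())
        total_credits = sum(a.credits_posted for a in self.accounts.values())
        return total_debits == total_credits
```

Because each transfer adds the same amount to one debit total and one credit total, the ledger stays balanced after any sequence of accepted transfers.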
Codis is a distributed proxy system designed for scaling Redis clusters. It provides a sharding proxy that distributes data across multiple instances and a cluster manager to oversee the environment. The system enables horizontal scaling through dynamic resharding, which allows data slots to be migrated between servers without interrupting operations. It supports multi-key atomic operations using hash tags to ensure related keys are routed to the same server. The platform includes a graphical cluster management dashboard for monitoring and administration. It implements high availability prox
Distributes requests across a scalable group of instances to enable horizontal growth and high performance.
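The hash-tag routing mentioned in the Codis description can be illustrated with a small sketch. It assumes a CRC32-modulo mapping onto 1024 slots and `{...}` as the tag delimiters; both are assumptions made for illustration, and the function names are hypothetical rather than part of Codis.

```python
import zlib

NUM_SLOTS = 1024  # assumption: slot count used for this illustration


def hash_tag(key: str) -> str:
    """Return the text inside the first non-empty {...} pair, else the whole key."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        if end > start + 1:  # ignore an empty "{}" tag
            return key[start + 1:end]
    return key


def slot_for(key: str) -> int:
    """Map a key to a slot by hashing only its tag (illustrative mapping)."""
    return zlib.crc32(hash_tag(key).encode("utf-8")) % NUM_SLOTS
```

Keys such as `{user:42}:profile` and `{user:42}:sessions` share the tag `user:42`, so they hash to the same slot and therefore land on the same server, which is what makes a multi-key operation on them possible.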
The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for the automated management of complex cloud environments through a centralized, code-driven control plane. The framework distinguishes itself through its ability to model infrastructure as a dependency-aware resource graph, ensuring that components are provisioned and updated in the correct order. It
Enables the creation of multi-Region database clusters to support low-latency access and disaster recovery.
YugabyteDB is a distributed SQL database and relational data store designed for horizontal scalability and high availability across multiple nodes or regions. It functions as a cloud-native system that ensures continuous availability and supports PostgreSQL compatible query languages and drivers. The system includes specialized capabilities as a vector database for AI, utilizing high-dimensional indexing to perform similarity searches. It is engineered as a multi-region cloud database that synchronizes data across different geographic locations to maintain global availability. The project co
Employs multi-region deployment strategies to synchronize data across geographic locations for global availability.
Patroni is a high availability manager and cluster orchestrator for PostgreSQL. It functions as an automatic failover controller and replication manager that ensures continuous database availability by automating leader election and promoting standby nodes during failures. The system maintains a consistent cluster state by acting as a distributed consensus coordinator. It synchronizes configuration and manages leader elections through integration with distributed configuration stores such as etcd, ZooKeeper, or Consul. Its broader capabilities include managing both synchronous and asynchrono
Manages the operational deployment and coordination of distributed database clusters across multiple nodes.
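The leader election described in the Patroni entry relies on an external configuration store. A minimal sketch of the general idea, a leader key with a time-to-live that only the current holder may renew, is shown below. The `FakeStore` class is a hypothetical in-memory stand-in, not a real etcd, ZooKeeper, or Consul client, and it does not reflect Patroni's actual implementation.

```python
from typing import Optional, Tuple


class FakeStore:
    """In-memory stand-in for a distributed configuration store (hypothetical)."""

    def __init__(self) -> None:
        # (node_name, expires_at) of the current lease, or None if never held.
        self._lease: Optional[Tuple[str, float]] = None

    def acquire_or_renew(self, node: str, now: float, ttl: float) -> bool:
        """Take the leader lease if it is free, expired, or already ours."""
        lease = self._lease
        if lease is None or lease[1] <= now or lease[0] == node:
            self._lease = (node, now + ttl)
            return True
        return False

    def leader(self, now: float) -> Optional[str]:
        """Return the current leader, or None if the lease has expired."""
        lease = self._lease
        if lease is not None and lease[1] > now:
            return lease[0]
        return None
```

In this sketch, a standby that keeps calling `acquire_or_renew` fails while the leader renews on time and succeeds once the lease expires, which models promotion after a leader failure. A real store would perform the check and the write as one atomic compare-and-set.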
pgloader is a command-line tool that automates the migration of data and schema from various source databases and file formats into PostgreSQL. It combines schema discovery, parallel data pipelines, and type casting into a single, declarative workflow, using PostgreSQL's COPY protocol for high-throughput bulk loading. The tool distinguishes itself by compiling a dedicated command language into concurrent reader-writer pipelines that handle schema introspection, data transformation, and error-resilient batch processing. It supports migrating entire databases from MySQL, MS SQL, SQLite, and Pos
Migrates data into Citus distributed PostgreSQL clusters with automatic shard distribution.
PikiwiDB is a distributed NoSQL database and disk-based key-value store that serves as a Redis-compatible protocol server. It is designed to handle datasets larger than available system memory by utilizing a persistence engine that stores the full dataset on disk. The system employs a tiered storage model, caching frequently accessed hot data in memory while maintaining the primary volume on disk. It ensures high availability through a replicated data store architecture, using asynchronous binary logs to synchronize data between primary and secondary nodes. The project supports distributed d
Expands storage capacity across multiple nodes and clusters to handle massive volumes of enterprise data.
FATE is an open-source federated learning platform that enables multiple organizations to collaboratively train machine learning models without exposing raw data to any party. It provides a complete framework for private data collaboration, allowing participants to jointly compute on sensitive information while maintaining data privacy and security guarantees through secure multi-party computation protocols. The platform distinguishes itself through its comprehensive infrastructure management capabilities, supporting automated deployment of multi-party clusters using Ansible-driven provisioni
Sets up a distributed cluster of multiple parties using automation tools for collaborative model training.
AliSQL is a fork of MySQL by Alibaba that extends the relational database management system with enhancements for high performance, scalability, and enterprise-grade availability. It retains the core MySQL identity as a SQL-based database for storing, organizing, and retrieving structured data, while adding optimizations for large-scale transactional and analytical workloads. The project differentiates itself through a set of Alibaba-specific improvements, including a columnar engine for accelerating analytical queries directly on MySQL tables, and a distributed, shared-nothing NDB Cluster en
Manages deployments through dedicated agent and client software for high availability and redundancy.
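KubeOperator is a comprehensive Kubernetes cluster management platform, infrastructure orchestrator, and multi-cluster manager. As an enterprise-grade Kubernetes distribution, it automates the deployment, scaling, and lifecycle management of production clusters across different cloud platforms and physical machines. The platform is known for its specialized support for air-gapped environments, including an offline installation engine that generates software archives and manages private registries for secure deployments without internet access. It also provides a centralized dashboard for fleet operations, allowing external clusters to be imported and infrastructure to be orchestrated across geographic regions and availability zones. The system covers a broad operational surface, including automated virtual machine provisioning, hardware inventory tracking, and declarative lifecycle management for patching and scaling. It integrates backup and restore services, role-based access control with LDAP synchronization, and comprehensive monitoring of cluster health and performance metrics. Administrative tasks and cluster operations are performed through a web-based interface.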
Automates the deployment and lifecycle of production clusters across diverse cloud platforms and physical machines.
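Autobase is a self-hosted database-as-a-service (DBaaS) platform designed to automate the deployment, scaling, and management of highly available PostgreSQL clusters. It acts as a cluster orchestrator that handles replication, failover, and version upgrades across multiple servers. The platform stands out for its GitOps-driven approach, using version control and CI/CD pipelines as the single source of truth to automate database configuration and deployment. It provides a web-based management interface and a command-line tool for configuring and monitoring clusters. The system covers a broad range of operational capabilities, including infrastructure provisioning across cloud providers and physical servers, an automated backup and restore engine for state recovery, and traffic management through distributed load balancing and DNS-based service discovery. It also includes tools for scaling cluster capacity and managing extensions.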
Supports automated deployment of production-ready database clusters across a mix of cloud platforms and bare-metal machines.
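Autobase is a self-hosted PostgreSQL database management platform and orchestrator designed to provide database-as-a-service (DBaaS) functionality. It automates the entire lifecycle of PostgreSQL clusters, from initial provisioning and deployment through ongoing management and decommissioning. The system stands out by treating database infrastructure as code, allowing clusters to be deployed and updated through version control and continuous integration pipelines. It provides both a centralized web console for visual management and a programmatic interface for automated infrastructure orchestration. The platform covers high availability by using load balancers and replicas for traffic distribution and resource scaling. Its operational features include automated point-in-time backups, state recovery, extension module installation, and major and minor version upgrades across physical servers, virtual machines, and public cloud providers.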
Enables the deployment of database clusters across a diverse mix of public cloud providers, virtual machines, and physical servers.
Helix DB is a distributed graph database and knowledge graph platform that persists nodes and edges on object storage for durable and unlimited scaling. It operates as an ACID-compliant system, ensuring data consistency through serializable snapshot isolation during concurrent operations. The project distinguishes itself by combining a vector search engine and a property graph, utilizing hybrid vector and full-text search to locate entry points for graph traversals. It enables dynamic graph querying through a domain-specific language, allowing complex logic and recursive queries to be execute
Operates as a high-availability system with auto-scaling reader nodes and gateways to prevent single points of failure.
Octelium is a zero-trust network access platform and identity-aware proxy designed to secure private HTTP, SSH, and SQL resources. It functions as a secure gateway that validates human and workload identities using OIDC, SAML, and FIDO2 passkeys before granting access to internal applications and SaaS APIs. The system is distinguished by its secretless access broker, which injects credentials—such as API keys, passwords, and AWS Sigv4 signatures—at the gateway level so users can access databases and cloud resources without managing secrets. It further specializes in AI gateway administration,
Extends service availability across multiple clusters and cloud regions with unified policies.