# Distributed Horizontal Scaling SQL Databases

> Search results for `distributed SQL database that scales horizontally` on awesome-repositories.com. 114 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/distributed-sql-database-that-scales-horizontally

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/distributed-sql-database-that-scales-horizontally).**

## Results

- [cockroachdb/cockroach](https://awesome-repositories.com/repository/cockroachdb-cockroach.md) (32,207 ⭐) — Cockroach is a distributed SQL database designed to scale horizontally across multiple nodes while maintaining strict ACID compliance and global data consistency. It functions as a relational database engine that automatically partitions data into ranges, rebalancing them across a cluster to accommodate growing storage and throughput requirements. By utilizing a distributed consensus protocol, the system ensures that all nodes agree on the order of operations, providing fault tolerance and continuous availability even in the event of hardware failures.

The system distinguishes itself through
- [distribution/distribution](https://awesome-repositories.com/repository/distribution-distribution.md) (10,479 ⭐) — Distribution is an open-source container image registry that implements the OCI Distribution Specification, enabling any OCI-compatible client to push, pull, and manage container images over standard protocols. It serves as a content distribution toolkit for packaging, shipping, storing, and delivering container content across networked environments, storing and retrieving content by its cryptographic hash for integrity and deduplication.

The registry separates image metadata from bulk data to enable efficient validation and partial pulls, and supports resumable blob uploads with chunked tran
- [apache/shardingsphere](https://awesome-repositories.com/repository/apache-shardingsphere.md) (20,737 ⭐) — ShardingSphere is a distributed SQL database middleware that provides sharding, read-write splitting, and distributed transaction management for relational databases. It functions as a layer that intercepts SQL queries to distribute data across multiple physical database instances for horizontal scaling.

The project is distinguished by its ability to operate as either a standalone transparent database proxy or via direct integration as a JDBC driver. It features a SQL dialect translator that parses queries into abstract syntax trees to convert syntax between different database engines, enabli
- [cube-js/cube](https://awesome-repositories.com/repository/cube-js-cube.md) (20,251 ⭐) — Cube is a semantic data layer that provides a unified framework for defining business metrics, dimensions, and relationships across diverse data sources. By acting as a headless business intelligence engine, it transforms raw data into a governed model that can be queried via SQL, REST, and GraphQL interfaces. This architecture ensures consistent data definitions and logic across all downstream analytical applications and reporting tools.

The platform distinguishes itself through its integrated conversational AI capabilities, which allow users to explore data using natural language. It orches
- [donnemartin/system-design-primer](https://awesome-repositories.com/repository/donnemartin-system-design-primer.md) (353,387 ⭐) — This project is a comprehensive educational resource and study guide focused on distributed systems architecture and backend infrastructure design. It provides a structured curriculum for mastering the principles of scalability, reliability, and performance required to design complex software systems.

The repository distinguishes itself by offering a methodical approach to technical interview preparation, incorporating design patterns, architectural trade-offs, and spaced repetition tools to help users retain complex concepts. It emphasizes constraint-driven analysis, teaching users how to ev
- [dragonflydb/dragonfly](https://awesome-repositories.com/repository/dragonflydb-dragonfly.md) (30,688 ⭐) — Dragonfly is a high-performance, multi-model in-memory data store designed to serve as a drop-in replacement for existing database infrastructures. By utilizing a multi-threaded, shared-nothing architecture and a fiber-based concurrency model, it maximizes CPU utilization and minimizes latency for read and write operations. The system supports a wide range of data structures, including strings, hashes, lists, sets, sorted sets, and JSON documents, while maintaining full compatibility with standard industry wire protocols and client libraries.

What distinguishes Dragonfly is its focus on effic
- [dyazincahya/kbbi-sql-database](https://awesome-repositories.com/repository/dyazincahya-kbbi-sql-database.md) (45 ⭐) — Kamus Besar Bahasa Indonesia (KBBI) SQL Database, total data ``115.978`` kata.
- [starrocks/starrocks](https://awesome-repositories.com/repository/starrocks-starrocks.md) (11,789 ⭐) — StarRocks is a distributed SQL OLAP database engine designed for real-time analytics and high-performance multi-dimensional analysis. It functions as a data lakehouse query engine that enables SQL execution across large datasets and external open table formats without requiring local data imports.

The system employs a shared-nothing distributed architecture and utilizes the MySQL protocol to integrate with business intelligence tools. It maintains real-time data consistency through a primary key upsert model and accelerates query response times using vectorized execution and cost-based optimi
- [yugabyte/yugabyte-db](https://awesome-repositories.com/repository/yugabyte-yugabyte-db.md) (10,349 ⭐) — YugabyteDB is a distributed SQL database and relational data store designed for horizontal scalability and high availability across multiple nodes or regions. It functions as a cloud-native system that ensures continuous availability and supports PostgreSQL compatible query languages and drivers.

The system includes specialized capabilities as a vector database for AI, utilizing high-dimensional indexing to perform similarity searches. It is engineered as a multi-region cloud database that synchronizes data across different geographic locations to maintain global availability.

The project co
- [fastapi/fastapi](https://awesome-repositories.com/repository/fastapi-fastapi.md) (99,260 ⭐) — FastAPI is a web framework for building APIs with Python. It leverages standard language type hints to provide automatic data validation, request parsing, and interactive API documentation generation. The framework supports asynchronous request handling and manages execution contexts to prevent blocking the main event loop.

The project includes a dependency injection system that allows for the resolution and injection of reusable components into request handlers. This system supports request-scoped caching, lifecycle management, and integration with security mechanisms like OAuth2 and JSON We
- [embarcadero/dev-cpp](https://awesome-repositories.com/repository/embarcadero-dev-cpp.md) (2,882 ⭐) — Dev-Cpp is a comprehensive development suite that serves as a C++ integrated development environment, a cross-platform application builder, and a visual UI designer. It provides a toolchain for writing, compiling, and debugging native C++ applications on Windows, while offering a framework to create native binaries for desktop, mobile, and IoT devices from a single codebase.

The project distinguishes itself by integrating an embedded SQL database engine and a REST API development platform directly into the workflow. It includes an AI-assisted coding tool that leverages large language models t
- [tursodatabase/turso](https://awesome-repositories.com/repository/tursodatabase-turso.md) (21,655 ⭐) — Turso is a distributed SQL database platform that provides managed, edge-hosted SQLite instances. It functions as a serverless database provider, enabling the deployment of relational databases that synchronize data across multiple geographic regions to support high availability and performance.

The platform distinguishes itself by utilizing a fork of SQLite as its core storage engine, which supports both local file storage and remote network-based replication. It employs an edge-optimized proxy to route queries through a global network, minimizing latency by connecting users to the nearest d
- [sql-js/sql.js](https://awesome-repositories.com/repository/sql-js-sql-js.md) (13,632 ⭐) — sql.js is a javascript SQL database. It allows you to create a relational database and query it entirely in the browser. You can try it in this online demo. It uses a virtual database file stored in memory, and thus doesn't persist the changes made to the database. However, it allows you to…
- [windofshadow/that](https://awesome-repositories.com/repository/windofshadow-that.md) (121 ⭐) — This repository contains the Pytorch implementation of the THAT methods in the following paper:
- [oceanbase/oceanbase](https://awesome-repositories.com/repository/oceanbase-oceanbase.md) (9,980 ⭐) — OceanBase is a distributed SQL database designed for high availability and strong consistency across multiple nodes and regions. It functions as a hybrid transactional and analytical processing engine, allowing real-time analytics and transactions to execute on a single data copy. The system also serves as a vector database engine for indexing and querying vector data to power semantic search and recommendation systems.

The platform features native compatibility layers for MySQL and Oracle, enabling the migration of legacy workloads without rewriting SQL code. It utilizes a Paxos-based distri
- [kubernetes-sigs/metrics-server](https://awesome-repositories.com/repository/kubernetes-sigs-metrics-server.md) (6,651 ⭐) — Metrics Server is a lightweight, single-purpose daemon that collects CPU and memory usage data from every node and pod in a Kubernetes cluster and exposes those metrics through a standard Kubernetes API endpoint. It registers as an aggregated extension API server behind the Kubernetes apiserver, making resource utilization data available to the Horizontal Pod Autoscaler and Vertical Pod Autoscaler for automatic replica count and resource request adjustments.

The project distinguishes itself by operating as a focused, in-cluster resource metrics collector that polls kubelet summary endpoints a
- [auraphp/aura.sql](https://awesome-repositories.com/repository/auraphp-aura-sql.md) (564 ⭐) — SQL database access through PDO.
- [grafana/xk6-sql](https://awesome-repositories.com/repository/grafana-xk6-sql.md) (190 ⭐) — Use SQL databases from k6 tests.
- [boto/boto3](https://awesome-repositories.com/repository/boto-boto3.md) (9,834 ⭐) — Boto3 is the AWS SDK for Python, providing a programmatic interface for managing and automating AWS cloud infrastructure and services. It serves as a cloud management API client and resource manager for provisioning, configuring, and scaling virtual servers, databases, and storage.

The library enables the implementation of infrastructure-as-code through declarative templates and scripts, allowing for the deployment of identical resource stacks across multiple accounts and geographic regions. It also provides a framework for coordinating distributed workflows, serverless functions, and contain
- [sciruby/distribution](https://awesome-repositories.com/repository/sciruby-distribution.md) (51 ⭐) — Probability distributions for Ruby.
- [apache/superset](https://awesome-repositories.com/repository/apache-superset.md) (73,451 ⭐) — Superset is a web-based business intelligence platform designed for data exploration, visualization, and interactive dashboarding. It functions as a query-driven analytics engine that connects to various SQL databases, allowing users to perform ad-hoc analysis, define virtual metrics, and build complex data visualizations through a centralized interface.

The platform distinguishes itself through a robust semantic layer that transforms raw database schemas into calculated columns and virtual metrics, enabling consistent business logic across an organization. It features a plugin-based visualiz
- [dask/distributed](https://awesome-repositories.com/repository/dask-distributed.md) (1,671 ⭐) — A distributed task scheduler for Dask
- [avelino/awesome-go](https://awesome-repositories.com/repository/avelino-awesome-go.md) (175,576 ⭐) — This project serves as a comprehensive language ecosystem index, functioning as a centralized, community-curated directory for the Go programming language. It organizes a vast landscape of software components, libraries, and development tools into a structured, navigable hierarchy, enabling developers to efficiently discover resources tailored to specific functional domains.

The repository distinguishes itself through a decentralized contribution model, where community-driven updates ensure the index remains current with the rapidly evolving software landscape. Beyond simple resource listing,
- [golang/go](https://awesome-repositories.com/repository/golang-go.md) (134,756 ⭐) — Go is a statically typed, compiled programming language designed for building scalable, concurrent software. It provides a memory-safe execution environment that combines a high-performance runtime with a self-hosting compiler toolchain, enabling the creation of statically linked machine code binaries without external dependencies. The language is built around a structural type system that uses interfaces for polymorphism and a concurrency model based on lightweight, stack-based coroutines that communicate through channels.

The language distinguishes itself through a runtime that features a c
- [openbeer/schema.sql](https://awesome-repositories.com/repository/openbeer-schema-sql.md) (9 ⭐) — beer.db database schema (tables, views, etc.) scripts (in SQL) and documentation
- [pingcap/tidb](https://awesome-repositories.com/repository/pingcap-tidb.md) (40,166 ⭐) — TiDB is a horizontally scalable, distributed SQL database designed to provide consistent transactional storage and high-performance analytical processing within a single unified architecture. It utilizes a decoupled compute-storage design and a distributed key-value storage layer to ensure horizontal scalability and efficient range-based queries. By employing a consensus-based replication algorithm, the system maintains high availability and automatic failover across multiple nodes and geographical regions.

The platform distinguishes itself through its hybrid transactional and analytical proc
- [kubernetes/kops](https://awesome-repositories.com/repository/kubernetes-kops.md) (16,631 ⭐) — kops is a Kubernetes cluster provisioner and lifecycle manager designed to automate the creation, maintenance, and destruction of production-grade clusters on cloud infrastructure. It functions as a declarative infrastructure manager, synchronizing the live state of a cluster with versioned manifests stored in remote object storage to ensure idempotent operations.

The project distinguishes itself by offering comprehensive automation for the entire cluster lifecycle, including high-availability control plane deployment, incremental rolling updates, and automated version upgrades. It also serve
- [erikgrinaker/toydb](https://awesome-repositories.com/repository/erikgrinaker-toydb.md) (7,251 ⭐) — ToyDB is a distributed SQL database that provides a system for storing and querying data across multiple nodes. It focuses on maintaining strong consistency and fault tolerance through the implementation of a distributed consensus algorithm.

The project distinguishes itself by supporting historical data versioning, enabling time-travel queries to retrieve the state of the database from a specific point in the past. It utilizes multi-version concurrency control to manage ACID transactions and ensure data integrity during concurrent operations.

The system covers relational data modeling with t
- [openfootball/schema.sql](https://awesome-repositories.com/repository/openfootball-schema-sql.md) (50 ⭐) — football.db database schema (tables, views, etc.) scripts (in SQL) and documentation
- [flutter/flutter](https://awesome-repositories.com/repository/flutter-flutter.md) (177,056 ⭐) — This project is a multi-platform UI framework designed for building applications that target mobile, web, and desktop environments from a single codebase. It utilizes a declarative paradigm where the user interface is defined as a function of application state, supported by a layered architecture that includes a high-performance rendering engine and a multi-platform compilation model.

The framework provides a comprehensive suite of developer tools, including hot reloading for real-time code injection and diagnostic utilities for monitoring application state and performance. It features a modu
- [tursodatabase/libsql](https://awesome-repositories.com/repository/tursodatabase-libsql.md) (16,887 ⭐) — LibSQL is a high-performance, distributed SQL database engine that extends SQLite to support remote network access, edge computing, and real-time synchronization. It functions as an embedded database library that integrates directly into application processes while providing the infrastructure to maintain consistency across multiple geographic regions.

The platform distinguishes itself by enabling database interaction over standard HTTP protocols, allowing applications to query remote data sources in serverless and edge environments without requiring local filesystem access. It includes nativ
- [encode/databases](https://awesome-repositories.com/repository/encode-databases.md) (4,002 ⭐) — Async database support for Python. 🗄
- [aws/aws-cdk](https://awesome-repositories.com/repository/aws-aws-cdk.md) (12,817 ⭐) — The AWS Cloud Development Kit is an infrastructure-as-code framework that enables developers to define and provision cloud resources using familiar programming languages. By utilizing construct-based synthesis, it translates high-level, object-oriented code into declarative templates, allowing for the automated management of complex cloud environments through a centralized, code-driven control plane.

The framework distinguishes itself through its ability to model infrastructure as a dependency-aware resource graph, ensuring that components are provisioned and updated in the correct order. It
- [flutter-team-archive/plugins](https://awesome-repositories.com/repository/flutter-team-archive-plugins.md) (17,710 ⭐) — This project is a collection of official plugin packages and a native integration library designed to provide a consistent interface for accessing hardware and software functionality across different mobile and desktop platforms. It serves as a native platform bridge, enabling cross-platform applications to invoke native code and manage operating system dependencies.

The project utilizes a federated plugin architecture, splitting plugins into common interfaces and separate platform implementations to allow for independent development and extension. It further supports native integration throu
- [susom/database](https://awesome-repositories.com/repository/susom-database.md) (43 ⭐) — The point of this project is to provide a simplified way of accessing databases. It is a wrapper around the JDBC driver, and tries to hide some of the more error-prone, unsafe, and non-portable parts of the standard API. It uses standard Java types for all operations (as opposed to java.sql.*),…
- [docker/distribution](https://awesome-repositories.com/repository/docker-distribution.md) (10,474 ⭐) — This project is a container image registry and server-side storage system designed to house container images, layers, and manifests. It functions as an OCI compliant registry server that adheres to the Open Container Initiative Distribution Specification to store and deliver content over HTTP.

The system provides a self-hosted solution for managing private libraries of container images within professional-grade infrastructure. It is designed to enable the development of custom registries by extending a base toolkit with specialized libraries and business logic.

The registry covers image dist
- [duckdb/duckdb](https://awesome-repositories.com/repository/duckdb-duckdb.md) (38,805 ⭐) — DuckDB is an in-process analytical database engine designed to run directly within an application process. As a zero-dependency, embedded system, it provides enterprise-grade SQL data processing capabilities without the overhead of managing a dedicated database server. It is built to handle complex analytical and aggregation tasks by storing and retrieving information in columns, allowing for high-performance relational data manipulation.

The engine distinguishes itself through a columnar vectorized execution model that maximizes CPU cache efficiency during query operations. It employs adapti
- [dolthub/dolt](https://awesome-repositories.com/repository/dolthub-dolt.md) (23,592 ⭐) — Dolt is a relational database engine that integrates version control directly into the database management layer. It functions as a version-controlled SQL database that tracks every row and schema change using a commit-based history, allowing users to branch, merge, and audit data modifications. By implementing a wire-protocol-compatible server, the system enables standard SQL clients and tools to interact with versioned data as if they were connecting to a traditional relational database.

The platform distinguishes itself by applying repository-style workflows to data management, including s
- [nodesource/distributions](https://awesome-repositories.com/repository/nodesource-distributions.md) (13,834 ⭐) — This project is a Node.js binary distribution repository and Linux package repository. It provides a hosted set of pre-compiled JavaScript runtime binaries for various Linux distributions to simplify installation and version management through native package managers.

The project includes a Node.js observability toolset and security policy manager. These components enable the gathering of runtime telemetry to monitor application health and performance via diagnostic dashboards, while providing a resource restriction layer that intercepts system calls to prevent unauthorized modules from acces
- [alibaba/alisql](https://awesome-repositories.com/repository/alibaba-alisql.md) (5,706 ⭐) — AliSQL is a fork of MySQL by Alibaba that extends the relational database management system with enhancements for high performance, scalability, and enterprise-grade availability. It retains the core MySQL identity as a SQL-based database for storing, organizing, and retrieving structured data, while adding optimizations for large-scale transactional and analytical workloads.

The project differentiates itself through a set of Alibaba-specific improvements, including a columnar engine for accelerating analytical queries directly on MySQL tables, and a distributed, shared-nothing NDB Cluster en
- [bytebytegohq/system-design-101](https://awesome-repositories.com/repository/bytebytegohq-system-design-101.md) (83,491 ⭐) — This project is a centralized engineering knowledge repository that provides a structured curriculum for mastering system design, architectural patterns, and fundamental software development workflows. It serves as a professional development resource for engineers, offering foundational knowledge and real-world case studies to support the design of scalable, secure, and efficient distributed systems.

The repository distinguishes itself through a visual-first approach to knowledge synthesis, distilling complex technical concepts into high-density graphical diagrams and succinct illustrations.
- [illuminate/database](https://awesome-repositories.com/repository/illuminate-database.md) (2,766 ⭐) — [READ ONLY] Subtree split of the Illuminate Database component (see laravel/framework)
- [ibax-io/go-ibax](https://awesome-repositories.com/repository/ibax-io-go-ibax.md) (7,858 ⭐) — go-ibax is a blockchain protocol platform and decentralized application infrastructure used to deploy networks with custom governance and token economics. It provides a foundation for building decentralized applications through a framework that integrates identity management and on-chain data storage.

The project features a multilingual virtual machine capable of executing smart contracts written in Go, Rust, and Solidity. It implements a sharded blockchain network to increase throughput and a privacy layer utilizing zero-knowledge proofs and homomorphic encryption to anonymize transaction da
- [inverse-scaling/prize](https://awesome-repositories.com/repository/inverse-scaling-prize.md) (621 ⭐) — A prize for finding tasks that cause large language models to show inverse scaling
- [kamranahmedse/developer-roadmap](https://awesome-repositories.com/repository/kamranahmedse-developer-roadmap.md) (357,434 ⭐) — Developer Roadmap is a community-driven platform that provides structured, graph-based learning paths for software engineering. It serves as a comprehensive knowledge repository where technical domains are organized into visual sequences to guide professional skill acquisition and career growth.

The project distinguishes itself through a collaborative ecosystem that enables users to contribute roadmaps, curate industry best practices, and maintain professional profiles. It integrates diagnostic assessment frameworks to evaluate technical proficiency, helping developers identify knowledge gaps
- [alibaba/datax](https://awesome-repositories.com/repository/alibaba-datax.md) (17,241 ⭐) — DataX is a distributed data integration framework and plugin-based ETL tool designed for synchronizing large datasets between heterogeneous sources and destinations. It functions as a JDBC data migration engine and offline synchronization tool, enabling the movement of data between relational databases, NoSQL stores, and object storage.

The system utilizes a plugin-based connector architecture that decouples reader and writer logic, allowing it to map and transform data types across different storage engines using a standardized internal representation. This design supports heterogeneous data
- [flowiseai/flowise](https://awesome-repositories.com/repository/flowiseai-flowise.md) (53,641 ⭐) — Flowise is a low-code platform designed for building and deploying complex language model workflows through a visual, node-based interface. It functions as an orchestrator for autonomous multi-agent systems, allowing users to construct conversational pipelines by connecting language models, memory stores, and external tools on a drag-and-drop canvas.

The platform distinguishes itself through its support for sophisticated agentic patterns, including supervisor-worker delegation and iterative reasoning strategies. Users can design directed acyclic graphs to manage conditional branching, state p
- [mouredev/hello-sql](https://awesome-repositories.com/repository/mouredev-hello-sql.md) (8,826 ⭐) — hello-sql is a collection of educational resources and practical guides designed for mastering relational database design, SQL query writing, and schema mapping. It provides a set of lessons and exercises for practicing the creation and manipulation of data within relational databases.

The project includes a database schema workbook for designing tables and mapping relationships, alongside a dedicated SQL query guide for writing selection, filtering, and aggregation statements. These resources are delivered through a relational database tutorial and a broader SQL learning resource.

The mater
- [citusdata/citus](https://awesome-repositories.com/repository/citusdata-citus.md) (12,562 ⭐) — Citus is a PostgreSQL extension that transforms a standard database into a distributed system. It functions as a sharding framework and distributed SQL engine, enabling horizontal scaling by partitioning tables across a cluster of nodes. By utilizing a coordinator-worker topology, the system manages metadata and routes queries to the appropriate nodes, allowing for parallel execution of complex operations across distributed data shards.

The platform distinguishes itself through its specialized support for multi-tenant architectures and real-time analytical processing. It enables tenant-based
- [home-assistant/core](https://awesome-repositories.com/repository/home-assistant-core.md) (87,753 ⭐) — Home Assistant is a centralized home automation platform designed to orchestrate diverse internet-connected devices and services. It functions as a local-first control system that normalizes heterogeneous hardware protocols into a unified set of entities, attributes, and services. The core architecture relies on an event-driven state bus and a modular integration model, allowing the system to manage state changes and communicate across decoupled components through standardized interfaces.

The platform distinguishes itself through a highly flexible, declarative configuration framework that all
