47 Repos
Systems for storing frequently accessed data across multiple nodes to improve retrieval speed.
Distinguishing note: Focuses on backend performance optimization via distributed storage.
Explore 47 GitHub repositories matching data & databases · Distributed Caching.
Backstage is an open-source framework for building internal developer portals. It provides a centralized, metadata-driven software catalog that tracks ownership, dependencies, and lifecycle status for all technical assets by harvesting configuration files directly from version control systems. The platform is built on a plugin-based modular architecture, allowing teams to extend core functionality through isolated, independently deployable modules that integrate into a unified frontend and backend ecosystem. The project distinguishes itself through its focus on developer productivity and stan
Connects backend services to distributed cache providers to optimize system performance.
This repository serves as a comprehensive collection of practical demonstrations and tutorials for building enterprise-ready Java applications using the Spring Boot framework. It provides structured guidance on core development topics, including the implementation of inversion-of-control containers, auto-configuration mechanisms, and convention-over-configuration patterns to simplify the assembly of complex systems. The project distinguishes itself by offering implementation patterns for diverse architectural requirements, such as microservices development, reactive programming models for hig
Optimizes system performance by implementing distributed caching to reduce database load.
This project is a build orchestration engine and development toolkit designed for managing large-scale monorepos. It provides a unified workspace environment that maps project relationships and dependencies, enabling the system to perform intelligent impact analysis and execute only the tasks affected by specific code changes. The system distinguishes itself through a persistent daemon that monitors file changes for near-instant feedback and a content-addressable caching mechanism that stores task outputs to prevent redundant computation across local and remote environments. It further suppor
Shares build and test results across team members and CI environments to eliminate redundant computation and accelerate development cycles.
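The content-addressable caching described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the names `task_cache_key` and `TaskOutputCache` are hypothetical, and an in-memory dictionary stands in for whatever local or remote store a real tool would use. The idea is that the cache key is derived from the task command plus the content of its inputs, so unchanged inputs always map to the same stored output.

```python
import hashlib
from typing import Callable, Dict


def task_cache_key(command: str, input_files: Dict[str, bytes]) -> str:
    """Derive a cache key from the command and the content of its inputs (hypothetical helper)."""
    digest = hashlib.sha256()
    digest.update(command.encode("utf-8"))
    # Sort by path so the key does not depend on dictionary ordering.
    for path in sorted(input_files):
        digest.update(b"\0" + path.encode("utf-8") + b"\0")
        digest.update(hashlib.sha256(input_files[path]).digest())
    return digest.hexdigest()


class TaskOutputCache:
    """In-memory stand-in for a shared cache store (an assumption for illustration)."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}
        self.executions = 0  # counts how often a task actually ran

    def run(self, command: str, input_files: Dict[str, bytes],
            execute: Callable[[], bytes]) -> bytes:
        key = task_cache_key(command, input_files)
        if key in self._store:
            # Cache hit: reuse the stored output instead of recomputing.
            return self._store[key]
        self.executions += 1
        output = execute()
        self._store[key] = output
        return output
```

If the same `TaskOutputCache` were backed by a store that teammates and CI share, a task that one machine has already run would be a cache hit everywhere else, provided the inputs are byte-for-byte identical.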
Redisson is a Java library and Redis client that functions as a distributed Java object mapper, caching provider, and locking framework. It maps Java collections and concurrency primitives to distributed implementations backed by Redis and Valkey, providing synchronous, asynchronous, and reactive APIs for interacting with these data stores. The project distinguishes itself by providing a comprehensive suite of distributed coordination tools, including a locking framework for managing semaphores and countdown latches across multiple application nodes. It also serves as a distributed messaging
Integrates Java applications with Redis or Valkey to improve response times through distributed caching.
This project is a feature-rich Go client library designed for interacting with Redis. It serves as a comprehensive interface for managing remote data stores, enabling developers to execute standard database commands, handle complex data structures, and perform asynchronous operations within Go applications. The library distinguishes itself through its support for advanced Redis capabilities, including connection pooling, pipelining, and transactional integrity. It provides specialized primitives for managing distributed clusters, including automated topology updates and request routing to sha
Stores frequently accessed data in memory to provide sub-millisecond read and write performance for high-scale applications.
OpenList is a cloud storage indexing platform that transforms remote file collections into searchable lists and standardized streaming media endpoints. It functions as a centralized gateway, allowing users to connect external storage providers and manage their data through a unified interface. The platform distinguishes itself by providing a dedicated security layer for API authentication and traffic proxying, which protects user credentials while managing connectivity for distributed components. It also features automated service lifecycle management, enabling the deployment and maintenance
Accelerates file retrieval by preheating distributed cache nodes to increase cache hit rates.
This project is an educational framework designed to teach the fundamentals of building core distributed systems and web services from scratch in Go. It provides a collection of modular implementations that demonstrate how to construct essential infrastructure components, including web servers, remote procedure call systems, distributed caches, and database abstraction layers. The framework distinguishes itself by focusing on the internal mechanics of these systems rather than providing a high-level abstraction for production use. It covers the implementation of complex architectural patterns
The framework implements networked caching systems using LRU eviction, consistent hashing, and protocol-based communication to prevent data access bottlenecks.
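As a rough illustration of the LRU eviction mentioned above, here is a minimal sketch in Python. It is not taken from the framework, which is written in Go; the class name `LRUCache` and its methods are assumptions made for this example. It relies on `collections.OrderedDict` to track recency: every access moves a key to the end, and the entry at the front is evicted when capacity is exceeded.

```python
from collections import OrderedDict
from typing import Any, Hashable


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry (illustrative sketch)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def get(self, key: Hashable, default: Any = None) -> Any:
        if key not in self._items:
            return default
        # Mark the key as most recently used.
        self._items.move_to_end(key)
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            # Drop the entry at the front, which is the least recently used.
            self._items.popitem(last=False)

    def __contains__(self, key: Hashable) -> bool:
        return key in self._items

    def __len__(self) -> int:
        return len(self._items)
```

A production cache would usually also bound memory by byte size rather than entry count and guard access with a lock; both are omitted here to keep the eviction logic visible.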
Presto is a distributed SQL query engine designed for high-performance analytical processing across heterogeneous data sources. It functions as a data federation platform and massively parallel processing engine, allowing users to execute interactive queries against diverse storage systems without requiring data migration. By mapping remote metadata and structures to a unified relational namespace, it enables seamless cross-platform analysis through a standard SQL interface. The engine distinguishes itself through a pluggable connector architecture and a shared-nothing distributed processing
Stores files and objects from remote storage in a local cache layer to speed up data retrieval.
FoundationDB is an ACID-compliant distributed transactional key-value store. It functions as a scalable database engine that ensures strict serializability and data consistency across a cluster of servers using a shared-nothing architecture. The system is distinguished by its multi-region replication capabilities, allowing data to be synchronized across different datacenters for high availability and disaster recovery. It utilizes optimistic concurrency control to manage distributed transactions and employs a majority-based coordination system to maintain cluster state. The platform provides
Provides distributed caching that leverages aggregate cluster memory while maintaining full ACID guarantees.
PeerTube is a decentralized, open-source video hosting platform that enables users to operate independent, interoperable servers. By utilizing the ActivityPub protocol, it connects these servers into a global, federated network where users can follow channels, discover content, and interact across different instances. The platform is designed to function as a self-hosted video content management system, providing a community-driven alternative to centralized media services. What distinguishes PeerTube is its hybrid approach to content delivery and infrastructure management. It integrates peer
Notifies origin servers about available file copies on external nodes to enable faster multi-source downloading and better network efficiency.
Memcached is a high-performance, distributed, in-memory key-value storage and request routing engine. It functions as a volatile data store designed to accelerate dynamic applications by caching objects in RAM, thereby reducing backend database load and providing sub-millisecond response times. The system utilizes a specialized architecture that organizes memory into fixed-size slabs to minimize fragmentation and maximize throughput for high-concurrency workloads. The project distinguishes itself through a multi-threaded, lock-friendly design that scales across CPU cores and supports complex
Provides a request routing layer that manages backend server pools and load balancing for caching infrastructure.
Groupcache is a distributed caching library designed to coordinate data retrieval and storage across a cluster of nodes. It functions as a peer-to-peer data store that uses consistent hashing to assign specific keys to canonical owners, ensuring that cached items remain predictable and accessible throughout the network. The system distinguishes itself through a request coalescing engine that merges concurrent requests for the same missing key into a single upstream fetch. This mechanism prevents redundant backend load by ensuring that only one process retrieves the required data while sharing
Provides a library for coordinating distributed data retrieval and caching across a cluster of nodes using consistent hashing.
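Consistent hashing, as referenced above, can be sketched with a sorted ring of virtual nodes. The code below is an illustrative Python sketch and not Groupcache's implementation (which is a Go library); the `HashRing` class, the `replicas` parameter, and the choice of SHA-256 as the hash function are all assumptions. Each key is owned by the first virtual node at or after the key's hash, wrapping around at the end of the ring.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


class HashRing:
    """Consistent-hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes: Iterable[str], replicas: int = 64) -> None:
        self._replicas = replicas
        self._ring: List[Tuple[int, str]] = []  # sorted (hash, node) pairs
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value: str) -> int:
        # Any well-distributed hash works; SHA-256 is an arbitrary choice here.
        return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Several virtual points per node smooth out the key distribution.
        for i in range(self._replicas):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def owner(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no nodes")
        # First virtual node whose hash is >= the key's hash.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around to the start of the ring
        return self._ring[idx][1]
```

The useful property is that removing a node only reassigns the keys that node owned; keys owned by the remaining nodes keep their owner, which keeps most of the cache warm when cluster membership changes.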
Codis is a distributed proxy system designed for scaling Redis clusters. It provides a sharding proxy that distributes data across multiple instances and a cluster manager to oversee the environment. The system enables horizontal scaling through dynamic resharding, which allows data slots to be migrated between servers without interrupting operations. It supports multi-key atomic operations using hash tags to ensure related keys are routed to the same server. The platform includes a graphical cluster management dashboard for monitoring and administration. It implements high availability prox
Provides a distributed proxy system that manages data sharding and request routing across a Redis cluster.
KeyDB is a multithreaded in-memory key-value store and distributed cache. It functions as a NoSQL database utilizing multi-version concurrency control to execute non-blocking queries and scans. The project is a fork of Redis that maintains protocol compatibility while using a multithreaded architecture to scale across multi-core hardware. It distinguishes itself with flash-tiered storage, allowing the system to offload data from primary RAM to SSD or flash storage to increase total capacity. The system supports high availability through active-active mesh replication and mu
Implements a distributed caching system with active-active mesh replication for high availability.
Twemproxy is a lightweight proxy that routes and distributes requests across multiple Redis and Memcached backend servers. It functions as a protocol translation gateway and distributed cache shard manager, partitioning data across clusters to balance load and storage capacity. The system acts as a high-availability cache orchestrator, employing health monitoring and automatic server ejection to maintain continuous access to cached data. It integrates with sentinels for dynamic master and replica discovery and utilizes consistent hashing and tag-based key grouping to manage data distribution
Partitions data across cache clusters using hashing to balance load and storage capacity.
Owncast is a self-hosted live streaming server that provides full control over broadcast infrastructure and audience data. It functions as an RTMP video streaming server, accepting incoming video feeds and distributing them to viewers through HLS-based segmented streaming. The platform includes a built-in, stateful web-based chat interface that enables real-time viewer engagement during broadcasts. The project distinguishes itself through deep integration with the decentralized Fediverse, allowing servers to automatically broadcast stream status updates and notify followers across distributed
Supports caching and serving video segments from distributed storage to improve global playback performance.
q is a command-line utility for the processing, filtering, and aggregation of tabular text and database files using standard SQL syntax. It functions as a query engine that treats CSV and TSV files, as well as standard input, as relational database tables. The tool distinguishes itself by providing a persistent cache layer that stores processed tabular data in a binary format to accelerate repeated queries on large datasets. It also maps individual filenames or stream identifiers to relational table names, enabling SQL joins across disparate text files. The project covers a broad range of da
Stores processed tabular data in a binary format to bypass repetitive parsing of large raw text files.
phpredis is a C-based native extension that bridges PHP applications with Redis servers for high-performance data storage and retrieval. It serves as an interface for manipulating strings, hashes, lists, sets, and sorted sets while providing a direct path for executing Redis commands and server-side scripts. The extension provides comprehensive support for distributed environments and high availability. It interfaces with Redis Cluster to distribute data across multiple nodes using hash slots and manages Redis Sentinel for service discovery and automatic failover. It also enables shared state
Interfaces with Redis Cluster to distribute data across multiple nodes using sharding and hash slots.
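To make the hash-slot idea concrete, the sketch below shows how a key maps to one of the 16384 Redis Cluster slots according to the Redis Cluster specification: CRC16 (XMODEM variant) of the key, modulo 16384, with hash tags in braces restricting which part of the key is hashed. This is an illustration in Python rather than phpredis code; the function name `key_hash_slot` is an assumption, and it uses `binascii.crc_hqx` from the standard library, which computes the same CRC variant when seeded with 0.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in a Redis Cluster


def key_hash_slot(key: bytes) -> int:
    """Map a key to its cluster hash slot (hypothetical helper name)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first "{" and the next "}" counts.
        if end != -1 and end > start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is CRC16/XMODEM (polynomial 0x1021).
    return binascii.crc_hqx(key, 0) % SLOT_COUNT
```

Because only the tag is hashed, keys such as `{user1000}.following` and `{user1000}.followers` land in the same slot, which is what makes multi-key operations on related keys possible in a cluster.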
3FS is a distributed file system and RDMA storage cluster designed for high-performance AI training and inference workloads. It functions as a strongly consistent storage layer that utilizes a disaggregated architecture to pool SSDs and memory resources across multiple nodes. The system provides specialized storage implementations including an AI training checkpoint store for parallel state preservation and a distributed key-value cache store for decoder layer vectors to optimize inference processing. It ensures data integrity through chain replication and apportioned query distribution. The
Provides a distributed storage solution for caching decoder layer key and value vectors to optimize inference.
CRMEB is a comprehensive e-commerce platform built on ThinkPHP 6, designed as a headless system that delivers standardized APIs to various frontend clients. It provides a unified backend to synchronize product catalogs, orders, and customer data across web browsers, mobile applications, and mini-programs. The platform supports diverse commerce models, including multi-vendor marketplaces where independent merchants manage their own stores, centralized chain store networks, and social commerce frameworks featuring affiliate distribution and community group buying. It also integrates specialized
Utilizes distributed caching across multiple nodes to reduce backend load and accelerate data retrieval.