Why is prestodb/presto recommended for Distributed Data Caching Layers?

Stores files and objects from remote storage in a local cache layer to speed up data retrieval.

Why is harelba/q recommended for Distributed Data Caching Layers?

Stores processed tabular data in a binary format to bypass repetitive parsing of large raw text files.

Why is alluxio/alluxio recommended for Distributed Data Caching Layers?

Functions as a virtual distributed file system that abstracts and caches data across diverse storage backends.

Why is balloonwj/cppguide recommended for Distributed Data Caching Layers?

Guides building a thread-safe, sharded distributed cache with configurable eviction policies.

Why is orcaman/concurrent-map recommended for Distributed Data Caching Layers?

Provides a thread-safe in-memory cache supporting parallel reads and writes without race conditions.

Why is xacrimon/dashmap recommended for Distributed Data Caching Layers?

Provides the underlying primitives necessary to build thread-safe sharded caches.

Why is facebookincubator/cinder recommended for Distributed Data Caching Layers?

Provides a high-performance data layer via a scalable network of memory nodes.

Why is grantjenks/python-diskcache recommended for Distributed Data Caching Layers?

Implements sharded data partitioning to divide the cache into multiple storage buckets, reducing write contention.

Awesome GitHub Repositories: Distributed Data Caching Layers

Systems that store remote files and objects in a local cache to reduce latency for analytical queries.

Distinct from general distributed caching: this category focuses on caching data from remote storage to improve analytical query performance.

Explore 8 GitHub repositories in Data & Databases · Distributed Data Caching Layers.

prestodb/presto
16,711 stars
Presto is a distributed SQL query engine designed for high-performance analytical processing across heterogeneous data sources. It functions as a data federation platform and massively parallel processing engine, allowing users to execute interactive queries against diverse storage systems without requiring data migration. By mapping remote metadata and structures to a unified relational namespace, it enables seamless cross-platform analysis through a standard SQL interface. The engine distinguishes itself through a pluggable connector architecture and a shared-nothing distributed processing
Stores files and objects from remote storage in a local cache layer to speed up data retrieval.
Java · big-data · data · hadoop
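
The local cache layer described for prestodb/presto follows a read-through pattern: the first read of a remote object copies it to local disk, and later reads are served locally. The following is a minimal sketch of that pattern, not Presto's implementation; `LocalFileCache`, the `fetch` callback, and the `s3://` path are hypothetical names chosen for illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


class LocalFileCache:
    """Read-through cache: remote objects are copied to local disk on first read."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]):
        self.cache_dir = cache_dir
        self.fetch = fetch  # hypothetical remote reader, e.g. an object-store GET
        self.hits = 0
        self.misses = 0
        cache_dir.mkdir(parents=True, exist_ok=True)

    def _local_path(self, remote_path: str) -> Path:
        # Hash the remote path so any URI maps to a flat, filesystem-safe name.
        digest = hashlib.sha256(remote_path.encode("utf-8")).hexdigest()
        return self.cache_dir / digest

    def read(self, remote_path: str) -> bytes:
        local = self._local_path(remote_path)
        if local.exists():
            self.hits += 1
            return local.read_bytes()
        # Cache miss: fetch from remote storage, then keep a local copy.
        self.misses += 1
        data = self.fetch(remote_path)
        local.write_bytes(data)
        return data


def demo() -> tuple:
    remote_calls = []

    def fake_fetch(path: str) -> bytes:
        # Stand-in for a slow remote read; records how often it is called.
        remote_calls.append(path)
        return b"rows for " + path.encode("utf-8")

    with tempfile.TemporaryDirectory() as tmp:
        cache = LocalFileCache(Path(tmp) / "cache", fake_fetch)
        first = cache.read("s3://bucket/part-0.parquet")
        second = cache.read("s3://bucket/part-0.parquet")
    return first == second, len(remote_calls), cache.hits, cache.misses
```

In `demo()`, the second read is served from local disk, so the remote fetch runs only once. A production cache would also need invalidation and a size limit, which this sketch omits.
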
harelba/q
10,353 stars
q is a command-line utility for the processing, filtering, and aggregation of tabular text and database files using standard SQL syntax. It functions as a query engine that treats CSV and TSV files, as well as standard input, as relational database tables. The tool distinguishes itself by providing a persistent cache layer that stores processed tabular data in a binary format to accelerate repeated queries on large datasets. It also maps individual filenames or stream identifiers to relational table names, enabling SQL joins across disparate text files. The project covers a broad range of da
Stores processed tabular data in a binary format to bypass repetitive parsing of large raw text files.
Python · cli · command-line · command-line-tool
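
The idea of caching parsed tabular data in a binary format can be sketched with the standard library alone. The example below parses a CSV file once, stores the rows in a SQLite sidecar file, and reuses that file while it is at least as new as the CSV. It is an illustration under assumptions, not q's actual cache format; `load_table`, the table name `t`, and the `.cache.sqlite` suffix are hypothetical.

```python
import csv
import os
import sqlite3
import tempfile


def load_table(csv_path: str):
    """Return (connection, from_cache) with the CSV rows available as table `t`."""
    cache_path = csv_path + ".cache.sqlite"  # hypothetical sidecar naming scheme
    fresh = (
        os.path.exists(cache_path)
        and os.path.getmtime(cache_path) >= os.path.getmtime(csv_path)
    )
    if fresh:
        # The binary cache is current: skip text parsing entirely.
        return sqlite3.connect(cache_path), True

    if os.path.exists(cache_path):
        os.remove(cache_path)  # stale cache, rebuild from the text file
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)

    conn = sqlite3.connect(cache_path)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f"CREATE TABLE t ({columns})")
    conn.executemany(f"INSERT INTO t VALUES ({placeholders})", rows)
    conn.commit()
    return conn, False


def demo() -> list:
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sales.csv")
        with open(path, "w", newline="") as f:
            f.write("region,amount\nnorth,10\nsouth,20\nnorth,30\n")
        for _ in range(2):
            conn, from_cache = load_table(path)
            total = conn.execute(
                "SELECT SUM(CAST(amount AS INTEGER)) FROM t WHERE region = ?",
                ("north",),
            ).fetchone()[0]
            conn.close()
            results.append((from_cache, total))
    return results
```

The first call in `demo()` parses the text file and builds the cache; the second call answers the same query from the cached file without re-parsing.
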
Alluxio/alluxio
7,202 stars
Alluxio is a virtual distributed file system and data orchestration layer that serves as a high-performance caching layer between cloud storage and compute clusters. It acts as a distributed data cache designed to accelerate data access for large-scale analytics and machine learning workloads. The system provides a unified interface that presents multiple heterogeneous storage backends as a single coherent namespace. This allows for the unification of diverse storage systems, enabling computation engines to access data from different providers without changing application code. The project c
Functions as a virtual distributed file system that abstracts and caches data across diverse storage backends.
Java
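
Presenting several storage backends as one namespace is commonly done with a mount table that maps path prefixes to backend locations. The sketch below shows only that path-resolution step using longest-prefix matching; it does not use Alluxio's API, and `UnifiedNamespace`, the mount points, and the backend URIs are hypothetical.

```python
from typing import Dict


class UnifiedNamespace:
    """Maps paths in one logical namespace onto several storage backends."""

    def __init__(self) -> None:
        self._mounts: Dict[str, str] = {}

    def mount(self, mount_point: str, backend_uri: str) -> None:
        self._mounts[mount_point.rstrip("/")] = backend_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        # Pick the longest mount point that is a prefix of the path, so that
        # nested mounts override their parents.
        best = None
        for mount_point in self._mounts:
            if path == mount_point or path.startswith(mount_point + "/"):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError(f"no mount covers {path}")
        return self._mounts[best] + path[len(best):]


def demo() -> list:
    ns = UnifiedNamespace()
    ns.mount("/lake", "s3://analytics-bucket/warehouse")
    ns.mount("/lake/archive", "hdfs://namenode:8020/archive")
    return [
        ns.resolve("/lake/sales/2024.parquet"),
        ns.resolve("/lake/archive/2019.parquet"),
    ]
```

Application code refers only to `/lake/...` paths; which backend serves a path is decided by the mount table, so moving data between providers does not require changing the callers.
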
balloonwj/CppGuide
6,030 stars
CppGuide is a curated collection of educational resources and practical guides focused on C++ server development, Linux kernel internals, concurrent programming, network protocols, and security exploitation. It provides structured learning paths for backend developers, covering everything from interview preparation to building high-performance network servers and understanding operating system fundamentals. The guide distinguishes itself by offering in-depth, hands-on tutorials that walk through real-world implementations, including building a Redis-like server from scratch, designing custom
Guides building a thread-safe, sharded distributed cache with configurable eviction policies.
orcaman/concurrent-map
4,528 stars
Concurrent-map is a lock-striped hash map and sharded concurrent cache for Go, designed as a high-performance key-value store that enables thread-safe parallel reads and writes with minimal blocking. It replaces a single global mutex with per-shard locking, using hash-based key distribution to assign entries to independent segments, allowing multiple goroutines to operate simultaneously without race conditions. The library achieves its performance through fine-grained locking and a lock-free read path, where each shard operates independently with its own lock, enabling parallel reads and writ
Provides a thread-safe in-memory cache supporting parallel reads and writes without race conditions.
Go · concurrency · concurrent-programming · go
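
Lock striping, as described for concurrent-map, replaces one global lock with one lock per shard and routes each key to a shard by hash. The library itself is written in Go; the Python sketch below only illustrates the technique. `ShardedMap` and its methods are hypothetical names, and reads here take the shard lock for simplicity.

```python
import threading
import zlib
from typing import Any, Callable


class ShardedMap:
    """Hash map split into shards, each guarded by its own lock."""

    def __init__(self, shard_count: int = 32) -> None:
        self._shards = [{} for _ in range(shard_count)]
        self._locks = [threading.Lock() for _ in range(shard_count)]

    def _index(self, key: str) -> int:
        # Stable hash so a key always maps to the same shard.
        return zlib.crc32(key.encode("utf-8")) % len(self._shards)

    def set(self, key: str, value: Any) -> None:
        i = self._index(key)
        with self._locks[i]:
            self._shards[i][key] = value

    def get(self, key: str, default: Any = None) -> Any:
        i = self._index(key)
        with self._locks[i]:
            return self._shards[i].get(key, default)

    def upsert(self, key: str, update: Callable[[Any], Any]) -> Any:
        # Read-modify-write under the shard lock, so concurrent updates to
        # the same key cannot interleave.
        i = self._index(key)
        with self._locks[i]:
            shard = self._shards[i]
            shard[key] = update(shard.get(key))
            return shard[key]


def demo() -> list:
    counts = ShardedMap(shard_count=8)

    def worker() -> None:
        for i in range(1000):
            counts.upsert(f"k{i % 4}", lambda old: (old or 0) + 1)

    threads = [threading.Thread(target=worker) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [counts.get(f"k{i}") for i in range(4)]
```

Threads that touch keys in different shards never wait on each other; only operations on the same shard are serialized.
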
xacrimon/dashmap
4,064 stars
DashMap is a concurrent hash map for Rust that provides a thread-safe associative array for high-performance multi-threaded access. It serves as a parallel data structure that allows simultaneous reads and writes without requiring a global lock. The project uses a sharded-lock architecture to reduce thread contention, with fine-grained locks at the shard level. It is a Serde-compatible map that implements serialization and deserialization to convert map data to and from common formats. The library covers parallel data storage, shared-state management, and the implementation of thread-safe caches.
Provides the underlying primitives necessary to build thread-safe sharded caches.
Rust · concurrent · concurrent-data-structure · concurrent-map
facebookincubator/cinder
3,764 stars
Cinder is a high-performance Python runtime implementation based on CPython. It is designed as an execution environment optimized for large-scale distributed systems and cloud environments. The project integrates a distributed memory cache and an asynchronous memory layer to manage data across multiple network nodes. It also provides a native C extension framework for developing high-performance compiled modules that link directly into the interpreter memory space. The system covers capabilities for asynchronous data retrieval, large-scale execution, and the integration of embedded scripting
Provides a high-performance data layer via a scalable network of memory nodes.
Python · compiler · interpreter · jit
grantjenks/python-diskcache
2,828 stars
This project is a disk-backed key-value store and persistent data structure library for Python. It provides a mechanism for persisting mappings, sets, and queues to the local filesystem to bypass memory limitations and cache expensive function results across threads and processes. The system serves as a cross-process synchronization tool, offering distributed locks, semaphores, and barriers to coordinate shared resource access. It implements advanced caching strategies such as probabilistic stampede prevention, sharded data partitioning to increase throughput, and least-recently-used eviction
Implements sharded data partitioning to divide the cache into multiple storage buckets, reducing write contention.
Python · cache · filesystem · key-value-store
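
Sharded partitioning for an on-disk cache can be sketched by hashing each key to one of several independent storage files, so writers that land on different shards do not contend for the same file lock. The example below uses only the standard library and is not the diskcache API; `ShardedDiskCache`, the shard file names, and the table layout are assumptions made for illustration.

```python
import os
import sqlite3
import tempfile
import zlib
from contextlib import closing

SCHEMA = "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value BLOB)"


class ShardedDiskCache:
    """Key-value cache spread across several SQLite files (one per shard)."""

    def __init__(self, directory: str, shards: int = 4) -> None:
        os.makedirs(directory, exist_ok=True)
        self._paths = [
            os.path.join(directory, f"shard-{i:03d}.sqlite") for i in range(shards)
        ]
        for path in self._paths:
            with closing(sqlite3.connect(path)) as conn:
                conn.execute(SCHEMA)
                conn.commit()

    def _path_for(self, key: str) -> str:
        # Stable hash so a key always lands in the same shard file.
        return self._paths[zlib.crc32(key.encode("utf-8")) % len(self._paths)]

    def set(self, key: str, value: bytes) -> None:
        with closing(sqlite3.connect(self._path_for(key))) as conn:
            conn.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (key, value))
            conn.commit()

    def get(self, key: str, default=None):
        with closing(sqlite3.connect(self._path_for(key))) as conn:
            row = conn.execute(
                "SELECT value FROM cache WHERE key = ?", (key,)
            ).fetchone()
        return default if row is None else row[0]


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        cache = ShardedDiskCache(tmp, shards=4)
        for i in range(20):
            cache.set(f"key-{i}", f"value-{i}".encode("utf-8"))
        round_trip = all(
            cache.get(f"key-{i}") == f"value-{i}".encode("utf-8") for i in range(20)
        )
        shard_files = len([n for n in os.listdir(tmp) if n.endswith(".sqlite")])
        missing = cache.get("absent", default=b"")
    return round_trip, shard_files, missing
```

Opening a connection per operation keeps the sketch short; a real cache would more likely keep connections open and add eviction and expiry.
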