Why is ray-project/ray a recommended distributed object store repository?

It provides a shared memory system that enables efficient data sharing and asynchronous communication between workers across a cluster.

Why is seaweedfs/seaweedfs a recommended distributed object store repository?

It manages billions of files by decoupling metadata management from raw data storage nodes.

Why is happyfish100/fastdfs a recommended distributed object store repository?

It employs a distributed object store architecture that uses unique identifiers for high-speed retrieval of unstructured data.

Why is cubefs/cubefs a recommended distributed object store repository?

It functions as a distributed object store for unstructured content across datacenters and hybrid clouds.

Why is kvcache-ai/Mooncake a recommended distributed object store repository?

It implements shared memory or storage systems for high-performance distribution of short-lived data objects such as checkpoints.

Why is ag2ai/faststream a recommended distributed object store repository?

It stores large binary objects in a distributed store and notifies consumers of changes for event-driven updates.

Awesome GitHub Repositories: Distributed Object Stores

Shared memory or storage systems designed for high-performance data access across cluster nodes.

Distinguishing note: this category specifically addresses shared memory for worker communication.

Explore 6 GitHub repositories matching data & databases · Distributed Object Stores.

ray-project/ray
42,895 stars
Ray is a distributed computing framework designed to scale Python and Java applications across clusters by abstracting task scheduling and resource management. It functions as a resource-aware execution engine that manages task dependencies, placement, and fault tolerance across networked compute nodes. At its core, the system provides a stateful actor model, allowing developers to define classes that run in dedicated processes to maintain and mutate internal state across remote method calls. The framework distinguishes itself through a robust cross-language interoperability layer, enabling f
A shared memory system that enables efficient data sharing and asynchronous communication between workers across a cluster.
Python · data-science · deep-learning · deployment
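
The stateful actor model described above can be illustrated with a minimal sketch. It uses only the Python standard library and a thread in place of a dedicated process, so it is not Ray's API; `CounterActor` and its methods are hypothetical names chosen for illustration.

```python
import queue
import threading
from concurrent.futures import Future


class CounterActor:
    """Toy actor: state lives in one worker thread and changes only via messages."""

    def __init__(self):
        self._count = 0                      # private state owned by the worker
        self._mailbox = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # Messages are handled one at a time, so state updates never race.
        while True:
            message = self._mailbox.get()
            if message is None:              # shutdown sentinel
                break
            amount, result = message
            self._count += amount
            result.set_result(self._count)

    def increment(self, amount=1):
        """Enqueue an update and return a Future immediately (asynchronous call)."""
        result = Future()
        self._mailbox.put((amount, result))
        return result

    def stop(self):
        self._mailbox.put(None)
        self._worker.join()


actor = CounterActor()
futures = [actor.increment(2) for _ in range(5)]   # callers do not block here
totals = [f.result() for f in futures]             # block only when results are needed
actor.stop()
print(totals)
```

Callers receive a future right away and wait only when they read the result, which is the same general shape as a remote method call on an actor.
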
seaweedfs/seaweedfs
32,937 stars
SeaweedFS is a distributed object store and high-performance file system designed to manage massive volumes of unstructured data. It utilizes a decoupled architecture that separates metadata management from raw data storage, allowing for independent scalability and the efficient handling of billions of files. By providing a POSIX-compliant interface, it enables applications to interact with a unified namespace while maintaining the performance characteristics of a distributed object store. The system distinguishes itself through a multi-region data fabric that supports active-active replicati
Manages billions of files by decoupling metadata management from raw data storage nodes.
Go · blob-storage · cloud-drive · distributed-file-system
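
The idea of separating metadata management from raw data storage can be sketched in a few lines. This is a toy in-memory model under stated assumptions, not SeaweedFS code: `MetadataService`, `StorageNode`, `put_file`, and `get_file` are hypothetical names, and placement is a naive round-robin.

```python
class StorageNode:
    """Holds raw bytes keyed by blob ID; knows nothing about file names."""

    def __init__(self):
        self.blobs = {}


class MetadataService:
    """Maps file paths to (node index, blob ID); never touches file contents."""

    def __init__(self, node_count):
        self.node_count = node_count
        self.index = {}
        self.next_blob_id = 0

    def assign(self, path):
        # Naive round-robin placement (an assumption made for this sketch).
        location = (self.next_blob_id % self.node_count, self.next_blob_id)
        self.next_blob_id += 1
        self.index[path] = location
        return location

    def lookup(self, path):
        return self.index[path]


def put_file(meta, nodes, path, data):
    node_idx, blob_id = meta.assign(path)     # small metadata request
    nodes[node_idx].blobs[blob_id] = data     # bulk bytes go straight to the data node


def get_file(meta, nodes, path):
    node_idx, blob_id = meta.lookup(path)
    return nodes[node_idx].blobs[blob_id]


nodes = [StorageNode(), StorageNode()]
meta = MetadataService(node_count=len(nodes))
put_file(meta, nodes, "/photos/a.jpg", b"aaa")
put_file(meta, nodes, "/photos/b.jpg", b"bbb")
print(get_file(meta, nodes, "/photos/b.jpg"))
```

Because the metadata service stores only small location records, it can be scaled separately from the nodes that hold the bytes.
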
happyfish100/fastdfs
9,231 stars
FastDFS is a distributed file system and object store designed as a high-capacity file server. It functions as a cluster storage manager that saves, syncs, and accesses large volumes of unstructured data across a network of distributed servers. The system uses unique identifiers for file retrieval and indexing instead of traditional hierarchical naming to avoid metadata bottlenecks. It manages file attributes through key-value metadata mapping and employs a distributed replication model to ensure high availability and data redundancy across storage groups. The project provides capabilities f
Employs a distributed object store architecture using unique identifiers for high-speed retrieval of unstructured data.
C · distributed-file-storage · distributed-file-system · storage-servers
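
Retrieval by unique identifier rather than by hierarchical name can be sketched as follows. This is an illustrative toy, not the FastDFS protocol or its real ID format: `FlatFileStore` is a hypothetical class, and deriving the ID from a SHA-256 content hash is an assumption made to keep the example short.

```python
import hashlib


class FlatFileStore:
    """Toy store that addresses files by generated ID, not by directory path."""

    def __init__(self, group_names):
        self._groups = {name: {} for name in group_names}  # group -> {file_id: bytes}
        self._metadata = {}                                 # file_id -> key-value attributes

    def upload(self, data, metadata=None):
        digest = hashlib.sha256(data).hexdigest()
        names = sorted(self._groups)
        group = names[int(digest, 16) % len(names)]         # choose a group from the hash
        file_id = f"{group}/{digest[:16]}"                  # the ID encodes where the bytes live
        self._groups[group][file_id] = data
        self._metadata[file_id] = dict(metadata or {})
        return file_id

    def download(self, file_id):
        group = file_id.split("/", 1)[0]                    # one lookup, no directory walk
        return self._groups[group][file_id]

    def get_metadata(self, file_id):
        return self._metadata[file_id]


store = FlatFileStore(["group1", "group2"])
file_id = store.upload(b"hello", {"owner": "alice"})
print(file_id, store.download(file_id), store.get_metadata(file_id))
```

Since the identifier itself says which group holds the file, a read needs no central directory lookup. In this simplified version, identical content maps to the same ID.
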
cubefs/cubefs
5,593 stars
CubeFS is a distributed cloud storage system designed to manage file and object storage across datacenters and hybrid clouds. It functions as a multi-tenant distributed file system and object store that can handle exabyte-scale data, using a distributed architecture to store unstructured content. The system distinguishes itself through a multi-protocol interface layer that allows concurrent data access via S3, POSIX, and HDFS interfaces. It uses a decoupled compute-storage architecture to scale processing and persistence independently, and implements fine-grained isolation policies to separate resources and data between tenants. Reliability is managed through configurable redundancy strategies, including multi-replica mirroring and erasure coding. The platform includes a multi-tier caching system to accelerate data access and integrates with Kubernetes via Container Storage Interface drivers to automate the provisioning of persistent volumes.
Functions as a distributed object store for unstructured content across datacenters and hybrid clouds.
Go · ai-native-storage · cloud-native-storage · cloud-storage
kvcache-ai/Mooncake
5,594 stars
Mooncake is a disaggregated platform for serving large language models and a distributed key-value store designed for high-performance inference infrastructure. It functions as a GPU memory orchestrator and KV cache management system that pools and transfers key-value caches across clusters to accelerate inference. The system distinguishes itself by separating the prefill and decode phases of inference into separate hardware clusters to optimize resource utilization. It uses a high-performance distributed RDMA cache with zero-copy transfers to move data between compute nodes while bypassing the CPU, reducing latency and overhead. The platform covers broad functional areas, including distributed memory pooling, accelerator memory routing via CXL, and multi-tier storage offloading to SSDs. It manages cluster state through metadata coordination services and implements resource governance using lease-based object protection mechanisms and watermark-based cache eviction. The software is packaged for containerized deployments, with support for host networking and hardware device mapping.
Implements shared memory or storage systems for high-performance distribution of short-lived data objects like checkpoints.
C++
ag2ai/faststream
4,967 stars
FastStream is an asyncio message broker framework for building event-driven applications in Python. It provides a unified interface and a multi-broker messaging abstraction layer that translates generic producer and consumer calls into broker-specific APIs. The framework features a built-in dependency injection container and uses decorators to route messages to asynchronous handler functions. It includes a documentation generator that extracts channel definitions and message formats from code to produce standardized AsyncAPI specifications. The project supports integration with Kafka, Rabbit
Stores large binary objects in a distributed store and notifies consumers of changes for event-driven updates.
Python · asyncapi · asyncio · distributed-systems
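
Decorator-based routing of messages to asynchronous handlers, as described above, can be sketched with the standard library alone. This is an in-process toy, not FastStream's API: `ToyBroker`, the topic name, and the message shape are all hypothetical, and no real broker is involved.

```python
import asyncio


class ToyBroker:
    """In-process stand-in for a message broker; routes by topic name."""

    def __init__(self):
        self._handlers = {}                  # topic -> list of async handlers

    def subscriber(self, topic):
        # Decorator factory: registers the handler and returns it unchanged.
        def register(handler):
            self._handlers.setdefault(topic, []).append(handler)
            return handler
        return register

    async def publish(self, topic, message):
        handlers = self._handlers.get(topic, [])
        # Run all handlers for the topic concurrently and collect their results.
        return await asyncio.gather(*(h(message) for h in handlers))


broker = ToyBroker()
received = []


@broker.subscriber("object-updated")
async def on_object_updated(message):
    received.append(message["key"])
    return f"processed {message['key']}"


results = asyncio.run(broker.publish("object-updated", {"key": "model.bin"}))
print(results)
```

A consumer that wants to react to object changes only declares a handler for the topic; the routing table built by the decorator does the rest.
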
