10 Repos
Storage systems that span multiple servers to provide unified access to data.
Distinguishing note: Focuses on storage architecture rather than general database management.
Explore 10 GitHub repositories in Data & Databases · Distributed Filesystems.
This project is a community-curated directory of open-source software designed for use in private server environments and home labs. It serves as a comprehensive resource for discovering independent, self-hosted alternatives to common cloud services, letting users retain full data sovereignty and control over their digital infrastructure. The directory is structured by a hierarchical taxonomy that organizes a large collection of applications into logical categories, from media management and data analytics to private communication and team productivity tools. It is distinguished by a collaborative peer-review process in which community members validate the quality and relevance of each submission, so that the directory stays accurate and reliable. The project covers a broad range of capabilities, including infrastructure automation, container-based service deployment, and declarative configuration management. These tools help users maintain reproducible server environments and manage complex service dependencies on private hardware. The directory is maintained as a version-controlled repository, which ensures that all updates and community-driven changes are tracked and transparent.
Supports multiple access protocols and allows for the seamless addition of storage capacity to handle large volumes of small files.
This project is a community-curated directory of open-source tools and resources designed to assist system administrators with infrastructure management. It functions as a centralized knowledge base, providing a structured index of software and documentation that helps professionals discover solutions for automating, monitoring, and maintaining distributed computing environments. The repository distinguishes itself through a collaborative, community-driven structure that organizes a vast array of technical resources into a hierarchical taxonomy. By utilizing hyperlink-centric navigation, it d
Deploys storage solutions across multiple servers to ensure high availability.
Coordinates storage across multiple servers to provide a unified file access layer.
This project serves as a comprehensive technical reference for the architecture and design of data-intensive applications. It provides a structured analysis of the fundamental principles required to build reliable, scalable, and maintainable software systems, covering the core trade-offs inherent in modern data infrastructure. The repository explores the mechanics of distributed data management, including strategies for replication, partitioning, and achieving consensus across multiple nodes. It details the design of storage engines, indexing techniques, and transaction management models, whi
Distributes large files across nodes to enable high-throughput access and fault tolerance.
go-ipfs is an implementation of an IPFS node, providing a distributed filesystem and a content-addressable storage system. It enables the storage and retrieval of data based on unique cryptographic hashes rather than fixed network locations, allowing files to be shared across a peer-to-peer network without a central authority. The system utilizes a distributed hash table and a peer-to-peer gossip protocol to route requests and propagate network state and metadata. It organizes data using a Merkle DAG structure to support efficient deduplication and versioning of content. Capabilities include
Provides a decentralized storage system that spans multiple servers for unified data access.
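The content-addressing idea in the go-ipfs entry above can be illustrated with a small sketch. This is not go-ipfs code and does not use the IPFS block or identifier formats; `BlockStore`, `add_file`, `read_file`, and the fixed `chunk_size` are hypothetical names chosen for illustration. It only shows how keying blocks by a cryptographic hash gives deduplication for free, and how a root node listing child hashes forms a very simple Merkle-style tree.

```python
import hashlib


class BlockStore:
    """Toy in-memory store: each block is keyed by the SHA-256 of its bytes."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical content always hashes to the same key, so it is stored once.
        self.blocks[key] = data
        return key

    def get(self, key: str) -> bytes:
        data = self.blocks[key]
        # The address doubles as an integrity check on retrieval.
        if hashlib.sha256(data).hexdigest() != key:
            raise ValueError("block does not match its address")
        return data


def add_file(store: BlockStore, content: bytes, chunk_size: int = 4) -> str:
    """Store content as chunks plus a root node that lists the chunk keys."""
    child_keys = [
        store.put(content[i:i + chunk_size])
        for i in range(0, len(content), chunk_size)
    ]
    # The root node is itself a block, so the whole file has one address.
    return store.put("\n".join(child_keys).encode())


def read_file(store: BlockStore, root_key: str) -> bytes:
    """Follow the root node's links and reassemble the original bytes."""
    child_keys = store.get(root_key).decode().splitlines()
    return b"".join(store.get(k) for k in child_keys)
```

Adding `b"aaaabbbbaaaa"` with the default chunk size stores only three blocks (two distinct chunks and the root), and adding the same bytes again returns the same root key.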
LibSQL is a high-performance, distributed SQL database engine that extends SQLite to support remote network access, edge computing, and real-time synchronization. It functions as an embedded database library that integrates directly into application processes while providing the infrastructure to maintain consistency across multiple geographic regions. The platform distinguishes itself by enabling database interaction over standard HTTP protocols, allowing applications to query remote data sources in serverless and edge environments without requiring local filesystem access. It includes nativ
Replicates local filesystem state to remote databases for backups and multi-machine collaboration.
Btrfs for Windows is a kernel-mode driver and filesystem manager that enables read and write access to Btrfs-formatted drives on Windows operating systems. It implements the Linux Btrfs on-disk format, providing a bridge for native filesystem interaction, including a dedicated integration for the Windows Subsystem for Linux. The project distinguishes itself through an identity mapping layer that translates Linux user and group IDs into Windows security identifiers to maintain file ownership and permissions. It further integrates with the Windows environment via a shell extension for managing s
Transfers subvolume data to clones or remote targets for backup and filesystem state migration.
TubeArchivist is a self-hosted YouTube video archiving system and metadata indexer. It functions as a personal media library and download manager that allows users to create a searchable offline collection of videos, channels, and playlists. The system distinguishes itself by indexing subtitles, comments, and channel information for full-text search and retrieval. It features automated media synchronization to track subscriptions and playlists, ensuring new content is automatically queued and downloaded as it is published. The project provides a broad set of capabilities for digital asset ma
Reconciles physical media files and thumbnails with database records to ensure index integrity and recover missing assets.
mergerfs is a FUSE-based union filesystem that pools multiple independent filesystems or directories into a single unified mount point. It acts as a proxy to underlying storage, forwarding file operations directly to the filesystem for near-native performance while merging directory listings and attribute changes. The project provides a live, read-write pooled view of storage that aggregates drives of any size without requiring reformatting or data redistribution, and it isolates individual drive failures so that the pool continues serving data from remaining filesystems. The filesystem offer
Implements synchronous POSIX call behavior that halts all threads when an underlying branch filesystem blocks.
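The pooling behavior described in the mergerfs entry above (merged directory listings, operations forwarded to an underlying filesystem) can be sketched in a few lines. This is not mergerfs code, which is a FUSE filesystem with its own configurable policies; `pooled_listing`, `resolve`, and the first-match lookup order are assumptions made purely for illustration.

```python
import os


def pooled_listing(branches, relpath="."):
    """Return the union of directory entries found under relpath in each branch."""
    names = set()
    for branch in branches:
        path = os.path.join(branch, relpath)
        # A branch that lacks the directory simply contributes nothing.
        if os.path.isdir(path):
            names.update(os.listdir(path))
    return sorted(names)


def resolve(branches, relpath):
    """Return the real path of relpath in the first branch that contains it.

    Searching branches in order is an assumption of this sketch; a real
    union filesystem may choose the target branch by other criteria.
    """
    for branch in branches:
        candidate = os.path.join(branch, relpath)
        if os.path.exists(candidate):
            return candidate
    return None
```

A caller would pass the list of branch directories, for example `pooled_listing(["/mnt/disk1", "/mnt/disk2"])` (hypothetical paths), and then open whatever path `resolve` returns, so reads and writes go directly to the underlying filesystem.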
Pigsty is a full-stack orchestration suite for deploying, monitoring, and managing high-availability PostgreSQL clusters and their supporting infrastructure. It functions as a cluster management platform and high-availability suite that automates failover, manages virtual IPs, and ensures data consistency through distributed consensus. The project distinguishes itself by providing a comprehensive database infrastructure-as-code framework and a dedicated observability stack. It incorporates a backup and recovery manager supporting point-in-time recovery via S3-compatible object storage, alongs
Sets up a shared POSIX filesystem that utilizes a PostgreSQL instance as the metadata engine.