6 Repos
Techniques for reclaiming disk space through defragmentation and reducing write amplification.
Distinct from Disk Management Utilities: focuses on internal database storage efficiency rather than external OS-level disk tools.
6 GitHub repositories matching data & databases · Storage Space Optimization.
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
Permanently removes soft-deleted rows and optimizes data files to reclaim storage space.
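The general pattern behind this kind of feature (delete rows first, then run a separate compaction pass that rewrites the data file) can be sketched with SQLite from Python's standard library. This is an illustration of the technique only, not LanceDB's API; the function name `delete_then_vacuum`, the table layout, and the row counts are assumptions made up for the example.

```python
import os
import sqlite3
import tempfile


def delete_then_vacuum(db_path: str) -> tuple[int, int]:
    """Return (file size after deleting rows, file size after VACUUM)."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT)")
        conn.executemany(
            "INSERT INTO items (payload) VALUES (?)",
            (("x" * 200,) for _ in range(5000)),
        )
        conn.commit()

        # Deleting rows marks their pages as free inside the file,
        # but the file itself normally keeps its size.
        conn.execute("DELETE FROM items WHERE id % 10 != 0")
        conn.commit()
        size_before = os.path.getsize(db_path)

        # VACUUM rewrites the database without the free pages.
        # It must run outside a transaction, hence the commit above.
        conn.execute("VACUUM")
        size_after = os.path.getsize(db_path)
        return size_before, size_after
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before, after = delete_then_vacuum(os.path.join(tmp, "demo.db"))
        print(f"before VACUUM: {before} bytes, after VACUUM: {after} bytes")
```

Keeping the delete and the compaction as separate steps makes deletes cheap and lets the expensive rewrite run on a schedule.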
Sled is an embedded key-value store and ACID-compliant database designed for high-performance data persistence. It functions as a log-structured storage engine that organizes data using B+ trees to support efficient range queries and prefix scans. The engine implements a zero-copy data store model, utilizing epoch-based reclamation to provide direct references to cached values without memory allocations. It distinguishes itself through a combination of write-ahead logging, page cache optimizations to reduce write amplification on flash storage, and serializable transactions for atomic multi-k
Implements disk defragmentation and page cache optimizations to reduce write amplification on flash storage.
This project is a pure Go implementation of the Git version control system, providing a library for integrating versioning and history analysis into applications. It functions as a complete repository manager and object store that does not require external binary dependencies. The implementation utilizes interface-based storage, allowing repositories to be managed on disk or entirely in memory. It supports a transactional storage model to ensure atomic operations and implements a content-addressable storage system using delta-compression packfiles. The library covers a broad range of version
Reclaims disk space by compressing loose objects into packfiles and removing unreferenced data.
fio is a storage performance benchmarking tool and synthetic I/O workload generator. It functions as a storage device profiler and I/O trace replay engine, enabling the measurement of throughput and latency for storage devices and file systems. The project is distinguished by its ability to act as a distributed storage stress tester, managing multiple remote server backends via a single controller to evaluate network storage. It also includes specialized capabilities for storage deduplication analysis by generating redundant data buffers to test the efficiency of deduplication subsystems. Th
Reserves disk space for files to ensure contiguous blocks and prevent out-of-space errors.
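Reserving space up front can be sketched in Python with `os.posix_fallocate`, which is available on Unix-like systems. This is a minimal illustration of preallocation in general, not fio's own implementation; the helper name `preallocate` and the 1 MiB size are assumptions for the example.

```python
import os
import tempfile


def preallocate(path: str, size_bytes: int) -> int:
    """Reserve size_bytes for the file at path and return its resulting size."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Either the whole range is reserved, or OSError is raised
        # (for example ENOSPC when the disk is too full).
        os.posix_fallocate(fd, 0, size_bytes)
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        size = preallocate(os.path.join(tmp, "data.bin"), 1 << 20)
        print(f"reserved {size} bytes")
```

Because the reservation fails immediately if space is short, later writes into the reserved range should not hit out-of-space errors partway through a run. Whether the reserved blocks are actually contiguous depends on the file system.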
Cuberite is a high-performance multiplayer game server for Java Edition clients, built to host shared virtual spaces in a resource-efficient environment (low RAM and CPU usage). The server is designed for cross-platform use on different operating systems and hardware types. It allows game mechanics and server logic to be extended through a Lua scripting interface, so functionality can be changed without recompiling the core engine. The project includes tools for server administration via a remote console and for world data management, covering statistics analysis and optimization of save files. Additional features include visualization of biome generation and analysis of memory dumps for resource monitoring.
Reclaims disk space and increases loading speeds through save file defragmentation and removal of unused data.
Nominatim is a self-hosted geospatial search engine and geocoding server that utilizes OpenStreetMap data. It provides a complete infrastructure for forward geocoding, converting addresses or place names into geographic coordinates, and reverse geocoding, translating coordinates into human-readable physical addresses. The project features a dedicated data importer that parses raw map data into a PostgreSQL geospatial database. It distinguishes itself through a configurable import pipeline that uses style files to filter map features and an importance-based ranking system to prioritize search
Reduces disk space and import time through the use of simplified coordinate files and index disabling.