59 repositories
Persistence mechanisms that record data mutations to a log for durability and recovery.
Distinguishing note: Specific to log-based persistence for state recovery in distributed systems.
Explore 59 GitHub repositories matching data & databases · Write-Ahead Logging.
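To make the shared pattern concrete, here is a minimal sketch of write-ahead logging in Python. It is not taken from any repository listed here. The class name `TinyWAL`, the JSON-lines record format, and the `demo` helper are all assumptions made for illustration. Each mutation is appended to a log and flushed to disk before the in-memory state changes, and the log is replayed on startup.

```python
import json
import os
import tempfile


class TinyWAL:
    """Hypothetical minimal write-ahead log protecting an in-memory dict."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self._replay()  # rebuild state from any existing log
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, "r", encoding="utf-8") as f:
            for line in f:
                try:
                    rec = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final record from a crash: stop replaying
                self._apply(rec)

    def _apply(self, rec):
        if rec["op"] == "set":
            self.state[rec["key"]] = rec["value"]
        elif rec["op"] == "del":
            self.state.pop(rec["key"], None)

    def _append(self, rec):
        self._log.write(json.dumps(rec) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())  # record reaches disk first
        self._apply(rec)              # only then mutate in-memory state

    def set(self, key, value):
        self._append({"op": "set", "key": key, "value": value})

    def delete(self, key):
        self._append({"op": "del", "key": key})

    def close(self):
        self._log.close()


def demo():
    """Write a few records, 'restart', and return the recovered state."""
    path = os.path.join(tempfile.mkdtemp(), "tiny.wal")
    db = TinyWAL(path)
    db.set("a", 1)
    db.set("b", 2)
    db.delete("a")
    db.close()  # simulate a process restart
    return TinyWAL(path).state
```

Production engines add checksums, log truncation after checkpoints, and batched fsync calls. This sketch omits all of those and does not repair a torn tail before appending again.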
This project is a comprehensive Java backend engineering guide and technical reference focused on high-concurrency design, distributed systems, and microservices architecture. It provides detailed strategies for decomposing monolithic applications, managing service discovery, and implementing the architectural patterns required for scalable backend environments. The repository distinguishes itself through an extensive collection of big data algorithmic references and database scaling strategies. It covers memory-efficient techniques for analyzing massive datasets, such as Top-K element extrac
Provides mechanisms to record state-changing operations to a sequential disk log for durability and recovery.
LevelDB is an embedded database library and persistent storage engine that provides a sorted key-value store. It uses a log-structured merge-tree architecture to map byte arrays to values, running directly within a process to provide storage without the need for a separate server process. The system is distinguished by its use of custom comparison functions to define key ordering, enabling efficient range scans and sequenced lookups. It ensures data reliability through atomic batch execution, consistent snapshot generation, and log-based recovery after failures. The engine covers broad capab
Implements write-ahead logging to record every mutation for durability and crash recovery.
Qdrant is a high-performance vector similarity database designed to store, index, and search high-dimensional vectors alongside structured metadata. It functions as a distributed search engine that manages large-scale data clusters, providing low-latency retrieval and complex filtering capabilities. The system is built to serve as a specialized middleware layer, connecting machine learning pipelines and AI agents to persistent storage for intelligent information retrieval and recommendation tasks. The platform distinguishes itself through advanced retrieval techniques, including support for h
Records all incoming data modifications in a sequential log to guarantee durability.
RocksDB is a high-performance, embeddable persistent key-value library and storage engine based on Log-Structured Merge-trees. It is designed to provide durable storage for large-scale datasets, integrating directly into applications to manage data on flash and RAM-based hardware. The engine is distinguished by its focus on minimizing read and write amplification through multi-threaded compaction and custom memory allocators. It features specialized optimizations for flash storage, including support for zoned block devices, and provides the ability to extend store behavior via external plugin
Records every mutation to a sequential on-disk log to ensure data durability and recovery after crashes.
InfluxDB is a high-performance time-series database designed for collecting, storing, and querying time-stamped metrics and event data. It functions as a columnar time-series store and a real-time analytics engine, providing a network-accessible interface for retrieving and analyzing temporal records. The system utilizes a specialized columnar storage format to support high ingestion rates and efficient data retrieval. It incorporates a programmable runtime for executing custom plugins and triggers, including integration for processing and transforming incoming data streams. The platform cov
Records every incoming data point to a sequential append-only file to ensure durability.
This project is a reactive, offline-first NoSQL database engine designed for JavaScript applications. It provides a robust framework for managing application state by synchronizing data across browsers, mobile devices, and server-side runtimes. By treating local storage as the primary source of truth, it enables applications to remain functional without network connectivity, automatically reconciling changes with remote backends once a connection is restored. The database distinguishes itself through a modular architecture that supports cross-environment synchronization and high-performance d
Implements optimized write-ahead logging for high-performance read and write operations in IndexedDB.
RocketMQ is a distributed messaging and streaming platform designed for building event-driven applications. It serves as middleware to decouple services using publish-subscribe and request-reply patterns, and functions as a transactional messaging system that ensures atomicity by linking message delivery to local transaction outcomes. The platform includes specialized capabilities as a Kubernetes-native message broker for container orchestration environments and an MQTT broker for ingesting event data from mobile applications and hardware terminals. The system covers high-throughput data str
Utilizes a sequential write-ahead log for high-throughput message persistence and durable state recovery.
Turso is a distributed SQL database platform that provides managed, edge-hosted SQLite instances. It functions as a serverless database provider, enabling the deployment of relational databases that synchronize data across multiple geographic regions to support high availability and performance. The platform distinguishes itself by utilizing a fork of SQLite as its core storage engine, which supports both local file storage and remote network-based replication. It employs an edge-optimized proxy to route queries through a global network, minimizing latency by connecting users to the nearest d
Records database modifications to a sequential log to ensure crash recovery and data durability.
Sonic is a high-performance, lightweight search backend designed to provide real-time full-text search and autocomplete capabilities for applications. It functions as a persistent indexing server that maps text terms to object identifiers, allowing developers to integrate rapid search functionality without storing raw document content directly within the search engine. The system distinguishes itself through a specialized graph-based index that enables real-time word prediction and typo correction. Communication is handled via a custom, low-latency binary protocol over raw TCP sockets, which
Ensures data durability and crash recovery by recording index modifications to a write-ahead log.
PostgreSQL is an object-relational database management system designed for the persistent storage and retrieval of structured information. It functions as an ACID-compliant database server, utilizing standard query language protocols to maintain data consistency and reliability across large-scale application datasets. The system distinguishes itself through an extensible architecture that allows for the definition of custom data types, operators, and indexing methods. It employs multi-version concurrency control to enable simultaneous read and write operations without blocking, supported by a
Records data modifications to a sequential log before applying changes to ensure durability during system failures.
Pino is a high-performance logging library for Node.js applications designed to minimize overhead and prevent blocking the main event loop. It generates machine-readable logs using newline-delimited JSON, facilitating efficient ingestion and analysis by external monitoring and log aggregation platforms. The library distinguishes itself by offloading log processing and formatting to worker threads, ensuring that heavy logging tasks do not impact application responsiveness. It also provides a decoupled command-line utility that transforms structured production logs into human-readable text, sim
Writes log data directly to the output stream to minimize latency and avoid asynchronous event queue overhead.
QuestDB is a high-performance, distributed time-series database designed for the ingestion, storage, and analysis of massive datasets. It functions as a real-time analytics platform that utilizes a columnar storage engine to optimize disk input and output, enabling efficient analytical scans and complex windowing operations on streaming data. The platform distinguishes itself through specialized capabilities for handling asynchronous time-series streams, including advanced join algorithms that align disparate data sets based on precise timestamp lookups. It supports high-volume ingestion thro
Appends incoming data to a sequential log to ensure durability and crash recovery before merging into main storage.
Badger is an embeddable key-value store written in Go that provides persistent data storage for byte keys and values. It is a persistent database that utilizes a tiered LSM tree storage model to optimize disk storage and retrieval efficiency. The system features an ACID transaction engine that ensures data integrity through serializable snapshot isolation and multi-version concurrency control. It also provides an encrypted key-value store with data-at-rest encryption and a managed encrypted key registry to secure stored information. The engine covers a broad set of capabilities including hig
Uses a write-ahead log to record changes for durability and crash recovery.
Litestream is a database backup utility that provides continuous, incremental replication for SQLite databases. It operates as a background process that monitors local database files and streams modifications to remote cloud storage, ensuring that off-site backups are maintained without manual intervention. The tool functions by intercepting the database file system layer to capture page-level changes and tailing the write-ahead log. This approach allows for real-time synchronization of transactions to various cloud object storage providers through a unified abstraction layer. Beyond continu
Captures database modifications by continuously tailing the write-ahead log for real-time replication.
MySQL Server is a relational database management system designed to organize and store structured information. It functions as a comprehensive SQL server platform that provides reliable transactional integrity and high-performance query execution for enterprise data management. The system distinguishes itself through a pluggable storage engine architecture that decouples logical query processing from physical data storage, allowing for specialized handling of diverse workloads. It maintains data consistency and high concurrency through multi-version concurrency control and write-ahead logging
Records all data modifications to a persistent log file before applying changes to the database to guarantee durability during system failures.
Redis is a high-performance in-memory key-value store that functions as a distributed cache, message broker, and NoSQL database. It provides sub-millisecond read and write access to data stored in RAM and can operate as a vector database for indexing high-dimensional embeddings. The system supports a wide range of data storage and synchronization primitives, including the management of strings, hashes, lists, sets, and JSON documents. It enables real-time data operations through atomic transactions, hybrid persistence using snapshots and append-only logs, and high-availability configurations
Implements a durable append-only file that records every write operation to ensure data recovery after failure.
SQLite is a serverless relational database engine and C-based library that stores data in a single local disk file. It functions as an embedded SQL database, integrating directly into applications without the need for a separate server process. The engine includes specialized capabilities for full-text search indexing and spatial data queries using R-Tree structures for geographic or geometric coordinate ranges. The system provides broad support for SQL data manipulation, database recovery and repair, and change tracking to synchronize modifications between databases. It also features a terminal-based interface for database management and configuration.
Employs write-ahead logging to ensure atomic commits and durability by recording changes before applying them to the main database.
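As a small illustration of the SQLite entry, write-ahead logging can be switched on for a file-backed database with the `journal_mode` pragma, here through Python's standard `sqlite3` module. The helper name `enable_wal` and the `kv` table are assumptions for this sketch.

```python
import os
import sqlite3
import tempfile


def enable_wal(db_path):
    """Switch a file-backed SQLite database to WAL mode and do one write.

    Returns the journal mode reported by SQLite.
    """
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns the journal mode that is now in effect.
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        conn.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", ("a", "1"))
        return mode
    finally:
        conn.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "example.db")
    print(enable_wal(path))
```

WAL mode applies to on-disk databases. An in-memory database reports a different journal mode, so the sketch uses a temporary file.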
Cassandra is a distributed NoSQL database and wide-column store designed for high availability and linear scalability. It functions as a fault-tolerant distributed system that utilizes an LSM-tree storage engine to optimize write throughput and manage massive datasets. The system is a CQL-compliant database, using a structured query language to manage and retrieve tabular data stored across multiple nodes. It organizes information into rows and columns based on a flexible schema and primary keys. The project provides capabilities for horizontal database scaling, distributed data partitioning
Implements a commit log to ensure all mutations are persisted to disk before memory updates for crash recovery.
LiteDB is a serverless, embedded NoSQL document database for .NET applications. It persists data into a single portable file, functioning as a BSON data store that resides within the application process rather than running as a separate server. The system is ACID compliant, utilizing write-ahead logging to ensure atomic, consistent, isolated, and durable transactions. It includes built-in encryption to provide secure local data storage and protect files on disk from unauthorized access. The project covers object-document mapping to convert classes into document formats, indexed search capabi
Ensures ACID compliance and data durability by recording mutations to a write-ahead log.
LiteDB is a serverless NoSQL document store and embedded database engine for .NET applications. It persists unstructured documents and binary data into a single standalone disk file, allowing the database to run within the application process rather than as a separate server. The system supports strongly typed queries through Language Integrated Query and allows the execution of standard SQL commands for data retrieval and transformation. It provides native mapping of plain classes into document formats and secures stored information via symmetric-key file encryption. The engine includes cap
Employs write-ahead logging to record mutations and ensure atomic transactions and durability.