5 repositories
Optimizes data processing for execution on a single compute node.
Distinguishing note: Focuses on single-node execution logic, distinct from distributed processing.
5 GitHub repositories matching Data & Databases · Single-Node Processing.
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Runs queries on a single compute node to simplify execution logic and avoid data shuffling overhead.
Tornado is a Python web framework and asynchronous networking library used to build scalable web applications and high-performance servers. It provides a non-blocking HTTP server capable of handling thousands of simultaneous connections. The project functions as a WebSocket server framework, enabling real-time bidirectional communication and persistent connections between clients and servers. It supports the implementation of custom networking protocols and high-performance networking services beyond standard HTTP. Its capabilities cover asynchronous web application development, concurrent A
Uses system calls like epoll and kqueue to manage multiple network sockets within a single process.
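As a rough illustration of multiplexing several sockets in one process, the sketch below uses Python's standard `selectors` module, which picks epoll or kqueue where the platform offers them. It is not Tornado's own event loop; the function name `echo_upper` and the use of local socket pairs in place of real network connections are assumptions made for the example, and payloads are assumed to be non-empty.

```python
import selectors
import socket


def echo_upper(payloads):
    """Serve several socket pairs from one loop; return the uppercased replies."""
    sel = selectors.DefaultSelector()  # epoll, kqueue, or a portable fallback
    clients = []
    for payload in payloads:
        client, server = socket.socketpair()
        server.setblocking(False)
        sel.register(server, selectors.EVENT_READ)
        client.sendall(payload)  # request is queued before the loop starts
        client.settimeout(2.0)
        clients.append(client)

    pending = len(clients)
    while pending:
        events = sel.select(timeout=2.0)
        if not events:
            break  # nothing became readable; give up rather than spin
        for key, _mask in events:
            server = key.fileobj
            data = server.recv(4096)
            server.sendall(data.upper())
            sel.unregister(server)
            server.close()
            pending -= 1

    replies = [client.recv(4096) for client in clients]
    for client in clients:
        client.close()
    sel.close()
    return replies
```

A single thread services every connection here: the selector reports which sockets are readable, and the loop handles only those.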
The Disruptor is a lock-free inter-thread messaging library and high-performance event bus. It implements a concurrent ring buffer designed for high-concurrency and low-latency message sequencing. The project utilizes a specific messaging architecture to eliminate lock contention, enabling high-throughput event routing and the exchange of continuous event streams between threads. It ensures strict first-in-first-out ordering and immediate data visibility across processing threads. The library provides capabilities for lock-free data streaming, sequential data ordering, and sequence-based eve
Uses a sequence claiming mechanism to allow multiple producers to write to the buffer without taking locks.
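The idea of claiming a sequence number before writing can be sketched as follows. This is a minimal Python illustration, not the Disruptor's Java API: the class `ClaimingRingBuffer` and the helper `run_producers` are hypothetical names, and because Python has no atomic fetch-and-add, a short lock stands in for the atomic counter update that a lock-free implementation would use. The sketch also omits wrap-around protection against slow consumers.

```python
import threading


class ClaimingRingBuffer:
    """Producers claim a unique sequence, then write to the slot it maps to."""

    def __init__(self, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self._mask = size - 1
        self._slots = [None] * size
        self._next = 0
        # Stand-in for an atomic fetch-and-add; only the counter is guarded.
        self._claim_lock = threading.Lock()

    def claim(self):
        with self._claim_lock:
            seq = self._next
            self._next += 1
        return seq

    def write(self, seq, value):
        # Each claimed sequence maps to exactly one slot, so writers to
        # different sequences never touch the same slot (until wrap-around).
        self._slots[seq & self._mask] = (seq, value)

    def read(self, seq):
        entry = self._slots[seq & self._mask]
        if entry is None or entry[0] != seq:
            return None
        return entry[1]


def run_producers(buffer, producers, per_producer):
    """Run several producer threads; return every sequence they claimed."""
    claimed = [[] for _ in range(producers)]

    def produce(pid):
        for i in range(per_producer):
            seq = buffer.claim()
            buffer.write(seq, (pid, i))
            claimed[pid].append(seq)

    threads = [threading.Thread(target=produce, args=(pid,))
               for pid in range(producers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [seq for seqs in claimed for seq in seqs]
```

The only point of coordination between producers is the counter; the slot writes themselves need no synchronization between producers because no two producers hold the same sequence.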
TigerBeetle is a distributed financial accounting database designed for high-volume transaction processing. It functions as a specialized transaction engine that enforces strict double-entry bookkeeping invariants, ensuring that every debit and credit is balanced and accounted for with absolute consistency. By utilizing a consensus-based replication model, the system provides high availability and data durability across geographically distributed clusters, making it suitable for mission-critical financial infrastructure. The system distinguishes itself through a performance-oriented architect
Processes all transactions sequentially on a single core to eliminate lock contention and ensure absolute consistency.
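A minimal sketch of sequential double-entry processing, assuming a simplified model in which each transfer debits one account and credits another by the same amount. The `Account` class, the tuple layout, and `apply_transfers` are hypothetical and do not reflect TigerBeetle's actual data model or API.

```python
from dataclasses import dataclass


@dataclass
class Account:
    debits: int = 0
    credits: int = 0


def apply_transfers(accounts, transfers):
    """Apply (debit_id, credit_id, amount) transfers strictly in order.

    Returns the transfers that were rejected. With one thread applying one
    transfer at a time, no locks are needed to keep the ledger balanced.
    """
    rejected = []
    for transfer in transfers:
        debit_id, credit_id, amount = transfer
        valid = (
            amount > 0
            and debit_id != credit_id
            and debit_id in accounts
            and credit_id in accounts
        )
        if not valid:
            rejected.append(transfer)
            continue
        # Both sides of the entry are updated together, so total debits
        # always equal total credits after each step.
        accounts[debit_id].debits += amount
        accounts[credit_id].credits += amount
    return rejected
```

Because every transfer is applied to completion before the next one starts, the balance invariant holds after each step without any cross-thread coordination.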
JCTools is a Java concurrency library providing a collection of lock-less and wait-free data structures. It serves as a toolkit for managing thread-safe data exchange, specifically designed to optimize high-throughput messaging and producer-consumer patterns in multi-threaded applications. The library distinguishes itself by implementing specialized queue structures that minimize contention and maximize throughput. By utilizing techniques such as cache-line padding, memory-barrier-based synchronization, and relaxed-consistency memory ordering, it avoids the performance bottlenecks often assoc
Specializes data structures for single-producer, single-consumer scenarios to eliminate atomic read-modify-write overhead on the fast path.
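The structure behind a single-producer, single-consumer queue can be sketched as a ring buffer in which each index has exactly one writer. This is an illustrative Python sketch, not JCTools code: the class name `SpscQueue` is hypothetical, and Python cannot express the memory-ordering and cache-line padding details that the Java implementations rely on, so only the index-ownership idea is shown. `None` is reserved to signal an empty queue and should not be stored as an item.

```python
class SpscQueue:
    """Bounded queue for exactly one producer thread and one consumer thread."""

    def __init__(self, capacity):
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a power of two")
        self._mask = capacity - 1
        self._slots = [None] * capacity
        self._head = 0  # next index to read; written only by the consumer
        self._tail = 0  # next index to write; written only by the producer

    def offer(self, item):
        """Producer side: return False when the queue is full."""
        if self._tail - self._head > self._mask:
            return False
        self._slots[self._tail & self._mask] = item
        self._tail += 1  # publish only after the slot has been written
        return True

    def poll(self):
        """Consumer side: return None when the queue is empty."""
        if self._head == self._tail:
            return None
        index = self._head & self._mask
        item = self._slots[index]
        self._slots[index] = None  # release the reference
        self._head += 1
        return item
```

Since `_tail` is only ever advanced by the producer and `_head` only by the consumer, neither side needs a compare-and-swap loop to update its index.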