5 repositories
Algorithms for combining sorted sequences into a single sorted sequence using parallel processing.
Distinct from Parallel Processing: focuses on the specific merge operation rather than on general parallel processing.
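To make the merge operation concrete, here is a minimal Python sketch of one common approach: cut the first sequence into slices, find the matching cut points in the second by binary search, merge each pair of slices independently, and concatenate the results. The function name `parallel_merge` and the `workers` parameter are illustrative assumptions and are not taken from any repository listed here. Threads are used only to show the partitioning; in CPython a process pool or native code would be needed for an actual CPU speedup.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def parallel_merge(a, b, workers=4):
    """Merge two sorted lists by splitting them into independent chunks.

    Hypothetical helper for illustration only.
    """
    if not a or not b:
        return list(a) + list(b)
    workers = max(1, min(workers, len(a)))
    # Cut `a` into roughly equal slices.
    a_cuts = [len(a) * i // workers for i in range(workers + 1)]
    # For each interior cut in `a`, binary-search the matching cut in `b`,
    # so every element of chunk k is <= every element of chunk k + 1.
    b_cuts = [0] + [bisect_left(b, a[i]) for i in a_cuts[1:-1]] + [len(b)]

    def merge_chunk(k):
        # Sequential two-way merge of one pair of slices.
        return list(merge(a[a_cuts[k]:a_cuts[k + 1]],
                          b[b_cuts[k]:b_cuts[k + 1]]))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so concatenation stays sorted.
        chunks = pool.map(merge_chunk, range(workers))
        return [x for chunk in chunks for x in chunk]
```

Because the chunks cover disjoint value ranges, no coordination between workers is needed beyond the final concatenation.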
Explore 5 awesome GitHub repositories matching data & databases · Parallel Data Merging.
RxJava is a reactive stream processing framework and JVM reactive extensions library. It serves as an asynchronous dataflow orchestrator used to compose event-based programs by transforming, combining, and consuming real-time data flows on the Java Virtual Machine. The project distinguishes itself through integrated backpressure flow control, which manages the emission rate between producers and consumers to prevent memory exhaustion. It further provides mechanisms for concurrent thread management and parallel data processing to offload blocking operations and maintain application responsiven
Supports executing independent data flows in parallel and merging the results back into a single sequence.
Taskflow is a C++ task-parallel framework designed to build high-performance parallel workflows and complex dependency graphs. It provides a programming model that organizes computational work into directed acyclic graphs, enabling developers to manage concurrency, resource scheduling, and task dependencies across multi-core CPUs and GPU accelerators. The framework distinguishes itself through its ability to orchestrate heterogeneous systems, allowing for the integration of hardware-accelerated kernels and memory operations into unified execution pipelines. It supports dynamic runtime subflow
Combines two sorted sequences into a single sorted sequence using parallel processing to improve throughput.
Metaflow is a Python machine learning framework and MLOps workflow orchestrator designed to manage the lifecycle of data pipelines from local prototyping to production. It serves as a distributed compute manager and an experiment tracking system, enabling the creation of reproducible pipelines that transition between development and high-availability production environments. The framework distinguishes itself through an integrated checkpointing system that automatically persists intermediate data artifacts to remote storage, allowing failed runs to be resumed from the last successful step. It
Resolves and propagates data artifacts from multiple parallel branches into a single join step.
Reactor Core is a reactive programming toolkit and a non-blocking foundation for composing asynchronous data pipelines on the JVM. It serves as an asynchronous stream processing framework and backpressure management system, allowing developers to transform, filter, and combine event sequences while regulating the data flow between producers and consumers to prevent resource exhaustion. The library distinguishes itself through a sophisticated concurrency scheduling system and demand-based flow control. It decouples signal processing from specific threads using a scheduler registry and provides mechanisms for propagating immutable, context-aware metadata across asynchronous boundaries. It also includes specialized tools for capturing assembly-time traces and for virtual-time scheduling to facilitate testing of time-based operators. The project covers a wide range of capabilities, including functional data processing for sequence aggregation and windowing, a variety of error recovery strategies such as retries with exponential backoff, and utilities for bridging legacy callback-based or synchronous APIs into reactive streams. It also offers tools for pipeline monitoring and a suite of testing tools for verifying signal sequences.
Combines results from multiple parallel processing rails back into a single sequential stream.
ZIO is a functional effect system for the JVM that models asynchronous and concurrent programs as pure, composable values with typed error handling and dependency injection. Its core identity is built on fiber-based concurrency, where lightweight, non-blocking fibers execute millions of concurrent tasks with structured lifecycle management, and a dual-channel error model that separates expected business failures from unexpected system defects at compile time. The system provides effect-typed dependency injection through a layer-based dependency graph, pull-based reactive stream processing with
Maps each element to a new channel and runs all inner channels concurrently, merging their outputs.
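Several of the entries above describe the same fan-out-and-merge shape: start one concurrent stream per input, then funnel every output into a single sequence. The sketch below illustrates that shape in plain Python with `asyncio`. It is not the API of RxJava, Reactor Core, ZIO, or any other library listed here, and the names `flat_map_merge` and `inner_stream` are hypothetical.

```python
import asyncio


async def inner_stream(x):
    """Hypothetical inner stream: emits two values derived from x."""
    for step in range(2):
        await asyncio.sleep(0)  # yield control so streams can interleave
        yield (x, step)


async def flat_map_merge(items, make_stream):
    """Run one inner stream per item concurrently and merge their outputs."""
    queue = asyncio.Queue()
    done = object()  # sentinel marking the end of one inner stream

    async def drain(item):
        # Forward every value of this inner stream into the shared queue.
        async for value in make_stream(item):
            await queue.put(value)
        await queue.put(done)

    tasks = [asyncio.create_task(drain(item)) for item in items]
    merged, remaining = [], len(tasks)
    while remaining:
        value = await queue.get()
        if value is done:
            remaining -= 1
        else:
            merged.append(value)
    await asyncio.gather(*tasks)  # surface any exception from the workers
    return merged


if __name__ == "__main__":
    print(asyncio.run(flat_map_merge([1, 2, 3], inner_stream)))
```

The merged order depends on how the inner streams interleave, so callers that need a sorted result would still have to apply a merge step afterwards.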