3 repositories
Systems for maintaining per-key state and timers to enable complex windowing and aggregation in data pipelines.
Distinct from State-Synchronized Timers: that category covers UI animations and generic runtime timers, not distributed data processing state or event-time timers.
Explore 3 awesome GitHub repositories matching data & databases · Stateful Processing Backends.
Storm is a distributed stream processing framework and fault-tolerant compute engine designed for executing real-time continuous computations across a cluster of machines. It functions as a stateful stream processor and cluster topology manager, enabling the deployment and monitoring of distributed data flow configurations. The system can provide exactly-once semantics by using transactional state management, so that each message in a data stream updates the stored state exactly one time. It further operates as a distributed RPC system, allowing for the integration of non-native languages throu
Maintains real-time state and serves distributed queries to enable stateful stream processing.
Apache Beam is a distributed data pipeline framework and unified data processing model designed to handle both bounded batch data and unbounded real-time streams. It provides a system for building scalable, data-parallel workflows that operate across compute clusters using a single programming model. The framework utilizes a cross-runner pipeline abstraction that decouples the data processing logic from the underlying execution backend, allowing the same pipeline to run on different distributed compute engines. It supports multi-language pipeline development by translating high-level code fro
Apache Beam implements stateful processing and event-time timers to handle complex windowing and aggregation logic.
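To make the idea of per-key state combined with event-time timers concrete, here is a minimal pure-Python sketch. It is not Apache Beam's API: the class `KeyedWindowSum`, its methods, and the fixed-window size are hypothetical names chosen for illustration, and late data is simply dropped. Each element adds to a running sum held in state for its key and window, and a timer set at the window end fires once the watermark passes it.

```python
from collections import defaultdict


class KeyedWindowSum:
    """Toy per-key state with event-time timers (illustrative, not Beam's API)."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.state = defaultdict(int)   # (key, window_end) -> running sum
        self.timers = set()             # (window_end, key) timers waiting to fire
        self.watermark = float("-inf")  # event time up to which input is complete

    def process(self, key, value, event_time):
        # Assign the element to the fixed window that contains its timestamp.
        window_end = (event_time // self.window_size + 1) * self.window_size
        if window_end <= self.watermark:
            return  # assumption of this sketch: late data is dropped
        self.state[(key, window_end)] += value
        self.timers.add((window_end, key))

    def advance_watermark(self, new_watermark):
        # Fire every timer whose window has closed, in timestamp order,
        # emitting the aggregate and clearing the state for that key and window.
        self.watermark = max(self.watermark, new_watermark)
        due = sorted(t for t in self.timers if t[0] <= self.watermark)
        fired = []
        for window_end, key in due:
            self.timers.remove((window_end, key))
            fired.append((key, window_end, self.state.pop((key, window_end))))
        return fired
```

Feeding a few elements through `process` and then calling `advance_watermark(10)` returns one `(key, window_end, sum)` tuple for every key that had data in the window ending at 10. A real runner would also checkpoint this state and persist the timers so they survive failures.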
Hazelcast is a distributed data platform that combines an in-memory data grid with a stream processing engine to support real-time analytics and event-driven applications. It functions as a partitioned, distributed key-value store that replicates data across cluster nodes to provide low-latency access and high availability. The platform also serves as a distributed SQL query engine, allowing users to execute standard SQL statements against both in-memory datasets and external data sources. What distinguishes Hazelcast is its use of a distributed consensus subsystem to maintain strongly consis
Maps events with the same key to the same processing task to ensure consistent stateful aggregation.
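The routing rule described above can be sketched with a stable hash of the key taken modulo the number of tasks. This is a simplified illustration, not Hazelcast's actual partitioning scheme; `task_for_key` and `aggregate` are hypothetical helper names, and `zlib.crc32` is used only because it gives the same result in every process.

```python
import zlib
from collections import defaultdict


def task_for_key(key, num_tasks):
    # crc32 is stable across processes, unlike Python's built-in hash() for str,
    # so every producer routes a given key to the same task index.
    return zlib.crc32(key.encode("utf-8")) % num_tasks


def aggregate(events, num_tasks):
    # One private state dict per task; a key's running sum lives in exactly one.
    task_state = [defaultdict(int) for _ in range(num_tasks)]
    for key, value in events:
        task_state[task_for_key(key, num_tasks)][key] += value
    return task_state
```

Because every event for a key lands on one task, that task can update the key's aggregate without coordinating with the others. The trade-off is that changing `num_tasks` remaps keys, so real systems typically hash to a fixed number of partitions and then assign partitions to nodes.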