4 repositories
Execution of differential computations like aggregations and joins to maintain up-to-date streaming views.
Distinct from Incremental Data Streaming: Focuses on the execution of differential logic (joins/aggs) rather than just memory-efficient streaming of data.
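To make "differential logic" concrete, here is a minimal sketch of an aggregation maintained from row-level deltas rather than recomputed from scratch. The class name `IncrementalSumByKey` and the `(key, value, diff)` delta shape are illustrative assumptions, not the API of any repository listed here.

```python
from collections import defaultdict


class IncrementalSumByKey:
    """Maintains SUM(value) per key from a stream of insert/retract deltas.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        self.sums = defaultdict(int)
        self.counts = defaultdict(int)

    def apply(self, key, value, diff):
        # diff is +1 for an inserted row and -1 for a retracted row.
        self.sums[key] += diff * value
        self.counts[key] += diff
        if self.counts[key] == 0:
            # No rows left for this key, so drop it from the view.
            del self.sums[key]
            del self.counts[key]

    def view(self):
        # Snapshot of the current aggregate per key.
        return dict(self.sums)


agg = IncrementalSumByKey()
agg.apply("a", 5, +1)
agg.apply("a", 3, +1)
agg.apply("b", 2, +1)
agg.apply("a", 5, -1)  # retract an earlier row; only key "a" is touched
```

Each delta costs constant work regardless of how many rows the view has already absorbed, which is the property that separates this from simply streaming data through memory-efficient buffers.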
Category: data & databases · Incremental Computation.
RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats. The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independen
Executes incremental aggregations and joins to maintain real-time views of streaming data.
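As a rough illustration of an incrementally maintained join, the sketch below keeps an index per side and emits only the output changes caused by each incoming row. It is not RisingWave's implementation; `IncrementalJoin`, `on_left`, and `on_right` are hypothetical names chosen for this example.

```python
from collections import defaultdict


class IncrementalJoin:
    """Equi-join of two streams, maintained from per-row deltas (illustrative)."""

    def __init__(self):
        self.left = defaultdict(list)   # key -> rows seen on the left side
        self.right = defaultdict(list)  # key -> rows seen on the right side

    def on_left(self, key, row, diff=1):
        # A left delta joins only against matching right rows.
        out = [(key, row, other, diff) for other in self.right.get(key, [])]
        self._store(self.left, key, row, diff)
        return out

    def on_right(self, key, row, diff=1):
        # A right delta joins only against matching left rows.
        out = [(key, other, row, diff) for other in self.left.get(key, [])]
        self._store(self.right, key, row, diff)
        return out

    @staticmethod
    def _store(index, key, row, diff):
        if diff > 0:
            index[key].append(row)
        else:
            # Retracting a row that was never inserted raises ValueError.
            index[key].remove(row)
            if not index[key]:
                del index[key]


join = IncrementalJoin()
first = join.on_left("u1", "alice")        # no right rows yet
second = join.on_right("u1", "order-1")    # emits one joined row
third = join.on_right("u1", "order-2")     # emits one more joined row
fourth = join.on_left("u1", "alice", -1)   # retracts both joined rows
```

The returned tuples are output deltas: a downstream view applies them with their `diff` sign instead of re-running the join over both inputs.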
Fast n-dimensional filtering and grouping of records.
Computes histograms and top-K lists incrementally as filter conditions change, avoiding full recomputation.
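One way to picture this: keep records sorted by the filtered dimension, and when the filter range moves, adjust the histogram only for records that enter or leave the range. The sketch below assumes a single range filter and a single grouping dimension; `FilteredHistogram` and its methods are hypothetical and not taken from the listed library.

```python
import bisect
from collections import Counter


class FilteredHistogram:
    """Histogram over one dimension, updated as a range filter on another moves."""

    def __init__(self, records):
        # records: iterable of (filter_value, group_value) pairs.
        self.records = sorted(records)
        self.keys = [r[0] for r in self.records]
        self.lo = self.hi = 0  # index range [lo, hi) currently passing the filter
        self.hist = Counter()

    def set_filter(self, low, high):
        """Keep records with low <= filter_value < high; return records touched."""
        new_lo = bisect.bisect_left(self.keys, low)
        new_hi = max(new_lo, bisect.bisect_left(self.keys, high))
        # Index spans in the old range but not the new one, and vice versa.
        leaving = [range(self.lo, min(self.hi, new_lo)),
                   range(max(self.lo, new_hi), self.hi)]
        entering = [range(new_lo, min(new_hi, self.lo)),
                    range(max(new_lo, self.hi), new_hi)]
        touched = 0
        for span in leaving:
            for i in span:
                group = self.records[i][1]
                self.hist[group] -= 1
                if self.hist[group] == 0:
                    del self.hist[group]
                touched += 1
        for span in entering:
            for i in span:
                self.hist[self.records[i][1]] += 1
                touched += 1
        self.lo, self.hi = new_lo, new_hi
        return touched

    def top_k(self, k):
        # Top-K groups by count among records passing the filter.
        return self.hist.most_common(k)


h = FilteredHistogram([(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "a")])
touched_first = h.set_filter(1, 4)   # all three matching records are new
touched_second = h.set_filter(2, 6)  # one record leaves, two enter
```

Sliding the filter touches only the records at the edges of the range, so small filter adjustments stay cheap even when many records pass the filter.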
Cocoindex is an incremental data processing engine that builds and maintains live indexes for AI agents, with a core focus on codebase indexing and knowledge graph extraction. The engine uses a function-graph execution model where user-defined Python functions are composed into a directed acyclic graph, and it processes data incrementally so only changed source records or code paths are re-computed, avoiding full recomputation at any scale. It supports automatic schema inference from transformation pipeline type annotations and provides full data lineage tracing, tagging every output record wi
Processes data changes incrementally so only modified content is re-computed, keeping large corpora fresh without full recomputation.
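A common way to achieve this kind of selective recomputation is to fingerprint each source record and re-run the transformation only when the fingerprint changes. The sketch below illustrates that idea under stated assumptions; it does not use or reproduce Cocoindex's API, and `IncrementalIndex` and `sync` are hypothetical names.

```python
import hashlib


class IncrementalIndex:
    """Re-runs `transform` only for sources whose content hash changed."""

    def __init__(self, transform):
        self.transform = transform
        self.fingerprints = {}  # source key -> content hash at last sync
        self.outputs = {}       # source key -> derived output

    def sync(self, sources):
        """sources: dict of key -> text. Returns the keys that were recomputed."""
        recomputed = []
        for key, content in sources.items():
            fp = hashlib.sha256(content.encode("utf-8")).hexdigest()
            if self.fingerprints.get(key) != fp:
                self.outputs[key] = self.transform(content)
                self.fingerprints[key] = fp
                recomputed.append(key)
        # Drop outputs whose source has disappeared.
        for key in list(self.outputs):
            if key not in sources:
                del self.outputs[key]
                del self.fingerprints[key]
        return recomputed


index = IncrementalIndex(str.upper)  # stand-in for an expensive transformation
run1 = index.sync({"a.py": "x", "b.py": "y"})
run2 = index.sync({"a.py": "x", "b.py": "z"})  # only b.py changed
```

The work per sync is proportional to the number of changed records plus a cheap hash pass, rather than to the cost of transforming the whole corpus.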
Stumpy is a Python library for scalable time series analysis, centered on implementations of matrix profile algorithms. It provides a framework for computing distance profiles to identify repeated patterns and anomalies in time series data. The project is distinguished by its ability to scale heavy computations across GPU hardware and distributed clusters using Dask. It supports multidimensional analysis for discovering motifs across concurrent data streams and offers incremental computation for real-time streaming analysis. The library covers a broad range of time series mining techniques, including motif discovery, anomaly detection, and sequence pattern matching. It also provides tools for semantic segmentation to detect regime changes and for extracting temporally ordered chains of similar subsequence patterns.
Calculates matrix profiles incrementally as new data arrives to monitor time series in real time.
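To show what updating a matrix profile incrementally involves, here is a deliberately naive pure-Python sketch: when a new point arrives, only the newest subsequence is compared against the earlier ones, and existing profile entries are lowered where the new subsequence is a closer match. It does not use Stumpy's API and is far slower than an optimized implementation; `NaiveStreamingProfile` and the exclusion-zone choice of `max(1, m // 4)` are assumptions made for this example.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; constant subsequences map to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(x - mean) / std for x in seq]


class NaiveStreamingProfile:
    """Nearest-neighbor distance per subsequence, updated one point at a time."""

    def __init__(self, m):
        self.m = m                    # subsequence (window) length
        self.excl = max(1, m // 4)    # assumed exclusion zone for trivial matches
        self.values = []
        self.subs = []                # z-normalized subsequences seen so far
        self.profile = []             # distance to nearest non-trivial neighbor

    def update(self, x):
        self.values.append(x)
        if len(self.values) < self.m:
            return
        new = znorm(self.values[-self.m:])
        j = len(self.subs)
        best = math.inf
        for i, old in enumerate(self.subs):
            if j - i <= self.excl:
                continue  # skip overlapping neighbors next to the new window
            d = math.dist(new, old)
            best = min(best, d)
            if d < self.profile[i]:
                self.profile[i] = d  # the new window is a closer match for i
        self.subs.append(new)
        self.profile.append(best)


stream = NaiveStreamingProfile(m=3)
for point in [0, 1, 0, 5, 2, 0, 1, 0]:
    stream.update(point)
```

In this toy stream the pattern `0, 1, 0` appears at the start and at the end, so those two subsequences become each other's nearest neighbors once the final point arrives.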