4 repositories
Execution of differential computations like aggregations and joins to maintain up-to-date streaming views.
Distinct from Incremental Data Streaming: focuses on executing differential logic (joins and aggregations) rather than only streaming data in a memory-efficient way.
RisingWave is a cloud-native streaming database and real-time analytics engine that uses standard SQL to process continuous data streams. It functions as a streaming data lakehouse, combining the capabilities of a streaming SQL database with a platform that integrates streaming ingestion with open table formats. The system is distinguished by its use of the PostgreSQL wire protocol, allowing it to integrate with existing SQL tools and drivers. It employs a decoupled compute and storage architecture, persisting streaming state and materialized views in cloud object storage to enable independen
Executes incremental aggregations and joins to maintain real-time views of streaming data.
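To make the idea of an incrementally maintained aggregation concrete, here is a minimal sketch in plain Python. It is not RisingWave's implementation or API: the class `IncrementalSumView` and its `apply` and `snapshot` methods are hypothetical names chosen for illustration, and the sketch covers a grouped SUM only, not joins. Each change carries a `diff` of +1 (insert) or -1 (delete), and the view is updated from that delta alone rather than by rescanning the input.

```python
from collections import defaultdict


class IncrementalSumView:
    """Hypothetical view: SUM(amount) GROUP BY key, maintained from deltas."""

    def __init__(self):
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def apply(self, key, amount, diff):
        """Apply one change; diff is +1 for an insert and -1 for a delete."""
        self.sums[key] += diff * amount
        self.counts[key] += diff
        if self.counts[key] == 0:
            # The group has no rows left, so it disappears from the view.
            del self.sums[key]
            del self.counts[key]

    def snapshot(self):
        """Return the current contents of the view as a plain dict."""
        return dict(self.sums)


view = IncrementalSumView()
changes = [
    ("eu", 10.0, +1),
    ("us", 5.0, +1),
    ("eu", 2.5, +1),
    ("eu", 10.0, -1),  # retract the first "eu" row
]
for key, amount, diff in changes:
    view.apply(key, amount, diff)

print(view.snapshot())
```

Each update costs constant time regardless of how many rows have been seen, which is the property that distinguishes incremental view maintenance from re-running the query.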
Fast n-dimensional filtering and grouping of records.
Computes histograms and top-K lists incrementally as filter conditions change, avoiding full recomputation.
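The following Python sketch illustrates one way such incremental histograms can work. It is an assumption-laden illustration, not the listed library's code: `FilteredHistogram`, `set_range`, and `top` are hypothetical names. Records are sorted once by the filtered field, so a range filter corresponds to a slice of that order. When the range changes, only the rows entering or leaving the slice adjust the group counts.

```python
from bisect import bisect_left
from collections import Counter


class FilteredHistogram:
    """Hypothetical sketch: group counts kept in sync with a range filter."""

    def __init__(self, records, filter_field, group_field):
        # Sort once by the filtered field so a range filter maps to a slice.
        self.rows = sorted(records, key=lambda r: r[filter_field])
        self.keys = [r[filter_field] for r in self.rows]
        self.group_field = group_field
        self.lo, self.hi = 0, len(self.rows)  # selected slice: everything
        self.counts = Counter(r[group_field] for r in self.rows)

    def set_range(self, low, high):
        """Select rows with low <= value < high; return how many rows flipped."""
        a, b = self.lo, self.hi
        c = bisect_left(self.keys, low)
        d = max(bisect_left(self.keys, high), c)
        # Rows that were selected and no longer are, and the reverse.
        removed = list(range(a, min(b, c))) + list(range(max(a, d), b))
        added = list(range(c, min(d, a))) + list(range(max(c, b), d))
        for i in removed:
            self.counts[self.rows[i][self.group_field]] -= 1
        for i in added:
            self.counts[self.rows[i][self.group_field]] += 1
        self.lo, self.hi = c, d
        return len(removed) + len(added)

    def top(self, k):
        """Return up to k (group, count) pairs with the largest non-zero counts."""
        return [(g, n) for g, n in self.counts.most_common(k) if n > 0]


records = [
    {"price": 5, "category": "book"},
    {"price": 12, "category": "book"},
    {"price": 20, "category": "toy"},
    {"price": 35, "category": "toy"},
    {"price": 40, "category": "game"},
    {"price": 8, "category": "toy"},
]
hist = FilteredHistogram(records, "price", "category")
flipped_first = hist.set_range(10, 40)   # three rows leave the selection
top_first = hist.top(2)
flipped_second = hist.set_range(10, 30)  # narrowing the range flips one row
top_second = hist.top(2)
```

Narrowing the range from `[10, 40)` to `[10, 30)` touches a single row, so the cost of a filter change tracks the number of affected rows rather than the size of the dataset.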
Cocoindex is an incremental data processing engine that builds and maintains live indexes for AI agents, with a core focus on codebase indexing and knowledge graph extraction. The engine uses a function-graph execution model where user-defined Python functions are composed into a directed acyclic graph, and it processes data incrementally so only changed source records or code paths are re-computed, avoiding full recomputation at any scale. It supports automatic schema inference from transformation pipeline type annotations and provides full data lineage tracing, tagging every output record wi
Processes data changes incrementally so only modified content is re-computed, keeping large corpora fresh without full recomputation.
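A common way to recompute only what changed is to fingerprint each source record and skip records whose fingerprint is unchanged. The sketch below shows that technique in plain Python. It does not use or reproduce the Cocoindex API: `IncrementalIndex`, `sync`, and the toy `transform` function are hypothetical and exist only for illustration.

```python
import hashlib


class IncrementalIndex:
    """Hypothetical sketch: re-run a transform only for changed sources."""

    def __init__(self, transform):
        self.transform = transform
        self.fingerprints = {}  # source id -> content hash from the last sync
        self.outputs = {}       # source id -> derived output

    def sync(self, sources):
        """Bring outputs up to date; return the ids that were recomputed."""
        recomputed = []
        for source_id, text in sources.items():
            fingerprint = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.fingerprints.get(source_id) != fingerprint:
                # New or modified content: recompute just this record.
                self.outputs[source_id] = self.transform(text)
                self.fingerprints[source_id] = fingerprint
                recomputed.append(source_id)
        # Drop outputs whose source no longer exists.
        for source_id in list(self.fingerprints):
            if source_id not in sources:
                del self.fingerprints[source_id]
                del self.outputs[source_id]
        return recomputed


def count_tokens(text):
    """Toy transform standing in for an expensive indexing step."""
    return len(text.split())


index = IncrementalIndex(count_tokens)
first = index.sync({"a.py": "def f(): pass", "b.py": "x = 1"})
second = index.sync({"a.py": "def f(): pass", "b.py": "x = 2"})
third = index.sync({"a.py": "def f(): pass"})
```

The first sync processes both files, the second reprocesses only `b.py`, and the third recomputes nothing while removing the output for the deleted file.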
Stumpy is a Python library for scalable time series analysis, centered on implementations of matrix profile algorithms. It provides a framework for computing distance profiles to identify repeated patterns and anomalies in time series data. The project stands out for its ability to scale heavy computations across GPU hardware and distributed clusters using Dask. It supports multidimensional analysis to discover motifs across concurrent data streams and offers incremental computation for real-time streaming analysis. The library covers a wide range of time series mining techniques, including motif discovery, anomaly detection, and sequence pattern matching. It also provides tools for semantic segmentation to detect regime changes and for extracting temporally ordered chains of similar subsequence patterns.
Calculates matrix profiles incrementally as new data arrives to monitor time series in real time.
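The sketch below shows the general idea of updating a matrix profile as points arrive, using only the Python standard library. It is a naive illustration, not Stumpy's algorithm: `StreamingMatrixProfile` is a hypothetical class, and the exclusion zone of `ceil(m / 4)` is an assumption made here to skip trivial matches between overlapping windows. Stumpy itself exposes a streaming object for this purpose (`stumpy.stumpi`, which has an `update` method), and that should be preferred in practice.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]


class StreamingMatrixProfile:
    """Hypothetical naive sketch of an incrementally updated matrix profile."""

    def __init__(self, m):
        self.m = m
        self.excl = max(1, math.ceil(m / 4))  # assumed exclusion zone
        self.values = []
        self.subs = []     # z-normalized subsequences seen so far
        self.profile = []  # nearest-neighbor distance for each subsequence

    def update(self, x):
        """Append one point and compare only the newest window to older ones."""
        self.values.append(x)
        if len(self.values) < self.m:
            return
        new = znorm(self.values[-self.m:])
        j = len(self.subs)
        best = math.inf
        for i, old in enumerate(self.subs):
            if j - i <= self.excl:
                continue  # too close in time: a trivial match
            d = math.dist(new, old)
            best = min(best, d)
            if d < self.profile[i]:
                self.profile[i] = d  # the new window is a closer neighbor
        self.subs.append(new)
        self.profile.append(best)


mp = StreamingMatrixProfile(m=4)
for value in [0.0, 1.0, 2.0, 1.0] * 3:
    mp.update(value)
mp.update(9.0)  # a spike that breaks the repeating pattern
```

Each arriving point costs one pass over the existing windows instead of a full pairwise recomputation. In this example the repeating windows keep a profile value of zero, while the window containing the spike receives the largest value, which is how anomalies surface in a matrix profile.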