Why is umami-software/umami a recommended Event Data Processing repository?

Allows overriding default event timestamps to ensure accurate historical data reporting.

Why is plausible/analytics a recommended Event Data Processing repository?

Validates event ingestion by inspecting request headers and debug responses to ensure accurate data processing.

Why is vectordotdev/vector a recommended Event Data Processing repository?

Records the precise time of occurrence for each metric event to maintain accurate temporal ordering and historical analysis.

Why is dbt-labs/dbt-core a recommended Event Data Processing repository?

Specifies occurrence times for data records to enable incremental processing and advanced dataset comparison.

Why is any86/any-rule a recommended Event Data Processing repository?

Provides a regex pattern to validate timestamps in YYYYMMDD HH:mm:ss format.

Why is falcosecurity/falco a recommended Event Data Processing repository?

Processes raw event content prior to field extraction to prepare data for the detection engine.

Why is Yelp/elastalert a recommended Event Data Processing repository?

Enables specifying the field and format for event timing to adjust query delays for non-real-time data.

Why is MystenLabs/sui a recommended Event Data Processing repository?

Converts raw binary blockchain event data into strongly-typed Rust structures for processing.

Why is hazelcast/hazelcast a recommended Event Data Processing repository?

Defines event occurrence times using source timestamps or ingestion time to manage temporal processing.

Why is countly/countly-server a recommended Event Data Processing repository?

Attaches a unique millisecond timestamp, local hour, day of week, and timezone offset to each event.

14 repositories

Awesome GitHub Repositories: Event Data Processing

Tools for handling, timestamping, and normalizing incoming event streams for analytics.

Distinguishing note: Focuses on the ingestion and temporal normalization of event data rather than general database management.

Explore 14 awesome GitHub repositories matching data & databases · Event Data Processing. Refine with filters or upvote what's useful.


umami-software/umami
37,285 · View on GitHub
Umami is a self-hosted, privacy-focused web analytics platform designed to provide full control over infrastructure and user data. It captures website traffic and visitor behavior through anonymous tracking methods that avoid cookies, browser fingerprinting, and the storage of personally identifiable information. The platform distinguishes itself through a comprehensive suite of behavioral analysis tools, including session replays, heatmaps, and cohort-based retention reporting. It features a multi-tenant architecture that allows teams to manage multiple websites within a single, collaborativ
Allows overriding default event timestamps to ensure accurate historical data reporting.
TypeScript · analytics · audience-segmentation · charts
plausible/analytics
24,245 · View on GitHub
This project is an open-source, privacy-focused web analytics platform designed for high-throughput data ingestion and multi-tenant data management. It provides a cookie-less tracking engine that captures visitor interactions using ephemeral request metadata, ensuring comprehensive traffic visibility while maintaining strict privacy standards. The architecture utilizes an event-driven ingestion pipeline and aggregated metric storage to decouple data collection from processing, enabling efficient long-term retrieval and responsive dashboard performance. What distinguishes this platform is its
Validates event ingestion by inspecting request headers and debug responses to ensure accurate data processing.
Elixir · analytics · charts · clickhouse
vectordotdev/vector
22,071 · View on GitHub
Vector is a high-performance observability data pipeline designed to collect, transform, and route logs, metrics, and traces across distributed infrastructure. It functions as a modular engine that decouples data ingestion from processing and transmission, utilizing a component-based architecture to connect diverse sources to multiple destinations. The project distinguishes itself through a focus on reliability and flow control. It implements backpressure-aware data movement to prevent data loss during traffic spikes and utilizes disk-backed event buffering to ensure durability during network
Records the precise time of occurrence for each metric event to maintain accurate temporal ordering and historical analysis.
Rust · events · forwarder · hacktoberfest
dbt-labs/dbt-core
13,051 · View on GitHub
dbt-core is a command-line framework for transforming data within a warehouse using modular SQL and version control. It functions as a data transformation engine that enables users to define data structures and business logic through declarative configuration files, which the system then compiles into executable code. By managing complex data dependencies through a directed acyclic graph, it ensures that transformation tasks execute in the correct order while maintaining a manifest-driven state to track lineage and execution history. The project distinguishes itself through an adapter-based d
Specifies occurrence times for data records to enable incremental processing and advanced dataset comparison.
Rust · analytics · business-intelligence · data-modeling
any86/any-rule
8,662 · View on GitHub
Any-rule is a multi-platform regular expression tool that provides a curated catalog of over 70 ready-to-use patterns for validating and extracting common data formats. The project separates its static regex collection from editor-specific plugins, allowing the same pattern library to be accessed through VS Code, IntelliJ IDEA, Alfred Workflow, and a web interface. The tool enables keyword-based pattern retrieval, letting users search for the correct regex by typing descriptive terms rather than remembering exact syntax. It covers a broad range of validation needs including email addresses, U
Provides a regex pattern to validate timestamps in YYYYMMDD HH:mm:ss format.
TypeScript · awsome · express · regex
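To make the any-rule highlight concrete, here is a minimal Python sketch of validating a timestamp in YYYYMMDD HH:mm:ss form. The pattern and the helper name `is_valid_timestamp` are assumptions written for this illustration; they are not copied from any-rule's catalog. The regex only checks the shape, so the sketch adds a calendar check with `datetime.strptime` to reject impossible dates.

```python
import re
from datetime import datetime

# Hypothetical shape check written for this sketch (not any-rule's own pattern):
# eight digits, a space, then HH:mm:ss.
TIMESTAMP_SHAPE = re.compile(r"[0-9]{8} [0-9]{2}:[0-9]{2}:[0-9]{2}")


def is_valid_timestamp(text: str) -> bool:
    """Return True if text is a real calendar time in YYYYMMDD HH:mm:ss form."""
    if TIMESTAMP_SHAPE.fullmatch(text) is None:
        return False
    try:
        # strptime rejects values the regex cannot, such as month 13 or hour 24.
        datetime.strptime(text, "%Y%m%d %H:%M:%S")
    except ValueError:
        return False
    return True
```

Splitting the check in two keeps the regex short and leaves leap-year and month-length rules to the standard library.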
falcosecurity/falco
8,670 · View on GitHub
Falco is an eBPF runtime security monitor and cloud native detection engine that identifies abnormal behavior and security threats across hosts and containers. It functions as a Linux kernel event auditor, capturing system calls and kernel events in real-time to detect malicious activity. The system distinguishes itself through a rule-based threat detection model that evaluates system activity against a library of community-maintained rules and custom security definitions. It enriches raw kernel events with container and Kubernetes metadata to provide observability into isolated environments
Processes raw event content prior to field extraction to prepare data for the detection engine.
C++ · cloud-native · cncf · cncf-project
Yelp/elastalert
7,994 · View on GitHub
ElastAlert is an alerting framework and query monitor for Elasticsearch. It functions as a real-time log monitoring tool and event notification engine that scans indices for specific patterns to trigger automated alerts when predefined rules are matched. The system distinguishes itself through specialized detection logic, including event spike detection, event frequency monitoring, field change tracking, and the identification of new terms within data fields. It handles notification noise via stateful alert suppression to prevent redundant messages and provides time-windowed aggregation to gr
Enables specifying the field and format for event timing to adjust query delays for non-real-time data.
Python
MystenLabs/sui
7,612 · View on GitHub
Sui is a blockchain platform featuring an object-centric state model and resource-oriented smart contracts. It utilizes parallel transaction execution to increase network throughput and supports programmable transaction blocks that bundle multiple operations into single atomic units. The platform distinguishes itself with a capability-based access control system and zero-knowledge login mechanisms, enabling users to authenticate via identity providers without seed phrases. It also implements deterministic object addressing to allow predictable state lookups and supports the creation of soulbo
Converts raw binary blockchain event data into strongly-typed Rust structures for processing.
Rust · blockchain · distributed-ledger-technology · move
hazelcast/hazelcast
6,570 · View on GitHub
Hazelcast is a distributed data platform that combines an in-memory data grid with a stream processing engine to support real-time analytics and event-driven applications. It functions as a partitioned, distributed key-value store that replicates data across cluster nodes to provide low-latency access and high availability. The platform also serves as a distributed SQL query engine, allowing users to execute standard SQL statements against both in-memory datasets and external data sources. What distinguishes Hazelcast is its use of a distributed consensus subsystem to maintain strongly consis
Defines event occurrence times using source timestamps or ingestion time to manage temporal processing.
Java · big-data · caching · data-in-motion
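The choice described in the Hazelcast entry, between a timestamp carried by the source and the time of ingestion, can be sketched in a few lines of Python. This is not Hazelcast's API: the function name `assign_event_time`, the `ts` field, and the injectable `now_ms` clock are all hypothetical names chosen for the illustration.

```python
import time
from typing import Any, Callable, Mapping, Optional


def assign_event_time(
    event: Mapping[str, Any],
    timestamp_field: str = "ts",  # hypothetical field name
    now_ms: Optional[Callable[[], int]] = None,
) -> int:
    """Pick the event time in epoch milliseconds.

    Prefer the timestamp carried by the source; fall back to ingestion time
    (the moment this function runs) when the source did not provide one.
    """
    source_ts = event.get(timestamp_field)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(source_ts, int) and not isinstance(source_ts, bool):
        return source_ts
    clock = now_ms or (lambda: int(time.time() * 1000))
    return clock()
```

Passing the clock as a parameter keeps the fallback path testable without depending on the wall clock.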
countly/countly-server
5,875 · View on GitHub
Countly is a self-hosted product analytics and engagement platform that tracks user behavior across mobile, web, and desktop applications. It collects and analyzes device properties, user actions, and session lifecycle data to understand engagement patterns, while also providing crash reporting, push notification delivery, and A/B testing capabilities. The platform is designed for privacy-first deployment, with built-in consent management and the ability to run entirely on private infrastructure. The platform distinguishes itself through its comprehensive feature set that combines analytics w
Attaches a unique millisecond timestamp, local hour, day of week, and timezone offset to each event.
JavaScript
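As a rough illustration of the Countly entry, the following Python sketch derives a millisecond timestamp, local hour, day of week, and timezone offset from one timezone-aware moment. The function name and the dictionary keys (`timestamp`, `hour`, `dow`, `tz`) are assumptions for this sketch, as is the convention that Sunday is day 0. The sketch does not enforce uniqueness of the timestamp.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_fields(moment: datetime) -> dict:
    """Derive per-event time metadata from a timezone-aware datetime."""
    offset = moment.utcoffset()
    if offset is None:
        raise ValueError("moment must be timezone-aware")
    return {
        # Integer division on timedeltas avoids floating-point rounding.
        "timestamp": (moment - EPOCH) // timedelta(milliseconds=1),
        "hour": moment.hour,                 # local hour, 0-23
        "dow": (moment.weekday() + 1) % 7,   # assumed convention: 0 = Sunday
        "tz": int(offset.total_seconds() // 60),  # offset from UTC in minutes
    }
```

For example, noon on Monday 1 January 2024 at UTC+2 yields hour 12, day of week 1, and an offset of 120 minutes.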
cloudevents/spec
5,801 · View on GitHub
CloudEvents is an open specification for describing event data in a common format across cloud platforms and services. It defines a standard structure and set of metadata attributes for events, enabling interoperability across different systems so producers and consumers can exchange events without custom translation. The specification provides a protocol-agnostic serialization framework that maps CloudEvents attributes and payloads to multiple serialization formats including JSON, Avro, and Protobuf, and defines transport bindings for mapping events onto protocols like HTTP, AMQP, Kafka, MQTT
Provides the core capability to construct and validate CloudEvent objects against the specification.
Python · serverless · specification
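To show what validating an event against the CloudEvents specification can involve, here is a minimal Python sketch that checks the four required context attributes of version 1.0 (`id`, `source`, `specversion`, `type`) and, if present, the optional `time` attribute. The function name `validate_cloudevent` is hypothetical, and this is not the API of any CloudEvents SDK. `datetime.fromisoformat` only approximates the RFC 3339 format the specification requires for `time`.

```python
from datetime import datetime
from typing import Any, List, Mapping

REQUIRED_ATTRIBUTES = ("id", "source", "specversion", "type")


def validate_cloudevent(event: Mapping[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems: List[str] = []
    for name in REQUIRED_ATTRIBUTES:
        value = event.get(name)
        if not isinstance(value, str) or not value:
            problems.append(f"required attribute '{name}' is missing or empty")
    timestamp = event.get("time")
    if timestamp is not None:
        try:
            # Older Python versions reject a trailing "Z", so normalize it first.
            datetime.fromisoformat(str(timestamp).replace("Z", "+00:00"))
        except ValueError:
            problems.append("optional attribute 'time' is not a parseable timestamp")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with an event at once.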
cortexproject/cortex
5,751 · View on GitHub
Cortex is an open-source, horizontally scalable metrics platform that ingests, stores, and queries Prometheus-compatible time-series data with multi-tenant isolation. It accepts metrics via Prometheus remote write and OpenTelemetry, executes PromQL queries against both recent and historical data, and provides a Prometheus-compatible alerting and recording rule engine with an integrated Alertmanager. The system is built as a set of independently scalable microservices that use hash-ring-based sharding, gossip-based cluster membership, and tenant-aware object storage to distribute workloads acro
Cortex rejects samples with timestamps too far in the past or future based on configurable age and grace period limits.
Go · cncf · hacktoberfest · kubernetes
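The timestamp window described in the Cortex entry can be sketched as a simple bounds check in Python. The function and parameter names below (`accept_sample`, `max_age_ms`, `grace_ms`) are hypothetical and do not mirror Cortex's configuration keys or code; the sketch only shows the general idea of rejecting samples that are too old or too far in the future.

```python
def accept_sample(sample_ts_ms: int, now_ms: int, max_age_ms: int, grace_ms: int) -> bool:
    """Accept a sample only if its timestamp lies in [now - max_age, now + grace].

    All values are epoch milliseconds or durations in milliseconds.
    """
    too_old = sample_ts_ms < now_ms - max_age_ms
    too_far_ahead = sample_ts_ms > now_ms + grace_ms
    return not (too_old or too_far_ahead)
```

With a maximum age of one hour and a grace period of ten minutes, a sample stamped two hours ago is rejected, while one stamped five minutes ahead is accepted.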
cri-o/cri-o
5,629 · View on GitHub
CRI-O is an open-source container runtime that implements the Kubernetes Container Runtime Interface (CRI) to manage container images, pods, and containers on cluster nodes using OCI-compatible runtimes. It serves as a node-level container manager that handles image pulling, container lifecycle, and resource monitoring for Kubernetes clusters, running containers according to the Open Container Initiative specifications. The runtime distinguishes itself through live configuration reloading that applies changes to runtime definitions, registry mirrors, and TLS certificates without restarting th
Reports pod sandbox status timestamps in nanosecond resolution for evented PLEG compatibility.
Go
open-telemetry/opentelemetry-collector-contrib
4,758 · View on GitHub
This project provides an observability data pipeline designed to collect, transform, and route logs, metrics, and traces from diverse sources into standardized formats for analysis. It operates as a plugin-based component architecture that uses modular receivers, processors, and exporters to move telemetry data through sequential processing chains. The system uses an interface-driven component model that allows interchangeable connectors and community-contributed extensions. It distinguishes itself through a domain-specific language for telemetry filtering, metadata-based resource attribution for infrastructure detection, and dynamic secret resolution from external cloud managers. The collector covers a broad range of capabilities, including telemetry ingestion from cloud providers and databases, data transformation and re-aggregation, and secure export to third-party storage backends. It incorporates traffic management features such as round-robin routing and message partitioning, along with security primitives for identity and access management via OAuth2 and OIDC. The project includes a quality assurance framework for synthetic data simulation, end-to-end performance testing, and data integrity verification.
Sets the start timestamp of cumulative metric points based on specific reset strategies.
Go
