19 repositories
Functions for calculating scalar values like sums, counts, and averages over datasets.
Explore 19 GitHub repositories matching data & databases · SQL Aggregate Functions.
SQLite.swift is a type-safe Swift wrapper and object-relational mapping layer that provides a bridge for interacting with SQLite databases. It functions as a database driver that allows for embedded database management and local data persistence within Swift applications. The project distinguishes itself through a type-safe expression builder that verifies SQL statement syntax and intent at compile time. It includes specialized support for high-performance text matching via full-text search integration and provides mechanisms for securing sensitive data through database encryption. The libra
Calculates scalar values such as counts, sums, and averages across filtered datasets.
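As a rough illustration of counting, summing, and averaging over a filtered dataset, the sketch below uses Python's built-in `sqlite3` module as a stand-in; it does not show SQLite.swift's own expression builder. The `orders` table, its columns, and the `summarize_orders` helper are hypothetical names chosen for the example.

```python
import sqlite3


def summarize_orders(rows, min_amount=0.0):
    """Return (count, total, average) for orders at or above min_amount.

    `rows` is an iterable of (customer, amount) tuples. The table and
    column names are illustrative assumptions, not part of any library.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        # COUNT, SUM and AVG each collapse the filtered rows to one scalar.
        return conn.execute(
            "SELECT COUNT(*), SUM(amount), AVG(amount) "
            "FROM orders WHERE amount >= ?",
            (min_amount,),
        ).fetchone()
    finally:
        conn.close()


sample_orders = [("ann", 10.0), ("bob", 20.0), ("ann", 30.0)]
count, total, average = summarize_orders(sample_orders, min_amount=15.0)
```

When the filter matches no rows, `COUNT(*)` yields 0 while `SUM` and `AVG` yield NULL (`None` in Python), so callers usually need to handle the empty case explicitly.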
TextQL is a command line SQL query engine designed to execute relational queries directly against structured text files, such as CSV and TSV, without requiring a database import. It functions as a relational text file analyzer and a CSV processor that treats plain text files as virtual tables for filtering, joining, and aggregating data. The tool is built as a pipe-compatible data transformation utility, allowing it to process data from standard input and output formatted datasets. It enables relational joins across multiple files or directories within a single query to analyze relationships
Provides the ability to extend the query language with custom mathematical, string, and aggregate operations via shared libraries.
This project is a Go language driver for the SQLite database. It provides a relational database interface and a Cgo wrapper that connects Go applications to SQLite for persistent local data storage and query execution. The implementation serves as a provider for JSON document storage and local full-text search. It enables the creation, querying, and modification of JSON data and the implementation of searchable indexes for large text datasets directly within the database. The driver supports standard SQL query execution for both file-based and in-memory storage. It includes capabilities for
Allows the registration of Go functions as custom SQL scalar or aggregate functions via C-to-Go callbacks.
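The same idea of registering host-language code as a custom SQL aggregate can be sketched with Python's built-in `sqlite3` module. This is an analogy only and does not show the Go driver's API. The `value_range` aggregate, the `ValueRange` class, and the `readings` table are hypothetical.

```python
import sqlite3


class ValueRange:
    """Hypothetical aggregate: max minus min of the non-NULL inputs."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def step(self, value):
        # Called once per row in the group; NULLs are skipped.
        if value is None:
            return
        self.lo = value if self.lo is None else min(self.lo, value)
        self.hi = value if self.hi is None else max(self.hi, value)

    def finalize(self):
        # Called once per group to produce the aggregate's result.
        return None if self.lo is None else self.hi - self.lo


def range_by_sensor(readings):
    """Return [(sensor, range)] for an iterable of (sensor, reading)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.create_aggregate("value_range", 1, ValueRange)
        conn.execute("CREATE TABLE readings (sensor TEXT, reading INTEGER)")
        conn.executemany("INSERT INTO readings VALUES (?, ?)", readings)
        return conn.execute(
            "SELECT sensor, value_range(reading) FROM readings "
            "GROUP BY sensor ORDER BY sensor"
        ).fetchall()
    finally:
        conn.close()
```

The engine creates one `ValueRange` instance per group, which is why the state lives on the instance rather than in module-level variables.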
GRDB.swift is a comprehensive SQLite toolkit and object-relational mapper for Swift. It provides a database wrapper that handles local data persistence, connection management, and encrypted file storage for Apple platforms. The library features a dedicated observation framework that tracks database changes to automatically synchronize the application state and user interface in real time. It distinguishes itself with a type-safe query builder and a protocol-based mapping system that converts database rows into structured Swift objects. The toolkit covers a broad range of administrative and o
Supports registering custom Swift logic as SQL functions to extend the database's query capabilities.
AlaSQL is a JavaScript SQL database engine that allows for the filtering, grouping, and joining of in-memory object arrays and JSON data. It functions as an in-memory SQL database and client-side data processor, enabling the execution of SQL statements against JavaScript arrays and external data sources in both browser and server environments. The project serves as a universal data query tool capable of performing relational joins across diverse sources, such as merging Google Spreadsheets, SQLite files, and remote APIs into a single result set. It also acts as an IndexedDB SQL wrapper, allow
Enables the definition of custom scalar functions via JavaScript to perform specialized calculations in queries.
ToyDB is a distributed SQL database that provides a system for storing and querying data across multiple nodes. It focuses on maintaining strong consistency and fault tolerance through the implementation of a distributed consensus algorithm. The project distinguishes itself by supporting historical data versioning, enabling time-travel queries to retrieve the state of the database from a specific point in the past. It utilizes multi-version concurrency control to manage ACID transactions and ensure data integrity during concurrent operations. The system covers relational data modeling with t
Provides standard SQL aggregate functions for calculating sums, counts, and averages over datasets.
SQLiteStudio is an open-source graphical tool for browsing, editing, and managing SQLite database files. It combines a full-featured SQL editor with syntax highlighting, a visual database schema designer for creating entity-relationship diagrams, and a plugin-based extensibility platform that allows adding custom functionality through C/C++, JavaScript, Tcl, or Python. The application distinguishes itself through its multi-language scripting engine, which embeds JavaScript, Tcl, and Python interpreters to enable user-defined functions and scripts within SQL queries. It supports encrypted data
Adds user-defined functions written in C/C++ that can be called from SQL queries.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Allows configuration of aggregation functions to compute results directly within the index.
Apache Hive is a SQL-on-Hadoop data warehouse that enables querying and managing petabytes of data stored in distributed storage such as HDFS and cloud storage services. It provides a familiar SQL interface for batch analytics and reporting, supported by a core set of components including the HiveServer2 Thrift service for remote query execution, the Hive Metastore Service for central metadata management, the Hive ACID Transaction Engine for concurrent read-write operations, and the Hive LLAP Interactive Engine for low-latency analytical processing. The WebHCat REST API offers an HTTP interfac
Adds user-defined functions, aggregates, and table functions to SQL for custom data processing.
GreptimeDB is a distributed, open-source time-series database built for unified observability. It stores and queries metrics, logs, and traces together in a single columnar engine, supporting both SQL and PromQL for analysis. The database is designed as a Kubernetes-native operator with a decoupled compute and storage architecture, enabling horizontal scaling and multi-region deployment. What distinguishes GreptimeDB is its role as a multi-protocol ingestion gateway, accepting data through OpenTelemetry, Prometheus Remote Write, InfluxDB, Loki, Elasticsearch, Kafka, and MQTT protocols without
Processes streaming data with continuous aggregation flows to produce downsampled results in real time.
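Downsampling by continuous aggregation can be pictured as keeping a running sum and count per fixed time bucket and updating them as each point arrives. The sketch below is a plain-Python illustration of that idea under stated assumptions. It is not GreptimeDB's flow engine or its syntax, and the `Downsampler` class and its methods are hypothetical.

```python
class Downsampler:
    """Incrementally average values into fixed-width time buckets."""

    def __init__(self, bucket_seconds):
        self.bucket_seconds = bucket_seconds
        self._state = {}  # bucket start -> [running sum, count]

    def ingest(self, timestamp, value):
        # Align the timestamp to the start of its bucket.
        bucket = timestamp - timestamp % self.bucket_seconds
        entry = self._state.setdefault(bucket, [0.0, 0])
        entry[0] += value
        entry[1] += 1

    def snapshot(self):
        # Current downsampled view: one (bucket start, mean) per bucket.
        return [
            (bucket, total / n)
            for bucket, (total, n) in sorted(self._state.items())
        ]


sampler = Downsampler(bucket_seconds=60)
for ts, value in [(0, 1.0), (30, 3.0), (60, 5.0)]:
    sampler.ingest(ts, value)
downsampled = sampler.snapshot()
```

Storing only the sum and count per bucket keeps the memory cost independent of how many raw points each bucket has seen.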
Perfetto is a platform for system-level performance tracing and analysis on Linux and Android. It combines a high-throughput trace recorder, a SQL-based query engine, and a browser-based visualizer into a single toolchain. The platform covers CPU scheduling and call-stack profiling, native and Java heap memory allocation tracking, GPU and graphics events, and system-wide counters such as CPU frequency and power consumption. The architecture decouples trace recording from offline analysis, using a compact protobuf format for event encoding and columnar storage for efficient SQL queries. The we
Creates scalar or table-valued functions using a SQL SELECT statement for custom analysis logic.
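This is a relational database cheat sheet and SQL reference guide. It provides a collection of syntax examples and query documentation for managing relational databases with Structured Query Language. The tool is implemented as a static website with client-side searchable documentation, allowing technical content to be filtered instantly through a browser-based index. The reference covers relational database management, including data retrieval, database schema management, and record maintenance. It also includes guidance on manipulating relational data through table joins and on generating aggregate reports.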
Documents the use of SQL aggregate functions to generate data summaries and reports.
Readyset is a transparent caching proxy for PostgreSQL and MySQL that sits between an application and its database, intercepting SQL queries and serving cached results from memory. It automatically caches query results on first execution and keeps those caches consistent by consuming the database’s replication stream in real time, enabling faster repeated reads without application code changes. The proxy also supports caching advanced SQL functions such as window functions, bucket functions, and locale-aware collation sorting, and exposes an interface that allows AI agents to inspect proxied q
Supports caching window functions, bucket functions, and locale-aware collation sorting for complex analytical queries.
Arroyo is a high-performance stream processing platform built in Rust. It executes continuous SQL queries on streaming data with event-time semantics, enabling accurate windowed aggregations, joins, and stateful computations on unbounded event streams. The platform uses native Rust execution for high throughput and low latency, with periodic checkpointing for exactly-once fault tolerance and horizontal scaling across distributed workers. The system integrates deeply with Kafka for reading and writing topics with exactly-once delivery and supports change data capture (CDC) from MySQL and Postg
Defines custom SQL functions in Rust or Python for use in streaming data pipelines.
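H2 is a JDBC-compliant relational database management system written in Java. As an embeddable SQL database, it can run directly inside the application process to eliminate network latency, or serve as an in-memory database for high-performance volatile storage. It also includes a web-based console for executing SQL commands and managing schemas. The system is distinguished by its flexible deployment modes, including a standalone server mode for remote TCP/IP access and a mixed mode for simultaneous local and remote connections. It has a dialect emulation layer and compatibility modes that allow it to mimic the behavior and syntax of other database systems. The engine offers a broad feature set covering ACID transactions with multi-version concurrency control (MVCC), geospatial and JSON data support, and advanced analytical window functions. It includes data protection tools for handling large datasets through compressed backups, SQL script recovery, and off-heap memory management. The database integrates with applications using a standard Java Database Connectivity driver and connection URLs.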
Supports the creation of user-defined aggregate functions (UDAFs) by mapping them to source code.
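sqlean is a collection of SQLite extensions implemented as C shared libraries. It provides a set of additional scalar and table-valued functions that extend the native capabilities of the SQLite database engine. The project provides specialized toolsets for cryptography, advanced mathematics, networking, and file system access. These include binary hashing and encoding, statistical analysis, IP address validation, and the ability to map CSV files or file system paths to virtual tables. The library also includes comprehensive text processing tools such as regular expressions, fuzzy matching, and Unicode-aware string operations. Other features cover high-precision date and time management and the generation of unique identifiers.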
Enables the creation of custom scalar user-defined functions to encapsulate reusable single-value logic.
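A custom scalar function takes the values from one row and returns a single value. The sketch below shows the general pattern with Python's built-in `sqlite3.Connection.create_function`. It does not use sqlean's own mechanism. The `clamp` function, the `samples` table, and the `clamped_readings` helper are hypothetical.

```python
import sqlite3


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]; NULL stays NULL."""
    if value is None:
        return None
    return max(low, min(high, value))


def clamped_readings(readings, low, high):
    """Return the readings, in insertion order, after applying clamp()."""
    conn = sqlite3.connect(":memory:")
    try:
        # Register the Python callable as a three-argument SQL function.
        conn.create_function("clamp", 3, clamp)
        conn.execute(
            "CREATE TABLE samples (id INTEGER PRIMARY KEY, reading INTEGER)"
        )
        conn.executemany(
            "INSERT INTO samples (reading) VALUES (?)",
            [(r,) for r in readings],
        )
        rows = conn.execute(
            "SELECT clamp(reading, ?, ?) FROM samples ORDER BY id",
            (low, high),
        )
        return [row[0] for row in rows]
    finally:
        conn.close()
```

Once registered, `clamp` can be reused in any query on that connection, which is the "reusable single-value logic" the capability line describes.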
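Velox is a high-performance C++ query execution engine and columnar data processing library. As a composable framework for implementing analytical query engines, it provides a vectorized expression evaluator and a toolkit for data management systems. The project is known for using vectorized columnar execution and arena-based memory allocation to process large-scale datasets. It has specialized optimizations such as broadcast join table caching, dynamic filter pushdown, and dictionary encoding to reduce memory overhead and accelerate analytical reads. The engine covers a wide range of analytical capabilities, including hash joins, merge joins, and semi joins, as well as multi-stage parallel aggregation and window function computation. It provides primitives for columnar in-memory storage, Parquet data decoding, and integration with cloud storage. Extensibility is provided through a function registration system for custom scalar and aggregate functions, and high-level bindings connect C++ logic to Python.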
Allows definition of new aggregation logic using vector interfaces and registration with specific type signatures.
Rusqlite is an embedded database interface and relational database driver that provides a client library for interacting with SQLite. It functions as an SQL query wrapper, enabling the management of local file-based or in-memory databases through a safe interface. The library allows for the extension of native database capabilities by implementing custom scalar functions, collations, and virtual tables. It also supports the embedding of the database engine directly into the application binary to remove external library dependencies. The project covers a broad range of capabilities including
Extends SQLite functionality by implementing custom scalar functions, collations, and virtual tables using Rust logic.
Drift is a type-safe SQL persistence library and relational mapper that provides a structured way to map database tables to classes and execute SQL queries with build-time validation. It functions as a type-safe query builder and a wrapper for SQLite and PostgreSQL, eliminating manual result set parsing by binding query outputs to native objects. The project distinguishes itself through a build-time code generation system that produces type-safe APIs and validates raw SQL statements against database versions before execution. It features reactive query streaming, which transforms SQL queries
Computes summary values like sums and counts using SQL grouping and window functions.
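The difference between grouping and window functions can be seen side by side: `GROUP BY` collapses each group to one row, while a window function keeps every row and attaches a per-partition running value. The sketch below uses Python's built-in `sqlite3` module rather than Drift's generated Dart API, and assumes the bundled SQLite is version 3.25 or later (the first release with window functions). The `sales` table and the `sales_summary` helper are hypothetical.

```python
import sqlite3


def sales_summary(rows):
    """Return (grouped, running) for an iterable of (id, region, amount).

    grouped: one (region, count, total) row per region.
    running: every input row with a per-region running total.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (id INTEGER PRIMARY KEY, "
            "region TEXT, amount INTEGER)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
        grouped = conn.execute(
            "SELECT region, COUNT(*), SUM(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        ).fetchall()
        # The window keeps each row and sums amounts seen so far per region.
        running = conn.execute(
            "SELECT id, region, "
            "SUM(amount) OVER (PARTITION BY region ORDER BY id) "
            "FROM sales ORDER BY id"
        ).fetchall()
        return grouped, running
    finally:
        conn.close()


grouped, running = sales_summary(
    [(1, "east", 10), (2, "west", 5), (3, "east", 20)]
)
```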