19 repositories
Functions for calculating scalar values like sums, counts, and averages over datasets.
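As a minimal sketch of what these functions do, the snippet below runs COUNT, SUM, and AVG over a small in-memory table using Python's standard `sqlite3` module. The `orders` table, its columns, and the `summarize` helper are hypothetical names chosen for illustration; they do not come from any repository listed here.

```python
import sqlite3


def summarize(rows):
    """Return (count, total, average) for a list of (region, amount) rows.

    The table and column names are illustrative assumptions.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    # Each aggregate collapses the whole table into one scalar value.
    result = conn.execute(
        "SELECT COUNT(*), SUM(amount), AVG(amount) FROM orders"
    ).fetchone()
    conn.close()
    return result


sample_orders = [("north", 10.0), ("south", 20.0), ("north", 30.0)]
print(summarize(sample_orders))  # (3, 60.0, 20.0)
```

Adding a `WHERE` clause to the query would restrict the same aggregates to a filtered subset of rows.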
Explore 19 GitHub repositories matching Data & Databases · SQL Aggregate Functions.
SQLite.swift is a type-safe Swift wrapper and object-relational mapping layer that provides a bridge for interacting with SQLite databases. It functions as a database driver that allows for embedded database management and local data persistence within Swift applications. The project distinguishes itself through a type-safe expression builder that verifies SQL statement syntax and intent at compile time. It includes specialized support for high-performance text matching via full-text search integration and provides mechanisms for securing sensitive data through database encryption. The libra
Calculates scalar values such as counts, sums, and averages across filtered datasets.
TextQL is a command line SQL query engine designed to execute relational queries directly against structured text files, such as CSV and TSV, without requiring a database import. It functions as a relational text file analyzer and a CSV processor that treats plain text files as virtual tables for filtering, joining, and aggregating data. The tool is built as a pipe-compatible data transformation utility, allowing it to process data from standard input and output formatted datasets. It enables relational joins across multiple files or directories within a single query to analyze relationships
Provides the ability to extend the query language with custom mathematical, string, and aggregate operations via shared libraries.
This project is a Go language driver for the SQLite database. It provides a relational database interface and a Cgo wrapper that connects Go applications to SQLite for persistent local data storage and query execution. The implementation serves as a provider for JSON document storage and local full-text search. It enables the creation, querying, and modification of JSON data and the implementation of searchable indexes for large text datasets directly within the database. The driver supports standard SQL query execution for both file-based and in-memory storage. It includes capabilities for
Allows the registration of Go functions as custom SQL scalar or aggregate functions via C-to-Go callbacks.
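The general idea of registering host-language code as a custom SQL aggregate can be sketched with Python's standard `sqlite3` module, whose `create_aggregate` method takes a class with `step` and `finalize` methods. This is an analogous illustration only, not the Go driver's API; the `value_range` function, the `ValueRange` class, and the `readings` table are hypothetical names.

```python
import sqlite3


class ValueRange:
    """Hypothetical custom aggregate: max(value) - min(value)."""

    def __init__(self):
        self.low = None
        self.high = None

    def step(self, value):
        # Called once per row; NULLs are skipped like built-in aggregates do.
        if value is None:
            return
        self.low = value if self.low is None else min(self.low, value)
        self.high = value if self.high is None else max(self.high, value)

    def finalize(self):
        # Called once at the end to produce the scalar result.
        if self.low is None:
            return None
        return self.high - self.low


def value_range_of(values):
    conn = sqlite3.connect(":memory:")
    # Register the class under the SQL name "value_range" with one argument.
    conn.create_aggregate("value_range", 1, ValueRange)
    conn.execute("CREATE TABLE readings (value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?)", [(v,) for v in values])
    (result,) = conn.execute("SELECT value_range(value) FROM readings").fetchone()
    conn.close()
    return result


print(value_range_of([3.0, 9.5, 4.0]))  # 6.5
```

The step/finalize split mirrors how SQLite itself drives aggregate callbacks: per-row accumulation followed by a single result computation.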
GRDB.swift is a comprehensive SQLite toolkit and object-relational mapper for Swift. It provides a database wrapper that handles local data persistence, connection management, and encrypted file storage for Apple platforms. The library features a dedicated observation framework that tracks database changes to automatically synchronize the application state and user interface in real time. It distinguishes itself with a type-safe query builder and a protocol-based mapping system that converts database rows into structured Swift objects. The toolkit covers a broad range of administrative and o
Supports registering custom Swift logic as SQL functions to extend the database's query capabilities.
AlaSQL is a JavaScript SQL database engine that allows for the filtering, grouping, and joining of in-memory object arrays and JSON data. It functions as an in-memory SQL database and client-side data processor, enabling the execution of SQL statements against JavaScript arrays and external data sources in both browser and server environments. The project serves as a universal data query tool capable of performing relational joins across diverse sources, such as merging Google Spreadsheets, SQLite files, and remote APIs into a single result set. It also acts as an IndexedDB SQL wrapper, allow
Enables the definition of custom scalar functions via JavaScript to perform specialized calculations in queries.
ToyDB is a distributed SQL database that provides a system for storing and querying data across multiple nodes. It focuses on maintaining strong consistency and fault tolerance through the implementation of a distributed consensus algorithm. The project distinguishes itself by supporting historical data versioning, enabling time-travel queries to retrieve the state of the database from a specific point in the past. It utilizes multi-version concurrency control to manage ACID transactions and ensure data integrity during concurrent operations. The system covers relational data modeling with t
Provides standard SQL aggregate functions for calculating sums, counts, and averages over datasets.
SQLiteStudio is an open-source graphical tool for browsing, editing, and managing SQLite database files. It combines a full-featured SQL editor with syntax highlighting, a visual database schema designer for creating entity-relationship diagrams, and a plugin-based extensibility platform that allows adding custom functionality through C/C++, JavaScript, Tcl, or Python. The application distinguishes itself through its multi-language scripting engine, which embeds JavaScript, Tcl, and Python interpreters to enable user-defined functions and scripts within SQL queries. It supports encrypted data
Adds user-defined functions written in C/C++ that can be called from SQL queries.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Allows configuration of aggregation functions to compute results directly within the index.
Apache Hive is a SQL-on-Hadoop data warehouse that enables querying and managing petabytes of data stored in distributed storage such as HDFS and cloud storage services. It provides a familiar SQL interface for batch analytics and reporting, supported by a core set of components including the HiveServer2 Thrift service for remote query execution, the Hive Metastore Service for central metadata management, the Hive ACID Transaction Engine for concurrent read-write operations, and the Hive LLAP Interactive Engine for low-latency analytical processing. The WebHCat REST API offers an HTTP interfac
Adds user-defined functions, aggregates, and table functions to SQL for custom data processing.
GreptimeDB is a distributed, open-source time-series database built for unified observability. It stores and queries metrics, logs, and traces together in a single columnar engine, supporting both SQL and PromQL for analysis. The database is designed as a Kubernetes-native operator with a decoupled compute and storage architecture, enabling horizontal scaling and multi-region deployment. What distinguishes GreptimeDB is its role as a multi-protocol ingestion gateway, accepting data through OpenTelemetry, Prometheus Remote Write, InfluxDB, Loki, Elasticsearch, Kafka, and MQTT protocols without
Processes streaming data with continuous aggregation flows to produce downsampled results in real time.
Perfetto is a platform for system-level performance tracing and analysis on Linux and Android. It combines a high-throughput trace recorder, a SQL-based query engine, and a browser-based visualizer into a single toolchain. The platform covers CPU scheduling and call-stack profiling, native and Java heap memory allocation tracking, GPU and graphics events, and system-wide counters such as CPU frequency and power consumption. The architecture decouples trace recording from offline analysis, using a compact protobuf format for event encoding and columnar storage for efficient SQL queries. The we
Creates scalar or table-valued functions using a SQL SELECT statement for custom analysis logic.
This project is a relational database cheat sheet and SQL reference. It provides a collection of syntax examples and query documentation for managing relational databases with Structured Query Language. The tool is implemented as a static site with client-side searchable documentation, allowing instant filtering of technical content through a browser-based index. The reference covers relational database management, including data retrieval, database schema management, and record maintenance. It also includes guidance on processing relational data through table joins and generating aggregate reports.
Documents the use of SQL aggregate functions to generate data summaries and reports.
Readyset is a transparent caching proxy for PostgreSQL and MySQL that sits between an application and its database, intercepting SQL queries and serving cached results from memory. It automatically caches query results on first execution and keeps those caches consistent by consuming the database’s replication stream in real time, enabling faster repeated reads without application code changes. The proxy also supports caching advanced SQL functions such as window functions, bucket functions, and locale-aware collation sorting, and exposes an interface that allows AI agents to inspect proxied q
Supports caching window functions, bucket functions, and locale-aware collation sorting for complex analytical queries.
Arroyo is a high-performance stream processing platform built in Rust. It executes continuous SQL queries on streaming data with event-time semantics, enabling accurate windowed aggregations, joins, and stateful computations on unbounded event streams. The platform uses native Rust execution for high throughput and low latency, with periodic checkpointing for exactly-once fault tolerance and horizontal scaling across distributed workers. The system integrates deeply with Kafka for reading and writing topics with exactly-once delivery and supports change data capture (CDC) from MySQL and Postg
Defines custom SQL functions in Rust or Python for use in streaming data pipelines.
H2 is a JDBC-compliant relational database management system written in Java. It functions as an embeddable SQL database that can run directly within an application process to remove network latency, or as an in-memory database for high-performance volatile storage. It also includes a web-based console for executing SQL commands and administering schemas. The system is characterized by its flexible deployment modes, including a standalone server mode for remote TCP/IP access and a mixed mode for simultaneous local and remote connectivity. It features a dialect emulation layer and compatibilit
Supports the creation of user-defined aggregate functions (UDAFs) by mapping them to source code.
sqlean is a collection of SQLite extension libraries implemented as C-based shared libraries. It provides a set of additional scalar and table-valued functions that extend the native capabilities of the SQLite database engine. The project offers specialized toolkits for cryptography, advanced mathematics, networking, and file system access. These include binary hashing and encoding, statistical analysis, IP address validation, and the ability to map CSV files or file system paths as virtual tables. The library also includes comprehensive text-processing tools such as regular expressions, fuzzy matching, and Unicode-aware string manipulation. Additional capabilities cover high-precision date and time management and unique identifier generation.
Enables the creation of custom scalar user-defined functions to encapsulate reusable single-value logic.
Velox is a high-performance query execution engine and columnar data processing library written in C++. It serves as a composable framework for implementing analytical query engines, providing a vectorized expression evaluator and a toolkit for data management systems. The project is distinguished by its use of vectorized columnar execution and arena-based memory allocation for processing large-scale datasets. It features specialized optimizations such as broadcast join table caching, dynamic filter pushdown, and dictionary encoding to reduce memory overhead and speed up analytical reads. The engine covers a wide range of analytical capabilities, including hash, merge, and semi joins, as well as parallel multi-stage aggregation and window function computation. It provides primitives for in-memory columnar storage, Parquet data decoding, and cloud storage integration. Extensibility is provided through a function registration system for custom scalar and aggregate functions, with high-level bindings that connect C++ logic to Python.
Allows definition of new aggregation logic using vector interfaces and registration with specific type signatures.
Rusqlite is an embedded database interface and relational database driver that provides a client library for interacting with SQLite. It functions as an SQL query wrapper, enabling the management of local file-based or in-memory databases through a safe interface. The library allows for the extension of native database capabilities by implementing custom scalar functions, collations, and virtual tables. It also supports the embedding of the database engine directly into the application binary to remove external library dependencies. The project covers a broad range of capabilities including
Extends SQLite functionality by implementing custom scalar functions, collations, and virtual tables using Rust logic.
Drift is a type-safe SQL persistence library and relational mapper that provides a structured way to map database tables to classes and execute SQL queries with build-time validation. It functions as a type-safe query builder and a wrapper for SQLite and PostgreSQL, eliminating manual result set parsing by binding query outputs to native objects. The project distinguishes itself through a build-time code generation system that produces type-safe APIs and validates raw SQL statements against database versions before execution. It features reactive query streaming, which transforms SQL queries
Computes summary values like sums and counts using SQL grouping and window functions.
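The difference between grouped aggregates and window functions can be sketched in plain SQL, run here through Python's standard `sqlite3` module rather than Drift's generated API. The `sales` table, its columns, and the `summarize_sales` helper are hypothetical; the window query assumes an SQLite build of version 3.25 or later.

```python
import sqlite3


def summarize_sales(rows):
    """Return (per-region totals, running totals) for (day, region, amount) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day INTEGER, region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # GROUP BY collapses each region into a single summary row.
    grouped = conn.execute(
        "SELECT region, COUNT(*), SUM(amount) FROM sales "
        "GROUP BY region ORDER BY region"
    ).fetchall()
    # A window function keeps every row and adds a running total alongside it.
    running = conn.execute(
        "SELECT day, amount, SUM(amount) OVER (ORDER BY day) FROM sales "
        "ORDER BY day"
    ).fetchall()
    conn.close()
    return grouped, running


sample_sales = [(1, "east", 5), (2, "west", 7), (3, "east", 3)]
grouped, running = summarize_sales(sample_sales)
print(grouped)  # [('east', 2, 8), ('west', 1, 7)]
print(running)  # [(1, 5, 5), (2, 7, 12), (3, 3, 15)]
```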