15 Repos
Utilities for retrieving typed data from specific database columns by index or name.
Distinct from Value Extraction: existing candidates focus on lineage or schema definitions rather than on extracting values from result sets at runtime.
15 GitHub repositories matching Data & Databases · Column Value Extraction.
fmdb is an object-oriented SQLite database library and persistence layer for native macOS and iOS environments. It provides an Objective-C wrapper that encapsulates the low-level C API, allowing applications to manage local relational data storage and embedded database connections through a high-level interface. The library focuses on thread-safe database access by synchronizing operations across multiple threads using serialized queues to prevent data corruption and race conditions. It includes specialized capabilities for secure local storage, such as database encryption and the management
Retrieves data from specific columns by index or name as strings, integers, or binary data.
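As a rough illustration of column access by index or by name, here is a minimal sketch using Python's standard `sqlite3` module rather than fmdb's own Objective-C API, which this listing does not show. The `users` table and its contents are hypothetical.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows then support access by index and by column name
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, avatar BLOB)")
conn.execute("INSERT INTO users VALUES (?, ?, ?)", (1, "Ada", b"\x89PNG"))

row = conn.execute("SELECT id, name, avatar FROM users").fetchone()

user_id = row[0]         # integer, read by column index
user_name = row["name"]  # string, read by column name
avatar = row["avatar"]   # binary data (bytes), read by column name
```

Reading by name keeps the calling code stable if the column order in the `SELECT` changes, while reading by index avoids a name lookup.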
LanceDB is a vector database and columnar data store designed to function as a versioned dataset manager and vector search engine. It serves as a high-performance backend for indexing and retrieving high-dimensional embeddings, providing the foundation for machine learning data pipelines. The system distinguishes itself through a combination of cloud-native object storage and immutable version tracking, allowing for data time-travel and reproducible AI experiments. It integrates hybrid search capabilities, merging dense vector similarity with BM25 full-text search and SQL-like scalar filters
Creates new data columns by transforming existing values through SQL expressions or external data merges.
AllAboutBugBounty is a curated collection of bug bounty techniques and payloads for web application security testing. It serves as a reference resource covering common web vulnerabilities and exploitation methods for security researchers, providing a structured approach to identifying and exploiting web application security flaws in bug bounty programs. The repository covers a wide range of attack categories including authentication bypass, cross-site scripting injection, server-side request forgery, web cache poisoning, and business logic abuse. It includes techniques for bypassing access co
Documents enumerating database schemas through injection techniques for targeted exploitation.
collect.js is a dependency-free JavaScript library that provides a fluent, chainable interface for manipulating arrays and objects. It mirrors the Laravel Collection API, offering a consistent set of methods for data transformation across JavaScript and Laravel backend environments. The library stores collection data as plain arrays internally and supports fluent method chaining, where each method returns a new collection instance. The library distinguishes itself by closely replicating the Laravel Collection API in JavaScript, mapping each PHP method to an equivalent JavaScript implementatio
Calculates sum, average, median, mode, min, or max across all items or a specified key.
Ibis is a portable Python dataframe library and multi-backend query engine that provides a unified interface for executing data transformations across diverse compute engines. It functions as a Python SQL expression compiler and dialect transpiler, allowing users to define data logic once and execute it across cloud warehouses, embedded databases, and distributed clusters without rewriting code. The project distinguishes itself through a database backend abstraction that decouples transformation logic from the underlying execution engine. It enables polyglot data workflows by mixing raw SQL s
Computes summary statistics like mean, max, min, and sum across columns or groups.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Captures record headers, keys, and timestamps as queryable columns within the destination table for deeper analysis.
CodeIgniter is a PHP web framework built on the Model-View-Controller pattern, designed for building full-stack web applications. It provides a lightweight toolkit with minimal configuration, organizing application logic into controllers, models, and views for clean separation of concerns. The framework includes a fluent query builder for constructing SQL statements programmatically, PSR-4 autoloading with namespace mapping, and a service-based dependency injection container for managing shared class instances. The framework distinguishes itself through its comprehensive set of built-in tools
Returns an indexed array of values from a single specified column across matching rows.
Computes and displays summary statistics like sum, average, min, median, or max from a query column.
Daft is a distributed dataframe library and multimodal data processor designed to handle large-scale structured and unstructured data. It functions as a vectorized execution engine that processes tables alongside images, audio, and video, utilizing a unified schema to manage diverse data types. The project distinguishes itself by combining distributed data engineering with large-scale AI inference. It provides an AI data pipeline for batch-optimizing model prompts and generating high-dimensional text embeddings, while utilizing zero-copy memory sharing to execute custom Python functions witho
Calculates summary statistics like sums and averages across multiple columns for a single row.
dtale is a web-based interactive grid and visualizer for pandas dataframes, designed as an exploratory data analysis tool. It provides a browser-based interface for analyzing tabular data structures, allowing users to calculate statistics, detect outliers, and compute correlations without writing manual code. The project functions as an embedded data viewer that can be integrated into web applications via iframes or custom routes, with specific support for Django, Flask, and Streamlit. It enables the exploration of datasets through a combination of an interactive data grid and a data visualiz
Generates box plots, histograms, and value counts to describe the distribution of data columns.
Goravel is a full-featured development scaffold and framework for building web applications, REST APIs, and gRPC services in the Go programming language. It implements a Model-View-Controller architecture and provides a comprehensive toolkit for high-performance RPC servers and clients. The framework distinguishes itself through its extensive built-in ecosystem, which includes a fluent object-relational mapper (ORM) for database management and a dedicated CLI toolkit for administrative automation and project scaffolding. It has a driver-based service abstraction that lets developers swap storage, cache, and session backends without changing application logic. The platform covers a broad range of application features, including asynchronous task processing with distributed queues, secure identity management via token-based authentication, and a robust security layer with encryption and access control. It also offers tools for content localization, template rendering, and an automated testing infrastructure with dependency mocking.
Provides utilities to extract specific database column values into Go slices.
H2 is a JDBC-compliant relational database management system written in Java. It functions as an embeddable SQL database that can run directly inside an application process to eliminate network latency, or as an in-memory database for fast, volatile storage. It also includes a web-based console for running SQL commands and managing schemas. The system distinguishes itself through flexible deployment modes, including a standalone server mode for remote TCP/IP access and a mixed mode for simultaneous local and remote connectivity. It has a dialect emulation layer and compatibility modes that allow it to mimic the behavior and syntax of other database systems. The engine offers a broad range of features, including ACID transactions with multi-version concurrency control (MVCC), support for geospatial data and JSON, and advanced analytic window functions. It includes tools for data protection through compressed backups, SQL script restoration, and off-heap memory management for large datasets. The database integrates into applications via standard JDBC drivers and connection URLs.
Gathers values from multiple rows into a single array with optional ordering during aggregation.
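The idea of gathering values from many rows into one array per group, with optional ordering, can be sketched in plain Python. This is not H2's SQL syntax, which the listing does not show; the function name `array_agg` and its parameters are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, Hashable, Iterable, List, Mapping, Optional


def array_agg(
    rows: Iterable[Mapping[str, Any]],
    group_key: str,
    value_key: str,
    order_key: Optional[str] = None,
) -> Dict[Hashable, List[Any]]:
    """Collect value_key from each row into one list per group_key value."""
    groups: Dict[Hashable, List[Mapping[str, Any]]] = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)

    result: Dict[Hashable, List[Any]] = {}
    for key, members in groups.items():
        if order_key is not None:
            # sorted() is stable, so rows with equal order keys keep input order.
            members = sorted(members, key=lambda r: r[order_key])
        result[key] = [r[value_key] for r in members]
    return result
```

Calling `array_agg(rows, "dept", "name", order_key="age")` would return one list of names per department, sorted by age within each department. Without `order_key`, values appear in input order.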
This PHP data collection library is a functional data wrapper and array manipulation framework. It converts arrays, JSON strings, and iterables into chainable collection objects designed for advanced filtering, sorting, and transformation. The library is distinguished by its ability to dynamically extend functionality through the registration of custom methods via closures. It also provides specialized capabilities for hierarchical data modeling, allowing flat datasets with parent-child identifiers to be reconstructed into nested tree structures. The toolkit covers a broad surface of data ma
Computes sum, average, min, max, and frequency counts on collection values.
Xan is a command-line tool and data transformation engine for processing CSV, TSV, and JSONL datasets. It functions as a processor for compressed files, enabling random access and seeking within gzipped and Zstd files, and serves as a converter for specialized bioinformatics data formats. The tool handles large datasets without requiring full memory loads by utilizing stream-based processing. It provides capabilities for merging, sorting, and deduplicating massive files, as well as converting data between various tabular formats. The project covers a broad range of data wrangling and analysi
Loads only requested data columns into memory to reduce the resource footprint when processing wide datasets.
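A minimal sketch of column projection over a streamed CSV, using Python's standard `csv` module rather than Xan itself. The function name `select_columns` is hypothetical. Note that this version still parses every field of each row; it only avoids retaining the columns that were not requested.

```python
import csv
from typing import Dict, Iterable, Iterator, Sequence


def select_columns(lines: Iterable[str], wanted: Sequence[str]) -> Iterator[Dict[str, str]]:
    """Stream CSV records, keeping only the requested columns from each row."""
    reader = csv.DictReader(lines)
    for record in reader:
        # Only the wanted fields are retained; the full record is discarded
        # as soon as the next row is read.
        yield {name: record[name] for name in wanted}
```

Because the function is a generator, only one row is held in memory at a time, so the footprint depends on the number of requested columns and not on the width or length of the file.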
DataFrame is a C++ tabular data library and manipulation engine designed for managing heterogeneous data in contiguous memory. It functions as a statistical analysis framework and time series analysis toolkit, providing the means to store, index, and transform multidimensional datasets. The project distinguishes itself through a high-performance execution model that utilizes column-major storage, SIMD-aligned memory allocation, and a thread-pool for parallel computations. It employs a visitor-based algorithm dispatch system and policy-driven transformations to decouple data processing logic f
Provides column-based aggregation to compute total sums while optionally ignoring missing data.
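A column sum that optionally ignores missing data might look like the sketch below. It is written in Python for illustration and does not reflect the C++ DataFrame library's actual interface; `column_sum` and `skip_missing` are hypothetical names, and "missing" is assumed here to mean `None` or NaN.

```python
import math
from typing import Iterable, Optional


def column_sum(values: Iterable[Optional[float]], skip_missing: bool = True) -> float:
    """Sum a column; missing entries (None or NaN) are skipped or propagate NaN."""
    total = 0.0
    for value in values:
        is_missing = value is None or (isinstance(value, float) and math.isnan(value))
        if is_missing:
            if skip_missing:
                continue
            # Assumption: with skipping disabled, a missing entry makes the sum undefined.
            return math.nan
        total += value
    return total
```

With `skip_missing=True`, `column_sum([1.0, None, 2.5])` adds only the present values; with `skip_missing=False`, a single missing entry makes the result NaN.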