9 repositories
Functions for manipulating nested array structures within tabular data.
Distinguishing note: Focuses on fixed-size array column manipulation.
Explore 9 GitHub repositories matching Data & Databases · Array Column Operations.
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Performs operations on fixed-size array columns using specialized functions for sorting and aggregating elements.
This is an educational tutorial that walks through implementing a complete JSON library from scratch in C. The project covers the full data lifecycle of JSON, including parsing text into structured in-memory representations, validating input against the specification, serializing data back into standard JSON output, and providing structured access to elements within parsed arrays and objects. The implementation is built around a hand-written recursive descent parser that processes JSON text by matching grammar rules to build a structured data tree. Parsed values are stored in a tagged union r
Constructs nested arrays and objects by pushing and popping elements on a dynamic stack during parsing.
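As a rough illustration of the stack idea described above, the sketch below parses nested arrays of non-negative integers. It is written in Python rather than the tutorial's C, the function name `parse_nested` is hypothetical, and it is deliberately lenient about commas; it is not the tutorial's implementation.

```python
def parse_nested(text: str):
    """Parse a string such as "[1,[2,3],[]]" into nested Python lists.

    Assumption: only '[', ']', ',', spaces and non-negative integers appear.
    """
    stack = []      # one partially built list per '[' that is still open
    result = None   # set once the outermost array has been closed
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "[":
            if result is not None:
                raise ValueError("content after the top-level array")
            stack.append([])            # push a new, empty array
            i += 1
        elif ch == "]":
            if not stack:
                raise ValueError("unbalanced ']'")
            finished = stack.pop()      # pop the array that just closed
            if stack:
                stack[-1].append(finished)  # attach it to its parent
            else:
                result = finished           # outermost array is complete
            i += 1
        elif ch.isdigit():
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            if not stack:
                raise ValueError("number outside of an array")
            stack[-1].append(int(text[i:j]))
            i = j
        elif ch in ", ":
            i += 1                      # separators are skipped, not validated
        else:
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unclosed '['")
    return result


print(parse_nested("[1,[2,3],[]]"))
```

The depth of the stack at any point equals the current nesting level, which is why no recursion is needed to handle arbitrarily nested input.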
Lance is a versioned columnar data format and storage engine designed as a multimodal AI lakehouse. It serves as a vector database storage engine and a cloud object store dataset manager, organizing images, video, audio, and embeddings into a unified format optimized for machine learning workflows. The project distinguishes itself by combining a columnar layout for structured data with a specialized blob store for large multimodal tensors. It implements a hybrid search engine that integrates vector similarity search, full-text search, and SQL analytics on a single dataset, supported by a stor
Extracts values, checks existence, or measures array lengths within JSON columns using JSONPath syntax.
Ibis is a portable Python dataframe library and multi-backend query engine that provides a unified interface for executing data transformations across diverse compute engines. It functions as a Python SQL expression compiler and dialect transpiler, allowing users to define data logic once and execute it across cloud warehouses, embedded databases, and distributed clusters without rewriting code. The project distinguishes itself through a database backend abstraction that decouples transformation logic from the underlying execution engine. It enables polyglot data workflows by mixing raw SQL s
Identifies common elements between two array columns to determine shared values.
Just is a collection of JavaScript utility libraries designed for data manipulation, functional programming, performance optimization, statistical analysis, and string processing. It provides a set of tools for deep cloning, filtering, and transforming complex objects and arrays. The project is structured as a series of dependency-free modules, so each utility can be used independently to minimize bundle size. It implements functional programming patterns including currying, piping, and partial application, and provides execution control through memoization, debouncing, and throttling. The library covers a wide range of capabilities, including deep object manipulation, combinatorial data generation, and mathematical operations such as primality checking and numeric clamping. It also includes statistical tools for computing metrics such as variance and standard deviation, as well as text-processing utilities for case conversion and string interpolation.
Returns a new array containing only the elements common to two source arrays.
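The behavior described above can be sketched in a few lines. The example is in Python rather than JavaScript, the name `intersect` is hypothetical, and it assumes hashable elements and removes duplicates from the result; the library's own edge-case behavior may differ.

```python
def intersect(a, b):
    """Return a new list of the elements of `a` that also occur in `b`.

    Assumptions: elements are hashable; order follows `a`; each common
    value appears once in the result. Neither input is modified.
    """
    lookup = set(b)   # O(1) membership tests against the second array
    seen = set()      # values already emitted, to avoid duplicates
    common = []
    for item in a:
        if item in lookup and item not in seen:
            seen.add(item)
            common.append(item)
    return common


print(intersect([1, 2, 5, 6], [2, 3, 5, 6]))  # [2, 5, 6]
```

Building a set from the second array keeps the whole operation linear in the combined length of the inputs, instead of quadratic as with a nested scan.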
Carp is a statically typed Lisp compiler that compiles Lisp-like syntax directly to C source code, enabling seamless integration with existing C libraries and low-level system programming. It manages memory deterministically at compile time using ownership tracking and linear types, eliminating garbage collection pauses and runtime overhead while ensuring type safety through an inferred static type system. The language distinguishes itself through compile-time macro expansion and metaprogramming capabilities, allowing code generation and transformation before final binary output. It enforces
Stores fixed-size collections of elements allocated on the stack.
jsondiffpatch is a JSON diff and patch library designed to compute the differences between two JSON objects and apply those changes to synchronize state. It works as a synchronization tool that computes deltas and applies patches in order to update or revert complex nested JavaScript objects. The project provides an implementation compliant with the RFC 6902 JSON Patch standard for atomic updates, and a visual diff renderer that converts data deltas into human-readable HTML views. It includes a specialized text diff tool for fine-grained, character-level analysis of long strings within JSON data values. The library covers a wide range of capabilities, including recursive delta generation and array diffing using longest common subsequence (LCS) algorithms. It supports logical array diffing to detect moved items via custom hashing, and offers multi-format output options such as color-coded console formatting and a dedicated React component for visual comparisons.
Identifies moved, added, or deleted items within JSON arrays using custom hashing instead of simple index matching.
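To show why hashing by identity matters, here is a deliberately naive Python sketch that matches array items by a caller-supplied key instead of by index. The function `diff_by_key` and its return shape are hypothetical, keys are assumed unique within each array, and any index change is reported as a move; the library itself combines hashing with LCS, which this sketch does not attempt.

```python
def diff_by_key(old, new, key):
    """Classify items as added, deleted, or moved by matching on key(item).

    Assumptions: keys are hashable and unique within each list. Any change
    of index counts as a move (no LCS refinement is performed).
    """
    old_pos = {key(item): i for i, item in enumerate(old)}
    new_pos = {key(item): i for i, item in enumerate(new)}

    added = [k for k in new_pos if k not in old_pos]
    deleted = [k for k in old_pos if k not in new_pos]
    moved = [
        (k, old_pos[k], new_pos[k])       # (key, old index, new index)
        for k in new_pos
        if k in old_pos and old_pos[k] != new_pos[k]
    ]
    return {"added": added, "deleted": deleted, "moved": moved}


old = [{"id": 1}, {"id": 2}, {"id": 3}]
new = [{"id": 3}, {"id": 1}, {"id": 4}]
print(diff_by_key(old, new, key=lambda item: item["id"]))
# {'added': [4], 'deleted': [2], 'moved': [(3, 2, 0), (1, 0, 1)]}
```

With plain index matching, the same input would look like three modified elements; matching on a stable key is what lets the diff express the change as one insertion, one deletion, and reordering.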
Arroyo is a high-performance stream processing platform built in Rust. It executes continuous SQL queries on streaming data with event-time semantics, enabling accurate windowed aggregations, joins, and stateful computations on unbounded event streams. The platform uses native Rust execution for high throughput and low latency, with periodic checkpointing for exactly-once fault tolerance and horizontal scaling across distributed workers. The system integrates deeply with Kafka for reading and writing topics with exactly-once delivery and supports change data capture (CDC) from MySQL and Postg
Provides built-in SQL functions for JSONPath extraction and array transformation on streaming data.
Rueidis is a high-performance Redis client library for Go that provides a type-safe and asynchronous interface for interacting with Redis servers. It includes a full implementation of the Redis serialization protocol and a dedicated connection manager to handle pooling, multiplexing, and automatic pipelining. The library is distinguished by its support for RDMA connectivity to reduce latency and CPU overhead. It features a distributed lock manager that implements majority-based locking and optimistic concurrency control, as well as client-side caching with invalidation signals to minimize net
Includes a utility that scans JSON array results directly into slices of structured types.