Why is pola-rs/polars a recommended repository for nested data manipulations?

Processes list-type data columns using dedicated functions for aggregation and element-wise computation.

Why is facebook/immutable-js a recommended repository for nested data manipulations?

Offers functions for processing and updating complex, hierarchical data structures using path-based accessors.

Why is scrapinghub/portia a recommended repository for nested data manipulations?

Captures complex hierarchical data structures from web pages by nesting extracted items.

Why is databendlabs/databend a recommended repository for nested data manipulations?

Flattens nested arrays or objects into individual rows to facilitate analysis of complex data structures.

Why is saulpw/visidata a recommended repository for nested data manipulations?

Flattens complex hierarchical data like JSON arrays into multiple top-level columns.

Why is jackzhenguo/python-small-examples a recommended repository for nested data manipulations?

Provides a generator that recursively converts multi-level nested lists into a single flat list.

Why is autoscrape-labs/pydoll a recommended repository for nested data manipulations?

Resolves hierarchical data by defining scope elements to extract nested sub-models from the DOM.

Why is vandadnp/flutter-tips-and-tricks a recommended repository for nested data manipulations?

Provides utilities to flatten lists containing sublists into a single linear sequence.

Why is robinhood/faust a recommended repository for nested data manipulations?

Maintains complex types like lists and sets within tables by utilizing a transaction log for the changelog.

19 repositories

Awesome GitHub Repositories: Nested Data Manipulations

Functions for processing complex, hierarchical data structures within columnar formats.

Distinguishing note: Specifically targets list-type columns rather than flat data structures.

Explore 19 awesome GitHub repositories matching data & databases · Nested Data Manipulations. Refine with filters or upvote what's useful.


sindresorhus/awesome
476,211 stars
This project is a community-maintained directory that serves as a comprehensive index of software tools, frameworks, and educational materials. It functions as an open-source knowledge base, organizing diverse engineering domains and technical resources into a structured taxonomy to help developers discover high-quality content. The directory distinguishes itself through a decentralized peer-review model in which independent contributors curate, verify, and update entries to ensure accuracy and relevance. All information is stored in a version-controlled, flat-file markdown format, which ensures platform independence, transparency, and auditability of the entire collection. The project covers a broad surface of capabilities, including technical resource discovery, professional career advancement, and software development knowledge management. It offers access to structured learning paths, infrastructure and security tools, data management utilities, and specialized resources for domains ranging from healthcare to the digital humanities. The repository is maintained as a public, version-controlled collection, allowing programmatic access and community-driven updates to its structured data.
Offers functions for processing complex, hierarchical data structures.
awesome, awesome-list, lists
pola-rs/polars
38,855 stars
Polars is a high-performance columnar data processing library designed for efficient analytical workflows. It functions as a structured data library that organizes information into typed columns, utilizing the Apache Arrow memory format to enable zero-copy data sharing and cache-friendly, vectorized operations. The engine is built to handle large-scale tabular datasets, providing both local and distributed analytical runtimes that scale from single-machine environments to multi-node clusters. The project distinguishes itself through a sophisticated lazy query engine that constructs abstract e
Processes list-type data columns using dedicated functions for aggregation and element-wise computation.
Rust · arrow, dataframe, dataframe-library
facebook/immutable-js
33,060 stars
This is a persistent data structure library for JavaScript that provides collections which prevent the direct mutation of objects and arrays. It serves as an immutable state management tool and functional programming utility, ensuring that data remains unchanged after creation to simplify change detection and state tracking. The library enables the maintenance of application state by producing new versions of data structures during updates. It focuses on efficient data comparison by checking actual content instead of memory references and supports a functional programming workflow to prevent
Offers functions for processing and updating complex, hierarchical data structures using path-based accessors.
TypeScript
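To illustrate what "path-based accessors" means in general, here is a minimal Python sketch. Immutable.js is a JavaScript library, and this is not its API: `get_in` and `set_in` are hypothetical helper names chosen for this example, and the sketch assumes plain dicts and lists as the nested data.

```python
from typing import Any, Sequence


def get_in(data: Any, path: Sequence[Any], default: Any = None) -> Any:
    """Follow `path` (dict keys or list indexes) into `data` (hypothetical helper)."""
    current = data
    for key in path:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


def set_in(data: Any, path: Sequence[Any], value: Any) -> Any:
    """Return a copy of `data` with `value` placed at `path`; `data` is not mutated."""
    if not path:
        return value
    head, rest = path[0], path[1:]
    if isinstance(data, list):
        copy = list(data)  # copy only the level being changed
        copy[head] = set_in(data[head], rest, value)
        return copy
    # Missing or non-dict intermediates are replaced by a fresh dict (an assumption).
    copy = dict(data) if isinstance(data, dict) else {}
    copy[head] = set_in(copy.get(head), rest, value)
    return copy


state = {"user": {"name": "Ada", "tags": ["a", "b"]}}
updated = set_in(state, ["user", "tags", 1], "z")
```

Only the containers along the path are copied, so untouched branches are shared between `state` and `updated`, while `state` itself stays unchanged.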
scrapinghub/portia
9,509 stars
Portia is a containerized scraping platform and visual web scraper that enables no-code data extraction. It serves as a Scrapy visual scraping tool and spider generator, allowing users to design and deploy web scrapers through a graphical interface instead of writing manual selector code. The system distinguishes itself by converting visual web page annotations into executable Scrapy spider code and structured JSON specifications. This visual-to-code mapping allows users to define scraping logic and extraction rules through a point-and-click interface, which can then be exported for use in ex
Captures complex hierarchical data structures from web pages by nesting extracted items.
Python
databendlabs/databend
9,351 stars
Databend is a cloud-native data warehouse and OLAP database designed for large-scale analytics. It functions as a SQL-compliant engine and serverless analytics platform that separates compute from storage to allow for independent scaling. The system integrates vector database capabilities, indexing high-dimensional embeddings to enable semantic, hybrid, and full-text searches across massive datasets. It further distinguishes itself through serverless compute management that automatically scales resources based on demand and shuts them down during idle periods. The platform covers a broad set
Flattens nested arrays or objects into individual rows to facilitate analysis of complex data structures.
Rust · ai, bigdata, cloud-native
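The idea of flattening a nested array into individual rows can be sketched in plain Python. This is not Databend's SQL syntax or implementation: `explode` is a hypothetical helper, and the sample `orders` data is invented for illustration.

```python
from typing import Any, Iterator


def explode(rows: list[dict[str, Any]], column: str) -> Iterator[dict[str, Any]]:
    """Yield one output row per element of the list stored in `column`."""
    for row in rows:
        values = row.get(column)
        if not isinstance(values, list):
            # Non-list values pass through unchanged (an assumption of this sketch).
            yield dict(row)
            continue
        for value in values:
            # Repeat the other fields and replace the list with a single element.
            yield {**row, column: value}


orders = [
    {"order_id": 1, "items": ["pen", "ink"]},
    {"order_id": 2, "items": ["paper"]},
]
flat_rows = list(explode(orders, "items"))
```

In this sketch a row whose list is empty produces no output rows; whether to keep such rows is a design choice that depends on the analysis.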
saulpw/visidata
8,834 stars
VisiData is a terminal-based interactive data analysis tool and browser designed for exploring, filtering, and sorting large tabular datasets. It functions as a structured data inspector that loads and flattens complex formats like JSON, XML, and PCAP into interactive sheets, as well as a terminal file manager for navigating directories and performing staged filesystem operations. The project distinguishes itself by rendering data visualizations, such as scatter plots and histograms, directly in the terminal using Unicode Braille characters. It provides a Python-based data wrangling environme
Flattens complex hierarchical data like JSON arrays into multiple top-level columns.
Python · cli, csv, datajournalism
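As a rough illustration of flattening hierarchical JSON into top-level columns, the sketch below turns a nested record into a single-level mapping. It does not use VisiData's code or commands: `flatten_record` and the dotted column naming are assumptions made for this example.

```python
from typing import Any


def flatten_record(value: Any, prefix: str = "") -> dict[str, Any]:
    """Map a nested dict/list to {dotted column name: scalar value}."""
    columns: dict[str, Any] = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = f"{prefix}.{key}" if prefix else str(key)
            columns.update(flatten_record(child, name))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            name = f"{prefix}.{index}" if prefix else str(index)
            columns.update(flatten_record(child, name))
    else:
        # Scalars become a column named after the path that led to them.
        columns[prefix] = value
    return columns


record = {"id": 7, "geo": {"lat": 1.5, "lon": 2.5}, "tags": ["x", "y"]}
row = flatten_record(record)
```

Here `row` has the columns `id`, `geo.lat`, `geo.lon`, `tags.0`, and `tags.1`; empty dicts and lists contribute no columns in this sketch.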
jackzhenguo/python-small-examples
8,132 stars
This project is a comprehensive library of practical Python code examples and patterns. It provides a collection of scripts and snippets designed to demonstrate a wide range of programming tasks, from basic syntax to advanced implementation patterns. The repository focuses on several core domains, including the implementation of concurrency and multithreading examples, data analysis snippets for cleaning and manipulating tabular data, and various data visualization examples. It also covers automation scripts for file system management and a variety of general programming patterns. Additional
Provides a generator that recursively converts multi-level nested lists into a single flat list.
Python · data-science, machine-learning, python
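A recursive generator of the kind described above might look like the following. This is a minimal sketch written for this listing, not the repository's own snippet; it assumes that only `list` and `tuple` values count as nesting, so strings are kept whole.

```python
from collections.abc import Iterable, Iterator
from typing import Any


def flatten(items: Iterable[Any]) -> Iterator[Any]:
    """Lazily yield the leaves of an arbitrarily nested list, depth first."""
    for item in items:
        if isinstance(item, (list, tuple)):
            # Delegate to the nested level and pass its values through.
            yield from flatten(item)
        else:
            yield item


nested = [1, [2, [3, 4]], [], [[5]], "six"]
flat = list(flatten(nested))
```

Because `flatten` is a generator, values are produced one at a time; wrapping the call in `list(...)` materializes the single flat list.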
autoscrape-labs/pydoll
6,919 stars
pydoll is a Chrome DevTools Protocol automation library and headless browser controller used for web data extraction and parallel browser automation. It controls Chromium-based browsers via direct WebSocket connections, allowing it to manage isolated browser contexts and tabs while bypassing the overhead and detection associated with WebDriver. The project features an anti-bot evasion framework that mimics natural human behavior, including mouse movements generated via Bezier curves and variable typing patterns. It provides specialized stealth capabilities to bypass behavioral analysis and au
Resolves hierarchical data by defining scope elements to extract nested sub-models from the DOM.
Python · anti-detection, automation, browser-automation
vandadnp/flutter-tips-and-tricks
6,822 stars
This repository is a collection of practical code snippets and implementation patterns for Flutter and Dart. It serves as a comprehensive guide and reference for asynchronous programming, state management patterns, and UI component design. The project provides advanced language reference material covering generics, reflection, factory constructors, and null-aware operators. It also includes specific utilities for manipulating Dart collections, such as helper methods for transforming and filtering maps, lists, and iterables. The coverage extends to high-level capabilities including asynchrono
Provides utilities to flatten lists containing sublists into a single linear sequence.
Dart · dart, flutter, flutter-ui
robinhood/faust
6,822 stars
Faust is a Python library for building distributed stream processing applications that integrate with Kafka. It functions as an asynchronous stream processor designed to handle high-throughput event streams and real-time data analysis using asynchronous functions. The system operates as a distributed stream processor and state store, utilizing sharding and partitioned topics to scale processing workloads horizontally across multiple worker nodes. It maintains state through a replicated key-value storage system backed by local databases to ensure high availability and fast recovery. The frame
Maintains complex types like lists and sets within tables by utilizing a transaction log for the changelog.
Python
ecrmnn/collect.js
6,571 stars
collect.js is a dependency-free JavaScript library that provides a fluent, chainable interface for manipulating arrays and objects. It mirrors the Laravel Collection API, offering a consistent set of methods for data transformation across JavaScript and Laravel backend environments. The library stores collection data as plain arrays internally and supports fluent method chaining, where each method returns a new collection instance. The library distinguishes itself by closely replicating the Laravel Collection API in JavaScript, mapping each PHP method to an equivalent JavaScript implementatio
Recursively reduces a multi-dimensional array or object into a single-level collection.
JavaScript · collection, laravel, laravel-collections
sebastianbergmann/object-enumerator
6,537 stars
Object-enumerator is a data structure crawler and enumeration library designed to discover and list all objects stored within deep or circular data references. It functions as a traversal tool that recursively walks through nested arrays and object graphs to identify every individual referenced object. The library flattens complex hierarchical data structures into a linear collection of unique objects. This process enables data structure analysis and memory reference mapping by tracing all objects connected to a root element to understand the overall composition of a data set.
Flattens complex object hierarchies into a comprehensive, linear list of all contained individual objects.
PHP
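The traversal described above, walking nested arrays and object graphs that may contain circular references, can be sketched in Python as follows. The library itself is written in PHP, so this is only an analogy: `enumerate_objects` and the `Node` class are hypothetical, and "object" here is assumed to mean any value with a `__dict__`.

```python
from typing import Any


def enumerate_objects(root: Any) -> list[Any]:
    """Collect each distinct object reachable from `root`, tolerating cycles."""
    seen: set[int] = set()  # identities already visited
    found: list[Any] = []
    stack = [root]
    while stack:
        current = stack.pop()
        is_container = isinstance(current, (dict, list, tuple, set))
        is_object = hasattr(current, "__dict__")
        if not (is_container or is_object) or id(current) in seen:
            continue  # scalars and already-visited values are skipped
        seen.add(id(current))
        if is_object:
            found.append(current)
            stack.extend(vars(current).values())
        elif isinstance(current, dict):
            stack.extend(current.values())
        else:
            stack.extend(current)
    return found


class Node:
    """Hypothetical example class used only to build a cyclic graph."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children: list["Node"] = []


a, b = Node("a"), Node("b")
a.children.append(b)
b.children.append(a)  # circular reference
objs = enumerate_objects([a, {"again": b}])
```

Tracking visited values by identity rather than equality is what keeps the walk finite on circular structures and ensures each object is listed once.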
wisdompeak/LeetCode
6,186 stars
This project is a curated library of algorithm implementations and solved programming problems. It serves as a reference repository for competitive programming and data structure implementations, offering optimized solutions for a wide range of coding challenges. The collection organizes code examples by algorithmic technique, focusing specifically on implementing trees, graphs, and heaps to optimize time and space complexity. It offers language-specific solutions used for high-performance coding tasks. The repository covers a broad set of capabilities, including graph traversals, dynamic programming, string pattern processing, and binary search operations. It also includes implementations for range data queries, bit manipulation, and the design of custom data structures such as caches and autocomplete engines. Additional coverage includes mathematical computations and contest performance tracking.
Provides recursive logic to convert hierarchical nested list structures into linear integer sequences.
C++
apache/pinot
6,098 stars
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Transforms hierarchical or array-based data into individual rows during ingestion to enable granular analysis of nested collections and multi-level arrays.
Java
pest-parser/pest
5,355 stars
Pest is a parsing library for Rust and an automatic parser generator that turns formal grammar definitions into working parsers. It specializes in Parsing Expression Grammar (PEG) to recognize and structure complex text patterns, providing a system for parsing context-free grammars. The library implements zero-copy tokenization and static grammar compilation to reduce runtime overhead. It supports no-std compatibility, allowing the parser to be compiled for embedded or bare-metal environments where the standard library is unavailable. The project covers a range of parsing capabilities, including extraction of nested token pairs and automated syntax validation. It is used for implementing domain-specific languages (DSLs), parsing custom languages, and evaluating mathematical expressions. It also provides automatic error reporting to identify unexpected tokens or missing input.
Provides a hierarchical tree of rule names and byte offsets for efficient traversal of parsed input.
Rust
tidyverse/dplyr
5,034 stars
dplyr is an R library for data manipulation that provides a grammar for transforming tabular data frames. It functions as an in-memory data frame processor and a relational algebra tool, using a consistent set of verbs to filter, select, and summarize data. The project includes a SQL translation engine that converts high-level data manipulation expressions into optimized queries. This lets users perform transformations directly on remote relational databases and cloud storage without downloading the data locally. The library covers a wide range of tabular operations, including column mutation, row subsetting, and relational data joins. It also provides capabilities for grouped data analysis, allowing datasets to be partitioned for independent aggregations and summaries.
Collapses rows into list-columns based on grouping variables to enable hierarchical analysis.
R
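Collapsing rows into a list-column by a grouping variable can be illustrated with a small Python analogy. dplyr is an R package and none of the code below is its API: `nest_by` as written here is a hypothetical Python helper, and the sample rows are invented.

```python
from collections import defaultdict
from typing import Any


def nest_by(rows: list[dict[str, Any]], key: str, column: str) -> list[dict[str, Any]]:
    """Return one row per distinct `key` value, with `column` gathered into a list."""
    groups: dict[Any, list[Any]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[column])
    # Groups keep first-seen order because dicts preserve insertion order.
    return [{key: group, column: values} for group, values in groups.items()]


rows = [
    {"species": "cat", "weight": 4.0},
    {"species": "dog", "weight": 9.5},
    {"species": "cat", "weight": 5.5},
]
nested = nest_by(rows, "species", "weight")
```

Each output row now holds a list in `weight`, which is the inverse of exploding a list-column back into one row per element.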
awangdev/leet-code
4,344 stars
This project is a curated reference library of algorithmic patterns, data structure implementations, and system design notes. It serves as a Java algorithm problem set and competitive programming guide, offering a collection of solutions to coding challenges from platforms such as LeetCode and LintCode. The library distinguishes itself through its comprehensive set of Java implementations for advanced data structures and algorithmic strategies. It includes detailed references for solving complex problems, with the associated time and space complexity analysis. The project covers a broad surface of computer science fundamentals, including algorithm design, data structure implementation, and system design. Its content spans graph theory, dynamic programming, search and optimization, and linear data processing techniques. It also includes notes on infrastructure scalability, performance caching, and software architecture patterns.
Provides recursive methods to convert multi-level nested list structures into linear sequences.
Java · algorithm, dynamicprogramming, java
praw-dev/praw
4,168 stars
PRAW is a Python wrapper for the Reddit API, functioning as a REST API client and social media data crawler. It provides a programmatic interface for retrieving data, managing account content, and interacting with the platform. The library implements a full OAuth 2.0 client, supporting multiple authorization flows, including password, implicit, and refresh-token flows, as well as read-only guest access. It distinguishes itself through a rate-limit-aware request scheduler that monitors server limits to prevent API quota exhaustion, and it uses lazy-loading objects to fetch data only when specific attributes are accessed. Capabilities cover community and user management, real-time streaming of posts via generators, and extraction of nested discussion threads. The toolkit also includes content management for creating posts and comments, along with network configuration options for custom sessions and proxy routing.
Provides recursive conversion of hierarchical discussion threads into linear lists of comment objects.
Python · api, oauth, praw
mikepenz/FastAdapter
3,877 stars
FastAdapter is a framework for Android development designed to simplify the creation of complex list interfaces. It functions as a modular controller for list views, providing a system to bind data models to custom view templates while reducing the boilerplate code typically required for managing list adapters. The library distinguishes itself through an adapter composition pattern that allows developers to aggregate multiple independent data sources into a single unified list. It utilizes a type-safe registry to map data models to specific view holders and employs a centralized event dispatc
Transforms nested data structures into a linear list representation to allow efficient rendering of expandable content.
Kotlin · adapter, android, android-development