What are the best Data Iterators GitHub repositories?

Data iterators are programming components that provide sequential access to the elements of a large data collection during processing. This page lists 21 GitHub repositories matching Data & Databases · Data Iterators. Top picks: kamranahmedse/developer-roadmap, deepfakes/faceswap, google/leveldb, immutable-js/immutable-js, huggingface/datasets, samber/lo, qax-os/excelize, RaRe-Technologies/gensim, home-assistant/home-assistant.io, electronicarts/EASTL.

Why is kamranahmedse/developer-roadmap a recommended Data Iterators repository?

Provides sequential access to elements within large data collections during processing.

Why is deepfakes/faceswap a recommended Data Iterators repository?

Serves as a base class for plugins to ingest and pass information through the extraction pipeline.

Why is google/leveldb a recommended Data Iterators repository?

Provides sequential iterators for traversing stored entries in forward or backward order.

Why is immutable-js/immutable-js a recommended Data Iterators repository?

Implements memory-efficient lazy iterators that defer data transformations until values are explicitly requested.
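
To make the idea of deferred transformations concrete, here is a minimal sketch in Python. It does not use Immutable.js or its API; the names `square`, `lazy_pipeline`, and `calls` are made up for illustration. It only shows that chaining lazy steps does no work until values are pulled.

```python
from itertools import islice

calls = []  # records which inputs were actually transformed


def square(x):
    calls.append(x)
    return x * x


def lazy_pipeline(source):
    """Chain a map step and a filter step without running either yet."""
    mapped = map(square, source)                  # nothing is computed here
    return filter(lambda v: v % 2 == 0, mapped)   # still nothing is computed


pipeline = lazy_pipeline(range(1_000_000))
assert calls == []  # building the chain did no work

# Pull only three results; the rest of the million inputs are never touched.
first_three = list(islice(pipeline, 3))
print(first_three)  # [0, 4, 16]
print(calls)        # [0, 1, 2, 3, 4]
```

Only five inputs are squared to produce three even results, which is the memory and time saving that lazy iterators are meant to provide.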

Why is huggingface/datasets a recommended Data Iterators repository?

Implements lazy, memory-efficient iterators to process large datasets on demand without loading them into physical memory.

Why is samber/lo a recommended Data Iterators repository?

Provides a comprehensive toolkit for memory-efficient, lazy data traversal and deferred computation of large or infinite sequences in Go.

Why is qax-os/excelize a recommended Data Iterators repository?

Emits data iteratively to maintain low memory usage during large-scale file processing.

Why is RaRe-Technologies/gensim a recommended Data Iterators repository?

Implements data iterators to stream large text collections from disk, avoiding memory exhaustion.
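
Streaming a text collection from disk can be sketched in plain Python as follows. This is not Gensim code: `LineCorpus` is a hypothetical helper, and the one-document-per-line file layout is an assumption made for the example.

```python
import os
import tempfile


class LineCorpus:
    """Hypothetical helper: yields one tokenized document per line of a file."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # Reopen the file on every pass so the corpus can be iterated repeatedly.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:  # file objects are read one line at a time
                yield line.lower().split()


# A small temporary file stands in for a corpus too large to hold in memory.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("Human machine interface\nGraph of trees\n")
    corpus_path = tmp.name

corpus = LineCorpus(corpus_path)
token_counts = [len(doc) for doc in corpus]  # first pass over the file
second_pass = list(corpus)                   # a second pass works as well
os.remove(corpus_path)
```

Because `__iter__` is a generator, only one line is held in memory at a time, and reopening the file on each call lets multi-pass algorithms iterate the same corpus more than once.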

Why is home-assistant/home-assistant.io a recommended Data Iterators repository?

Converts lazy sequences produced by filters into static lists to enable counting and sorting.
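
The same pattern exists in plain Python, which gives a rough picture of why materializing a lazy sequence matters. The sketch below is not Home Assistant template code; the `temperatures` data and variable names are invented for illustration.

```python
temperatures = {"kitchen": 21.5, "attic": 17.0, "office": 23.0, "garage": 12.5}

# filter() returns a lazy, single-pass iterator, not a list.
warm = filter(lambda item: item[1] > 15, temperatures.items())

try:
    len(warm)
except TypeError:
    lazy_has_len = False  # a lazy iterator cannot report its length
else:
    lazy_has_len = True

warm_rooms = list(warm)        # materialize the lazy sequence once
room_count = len(warm_rooms)   # counting now works
by_temperature = sorted(warm_rooms, key=lambda item: item[1], reverse=True)
leftover = list(warm)          # the original iterator is already exhausted
```

Once the values are in a list they can be counted, sorted, and reused, whereas the lazy iterator can be consumed only once.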

Why is electronicarts/EASTL a recommended Data Iterators repository?

Provides standardized iterators for traversing diverse data collections without exposing underlying memory layouts.

21 repositories

kamranahmedse/developer-roadmap
357,434 stars
Developer Roadmap is a community-driven platform that provides structured, graph-based learning paths for software engineering. It serves as a comprehensive knowledge repository in which technical domains are organized into visual sequences to guide professional skill acquisition and career growth. The project features a collaborative ecosystem that lets users contribute to roadmaps, curate industry best practices, and maintain professional profiles. It integrates diagnostic assessment frameworks for evaluating technical proficiency, helping developers identify knowledge gaps and prepare for job interviews through targeted learning sequences. Beyond its core planning capabilities, the platform offers hands-on project ideas and interactive lessons to reinforce engineering concepts. It also provides a central space for the community to share resources, track incremental skill development, and navigate complex technical landscapes.
Provides sequential access to elements within large data collections during processing.
TypeScript · angular-roadmap · backend-roadmap · blockchain-roadmap

deepfakes/faceswap
55,289 stars
Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users to map facial identities between source and destination datasets while maintaining structural alignment and lighting consistency across video frames. The project distinguishes itself through a highly extensible plugin-based architecture that handles hardware-accelerated process
Serves as a base class for plugins to ingest and pass information through the extraction pipeline.
Python · deep-face-swap · deep-learning · deep-neural-networks

google/leveldb
39,152 stars
LevelDB is an embedded database library and persistent storage engine that provides a sorted key-value store. It uses a log-structured merge-tree architecture to map byte arrays to values, running directly within a process to provide storage without the need for a separate server process. The system is distinguished by its use of custom comparison functions to define key ordering, enabling efficient range scans and sequenced lookups. It ensures data reliability through atomic batch execution, consistent snapshot generation, and log-based recovery after failures. The engine covers broad capab
Provides sequential iterators for traversing stored entries in forward or backward order.
C++

immutable-js/immutable-js
33,060 stars
Immutable.js is a library of persistent data structures and a functional state management toolkit. It provides a collection of immutable objects and arrays that prevent direct mutation to ensure predictable state management in JavaScript applications. The library utilizes structural sharing to efficiently create new versions of data without full copying and implements lazy sequence processing to chain data transformations that execute only when values are requested. It also supports batch mutation processing, allowing multiple changes to be applied to a temporary mutable copy before returning
Implements memory-efficient lazy iterators that defer data transformations until values are explicitly requested.
TypeScript

huggingface/datasets
21,643 stars
Datasets is a library designed for the management, processing, and sharing of large-scale data collections for machine learning workflows. It functions as both a data processing framework and a versioning platform, providing tools to organize, filter, and transform massive datasets while ensuring reproducibility across research and development teams. The library distinguishes itself by enabling the handling of datasets that exceed available system memory. It utilizes memory-mapped file access, disk-based caching, and lazy iterative streaming to maintain performance when working with large-sca
Implements lazy, memory-efficient iterators to process large datasets on demand without loading them into physical memory.
Python · ai · artificial-intelligence · computer-vision

samber/lo
21,333 stars
This library is a collection of generic utilities for the Go programming language designed to simplify the manipulation of slices and maps. It provides a functional toolkit that enables developers to perform data transformations, such as filtering, mapping, and reducing, while maintaining strict type safety through the use of language-level generics. The project distinguishes itself by offering a dual approach to data processing that balances functional programming patterns with performance-oriented execution. It supports both immutable functional pipelines for predictable state transitions a
Provides a comprehensive toolkit for memory-efficient, lazy data traversal and deferred computation of large or infinite sequences in Go.
Go · constraints · contract · filterable

qax-os/excelize
20,682 stars
Excelize is a library for reading and writing spreadsheet files in the Office Open XML format. It provides a comprehensive suite of tools for programmatically creating, modifying, and analyzing workbooks, worksheets, and cell data, ensuring compatibility across various office software suites through structured XML serialization. The library distinguishes itself with a built-in formula calculation engine that evaluates complex mathematical and logical expressions directly against workbook data. It also features a memory-mapped streaming architecture, which allows for the efficient processing o
Emits data iteratively to maintain low memory usage during large-scale file processing.
Go · agent · ai · analytics

RaRe-Technologies/gensim
16,442 stars
Gensim is an unsupervised natural language processing toolkit designed for topic modeling, word embedding training, and the processing of large-scale text corpora. It provides a framework for discovering latent themes and semantic structures in text without the need for labeled data. The toolkit is distinguished by its ability to handle datasets that exceed system memory through iterator-based data streaming from disk. It also supports distributed model training, allowing complex modeling tasks to be executed across computer clusters. The library covers a broad range of analysis capabilities
Implements data iterators to stream large text collections from disk, avoiding memory exhaustion.
Python

home-assistant/home-assistant.io
9,466 stars
Home Assistant is a local home automation platform and server that acts as an IoT device orchestrator. It integrates diverse smart home hardware by wrapping third-party APIs into a standardized logic layer and stores all system state and historical statistics on local hardware to eliminate cloud dependencies. The system functions as a Matter IoT controller and an MQTT home automation bridge, allowing for local interoperability between different manufacturers. It features a state-based entity model and an internal event bus that decouple physical device logic from system automation. The platf
Converts lazy sequences produced by filters into static lists to enable counting and sorting.
HTML · documentation · hacktoberfest · hass

electronicarts/EASTL
9,273 stars
EASTL is a C++ Standard Template Library implementation consisting of containers, iterators, and algorithms. It provides cross-platform data structures and a template-based algorithm library designed for use in resource-constrained game engine environments. The library focuses on game engine memory management, providing specialized utilities that ensure predictable memory allocation and high-performance access for real-time applications. These containers maintain consistent behavior across different operating systems and hardware platforms. The project covers high-performance C++ development
Provides standardized iterators for traversing diverse data collections without exposing underlying memory layouts.
C++ · c-plus-plus · c-plus-plus-11 · c-plus-plus-14

iamseancheney/python_for_data_analysis_2nd_chinese_version
8,937 stars
This project is an educational resource and a collection of instructional materials for performing data manipulation and statistical analysis using Python. It provides a comprehensive set of guides and code examples for using the Pandas, NumPy, and Matplotlib libraries to analyze structured data. The resource includes a dedicated guide for reshaping, cleaning, and aggregating tabular data and time series via Pandas, alongside a reference for high-performance vectorized operations and linear algebra using NumPy. It also features tutorials for creating publication-quality charts, distribution p
Uses generators to produce sequences of values on demand, reducing memory consumption for large datasets.
matplotlib · numpy · pandas

nodejs/nodejs.org
6,842 stars
Node.js is an open-source, cross-platform JavaScript runtime environment built on the V8 engine, designed for executing JavaScript code outside a web browser. It operates as a server-side JavaScript platform with an event-driven, non-blocking I/O architecture that enables building scalable network applications and web servers. The runtime integrates the CommonJS module system for synchronous module loading and the npm ecosystem for sharing and reusing packages. The platform provides comprehensive capabilities for web server development, including creating HTTP and HTTPS servers, managing HTTP
Supports processing streaming data with async iterators for chunk-by-chunk consumption without full buffering.
TypeScript · nextjs · node · nodejs

dtao/lazy.js
5,975 stars
Lazy.js is a JavaScript library that implements a lazy evaluation model for processing collections and data streams. It defers all computation until iteration begins, building chains of transformations that execute only when values are consumed, avoiding intermediate arrays and buffering. The library wraps data sources into a uniform sequence interface, enabling operations like map and filter to be chained together without materializing intermediate results. The library extends lazy processing beyond simple collections to handle asynchronous data sources, DOM events, strings, and Node.js stre
Integrates with asynchronous data sources by yielding values at timed intervals or from streams without blocking.
JavaScript

hadley/r4ds
5,070 stars
r4ds is a data science curriculum and educational resource designed for mastering the R programming language. It provides a structured learning path for the end-to-end process of importing, tidying, transforming, and visualizing data. The project centers on a reproducible data science guide and a comprehensive approach to data wrangling. It includes dedicated tutorials on the grammar of graphics for layered data visualization and on technical publications built with Quarto that blend executable code with narrative prose. The material covers a wide range of analytical capabilities, including ingesting data from diverse sources, joining relational data, and managing categorical variables. It also addresses data cleaning, mathematical modeling, and producing professional multi-format reports and presentations. The curriculum emphasizes the practical application of functional programming and tidy data principles to create transparent, reproducible analyses.
Demonstrates how to apply a consistent set of actions across data collections using functional programming.
R

pytoolz/toolz
5,117 stars
Toolz is a Python library that implements functional programming utilities for iterable transformation, dictionary manipulation, function composition, and lazy evaluation. It provides a set of pure functions designed to work with Python's built-in data structures, enabling concise and composable data processing workflows. What distinguishes toolz is its support for curried partial application, allowing functions to be incrementally applied and reused. It includes dictionary-centric operations that handle nested structures, and offers iterable chain transformers that combine mapping, filtering
Processes sequences on-demand using generators for memory-efficient handling of large data streams.
Python

gajus/slonik
4,910 stars
Slonik is a type-safe PostgreSQL client for Node.js that uses tagged template literals to ensure parameters are bound and protected against injection attacks. It provides a framework for connecting applications to PostgreSQL with automatic type checking for queries and database schemas. The project distinguishes itself through a specialized SQL query linter that detects invalid columns and type mismatches by verifying code against a live database schema during the development process. It also includes a high-performance binary bulk data inserter for loading large datasets using native binary
Provides memory-efficient processing of large database result sets using async iterable streams.
TypeScript

pytorch/ignite
4,770 stars
Ignite is a high-level training framework for PyTorch neural networks that serves as a training engine and deep learning lifecycle manager. It provides a structured system for organizing and automating training and evaluation loops, managing data iterators and triggering event handlers at specific milestones during the model training process. The project distinguishes itself through a comprehensive suite of tools for distributed training and model evaluation. It includes utilities for synchronizing gradients and coordinating collective communication across multiple GPUs or nodes, as well as a
Controls finite or infinite data streams by determining epoch lengths or restarting exhausted iterators.
Python
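
The idea of fixing an epoch length and restarting an exhausted iterator can be sketched in plain Python. This is not Ignite's API: `run_epochs` and `make_iterator` are hypothetical names, and the sketch assumes the data source is never empty (an empty source would loop forever).

```python
def run_epochs(make_iterator, epoch_length, max_epochs):
    """Hypothetical loop: every epoch takes exactly `epoch_length` batches,
    restarting the data iterator whenever it runs out.

    Assumes `make_iterator()` always yields at least one item.
    """
    iterator = make_iterator()
    epochs = []
    for _ in range(max_epochs):
        batches = []
        while len(batches) < epoch_length:
            try:
                batches.append(next(iterator))
            except StopIteration:
                iterator = make_iterator()  # restart the exhausted iterator
        epochs.append(batches)
    return epochs


data = [10, 20, 30]
epochs = run_epochs(lambda: iter(data), epoch_length=2, max_epochs=3)
print(epochs)  # [[10, 20], [30, 10], [20, 30]]
```

Decoupling the epoch length from the length of the data is what lets the same loop handle both finite and effectively infinite streams.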

stripe/stripe-node
4,442 stars
This is a typed server-side library and payment gateway SDK for integrating Stripe into Node.js applications. It provides a typed client for managing payments, customers, and subscriptions, along with specialized tools for executing secure financial transactions and managing billing resources. The library features an idempotent API client that prevents duplicate operations by using idempotency keys and exponential-backoff retry logic. It includes a webhook signature verifier that confirms incoming HTTPS event notifications are authentic, and an async-iterator-based pagination wrapper for traversing large datasets. The project covers a broad range of capabilities, including subscription billing management, payment platform orchestration for connected accounts, and resource search. It offers comprehensive response handling through object expansion and field selection, alongside security features for API request authentication and webhook verification. The library is written in TypeScript.
Uses JavaScript async iterators to stream paginated data from the API without buffering the entire payload.
TypeScript

xtensor-stack/xtensor
3,748 stars
xtensor is a C++ multidimensional array library for numerical computing that provides N-dimensional containers with an interface mirroring the NumPy API. It utilizes a lazy evaluation expression engine to defer numerical computations until assignment, which minimizes memory allocations and intermediate copies. The library features a foreign memory array adaptor that allows it to wrap external buffers, such as NumPy arrays, to perform numerical operations in-place without duplicating data. It further optimizes performance through lazy broadcasting and a system that manages the lifetime of temp
Provides memory-efficient, STL-compatible forward and reverse iterators to process tensor data.
C++ · c-plus-plus-14 · multidimensional-arrays · numpy

NVIDIA/cuda-python
3,170 stars
cuda-python provides low-level Python bindings for the CUDA Driver and Runtime APIs. It serves as a programmatic wrapper for controlling device memory, managing hardware toolchains, and orchestrating execution graphs on NVIDIA GPUs, allowing for the compilation and launching of parallel kernels directly from Python. The project enables the development of SIMT kernels and the execution of mathematical algorithms on device memory. It integrates pre-compiled bytecode as custom operators and interfaces with accelerated device libraries to access low-level hardware functions without leaving the la
Uses iterators to compute sequence elements on demand, minimizing the allocation of large intermediate arrays.
Cython
