21 repositories
Programming components that provide sequential access to elements within a large data collection during processing.
Explore 21 awesome GitHub repositories matching Data & Databases · Data Iterators.
Developer Roadmap is a community-driven platform that provides structured, graph-based learning paths for software engineering. It serves as a comprehensive knowledge repository in which technical domains are organized into visual sequences to guide professional skill acquisition and career growth. The project features a collaborative ecosystem that lets users contribute to roadmaps, curate industry best practices, and maintain professional profiles. It integrates diagnostic assessment frameworks for evaluating technical competency, helping developers identify knowledge gaps and prepare for professional interviews through targeted learning sequences. Beyond its core planning capabilities, the platform offers practical project ideas and interactive tutorials to reinforce engineering concepts. It also provides a central space for the community to share resources, track incremental skill development, and navigate complex technical landscapes.
Provides sequential access to elements within large data collections during processing.
Faceswap is a comprehensive framework for automated media manipulation and neural face synthesis. It provides a modular pipeline that manages the entire lifecycle of facial feature extraction, deep learning model training, and image conversion. By coordinating complex computer vision workflows, the system enables users to map facial identities between source and destination datasets while maintaining structural alignment and lighting consistency across video frames. The project distinguishes itself through a highly extensible plugin-based architecture that handles hardware-accelerated process
Serves as a base class for plugins to ingest and pass information through the extraction pipeline.
LevelDB is an embedded database library and persistent storage engine that provides a sorted key-value store. It uses a log-structured merge-tree architecture to map byte arrays to values, running directly within a process to provide storage without the need for a separate server process. The system is distinguished by its use of custom comparison functions to define key ordering, enabling efficient range scans and sequenced lookups. It ensures data reliability through atomic batch execution, consistent snapshot generation, and log-based recovery after failures. The engine covers broad capab
Provides sequential iterators for traversing stored entries in forward or backward order.
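Forward and backward traversal over sorted keys can be illustrated with a small in-memory sketch. This is not LevelDB's API: `SortedStore` and `scan` are hypothetical names, and a plain sorted list stands in for the on-disk structure.

```python
import bisect


class SortedStore:
    """Toy in-memory sorted key-value store (hypothetical, not LevelDB's API)."""

    def __init__(self, items):
        self._data = dict(items)
        self._keys = sorted(self._data)

    def scan(self, start=None, reverse=False):
        """Yield (key, value) pairs in key order, optionally backward."""
        if reverse:
            # Begin at the last key <= start, then walk toward the front.
            hi = len(self._keys) if start is None else bisect.bisect_right(self._keys, start)
            indices = range(hi - 1, -1, -1)
        else:
            # Begin at the first key >= start, then walk toward the end.
            lo = 0 if start is None else bisect.bisect_left(self._keys, start)
            indices = range(lo, len(self._keys))
        for i in indices:
            key = self._keys[i]
            yield key, self._data[key]


store = SortedStore({b"b": 2, b"a": 1, b"d": 4, b"c": 3})
forward = list(store.scan(start=b"b"))
backward = list(store.scan(start=b"c", reverse=True))
```

Because `scan` is a generator, a caller can stop a range scan early without visiting the remaining keys.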
Immutable.js is a library of persistent data structures and a functional state management toolkit. It provides a collection of immutable objects and arrays that prevent direct mutation to ensure predictable state management in JavaScript applications. The library utilizes structural sharing to efficiently create new versions of data without full copying and implements lazy sequence processing to chain data transformations that execute only when values are requested. It also supports batch mutation processing, allowing multiple changes to be applied to a temporary mutable copy before returning
Implements memory-efficient lazy iterators that defer data transformations until values are explicitly requested.
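The idea of deferring transformations until values are requested can be sketched in Python with generators and `itertools`. This illustrates lazy sequences in general, not Immutable.js itself; the `square` helper and the `calls` log are hypothetical and exist only to show when work actually happens.

```python
from itertools import count, islice

calls = []  # records each value the transformation actually touches


def square(n):
    calls.append(n)
    return n * n


# Build a chain over an infinite source; nothing is computed yet.
evens = (n for n in count(1) if n % 2 == 0)
squares = map(square, evens)
calls_before = list(calls)

# Requesting three values runs the chain for exactly three elements.
first_three = list(islice(squares, 3))
```

Only the elements that are consumed are ever transformed, which is why such a chain can sit on top of an unbounded source.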
Datasets is a library designed for the management, processing, and sharing of large-scale data collections for machine learning workflows. It functions as both a data processing framework and a versioning platform, providing tools to organize, filter, and transform massive datasets while ensuring reproducibility across research and development teams. The library distinguishes itself by enabling the handling of datasets that exceed available system memory. It utilizes memory-mapped file access, disk-based caching, and lazy iterative streaming to maintain performance when working with large-sca
Implements lazy, memory-efficient iterators to process large datasets on demand without loading them into physical memory.
This library is a collection of generic utilities for the Go programming language designed to simplify the manipulation of slices and maps. It provides a functional toolkit that enables developers to perform data transformations, such as filtering, mapping, and reducing, while maintaining strict type safety through the use of language-level generics. The project distinguishes itself by offering a dual approach to data processing that balances functional programming patterns with performance-oriented execution. It supports both immutable functional pipelines for predictable state transitions a
Provides a comprehensive toolkit for memory-efficient, lazy data traversal and deferred computation of large or infinite sequences in Go.
Excelize is a library for reading and writing spreadsheet files in the Office Open XML format. It provides a comprehensive suite of tools for programmatically creating, modifying, and analyzing workbooks, worksheets, and cell data, ensuring compatibility across various office software suites through structured XML serialization. The library distinguishes itself with a built-in formula calculation engine that evaluates complex mathematical and logical expressions directly against workbook data. It also features a memory-mapped streaming architecture, which allows for the efficient processing o
Emits data iteratively to maintain low memory usage during large-scale file processing.
Gensim is an unsupervised natural language processing toolkit designed for topic modeling, word embedding training, and the processing of large-scale text corpora. It provides a framework for discovering latent themes and semantic structures in text without the need for labeled data. The toolkit is distinguished by its ability to handle datasets that exceed system memory through iterator-based data streaming from disk. It also supports distributed model training, allowing complex modeling tasks to be executed across computer clusters. The library covers a broad range of analysis capabilities
Implements data iterators to stream large text collections from disk, avoiding memory exhaustion.
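Streaming a corpus from disk can be sketched as a re-iterable object that yields one tokenized document per line. The class name `LineCorpus` and the one-document-per-line file layout are assumptions for illustration, not Gensim's own classes.

```python
import os
import tempfile


class LineCorpus:
    """Re-iterable corpus that streams one whitespace-tokenized document per line."""

    def __init__(self, path):
        self.path = path

    def __iter__(self):
        # The file is reopened on every pass, so only one line is held in memory
        # at a time and the corpus can be traversed more than once.
        with open(self.path, encoding="utf-8") as handle:
            for line in handle:
                yield line.split()


# Write a tiny sample file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("human machine interface\ngraph minors survey\n")
    path = tmp.name

corpus = LineCorpus(path)
first_pass = list(corpus)
second_pass = sum(1 for _ in corpus)
os.remove(path)
```

Making the object re-iterable (rather than a one-shot generator) matters when training needs several passes over the same data.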
Home Assistant is a local home automation platform and server that acts as an IoT device orchestrator. It integrates diverse smart home hardware by wrapping third-party APIs into a standardized logic layer and stores all system state and historical statistics on local hardware to eliminate cloud dependencies. The system functions as a Matter IoT controller and an MQTT home automation bridge, allowing for local interoperability between different manufacturers. It features a state-based entity model and an internal event bus that decouple physical device logic from system automation. The platf
Converts lazy sequences produced by filters into static lists to enable counting and sorting.
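Why a lazy sequence has to be materialized before counting and sorting can be shown with a plain Python `filter` object. This is a generic sketch, not Home Assistant's template engine; the `readings` data is made up for the example.

```python
readings = {"kitchen": 21.5, "attic": 17.0, "office": 23.1, "garage": 12.4}

# filter() returns a lazy, single-use iterator: it has no len() and
# is empty after one pass.
warm = filter(lambda item: item[1] > 15, readings.items())

# Materialize once, then count and sort the static list as often as needed.
warm_list = list(warm)
warm_count = len(warm_list)
by_temperature = sorted(warm_list, key=lambda item: item[1])

# The original iterator is now exhausted.
leftover = list(warm)
```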
EASTL is a C++ Standard Template Library implementation consisting of containers, iterators, and algorithms. It provides cross-platform data structures and a template-based algorithm library designed for use in resource-constrained game engine environments. The library focuses on game engine memory management, providing specialized utilities that ensure predictable memory allocation and high-performance access for real-time applications. These containers maintain consistent behavior across different operating systems and hardware platforms. The project covers high-performance C++ development
Provides standardized iterators for traversing diverse data collections without exposing underlying memory layouts.
This project is an educational resource and a collection of instructional materials for performing data manipulation and statistical analysis using Python. It provides a comprehensive set of guides and code examples for using the Pandas, NumPy, and Matplotlib libraries to analyze structured data. The resource includes a dedicated guide for reshaping, cleaning, and aggregating tabular data and time series via Pandas, alongside a reference for high-performance vectorized operations and linear algebra using NumPy. It also features tutorials for creating publication-quality charts, distribution p
Uses generators to produce sequences of values on demand, reducing memory consumption for large datasets.
Node.js is an open-source, cross-platform JavaScript runtime environment built on the V8 engine, designed for executing JavaScript code outside a web browser. It operates as a server-side JavaScript platform with an event-driven, non-blocking I/O architecture that enables building scalable network applications and web servers. The runtime integrates the CommonJS module system for synchronous module loading and the npm ecosystem for sharing and reusing packages. The platform provides comprehensive capabilities for web server development, including creating HTTP and HTTPS servers, managing HTTP
Supports processing streaming data with async iterators for chunk-by-chunk consumption without full buffering.
Lazy.js is a JavaScript library that implements a lazy evaluation model for processing collections and data streams. It defers all computation until iteration begins, building chains of transformations that execute only when values are consumed, avoiding intermediate arrays and buffering. The library wraps data sources into a uniform sequence interface, enabling operations like map and filter to be chained together without materializing intermediate results. The library extends lazy processing beyond simple collections to handle asynchronous data sources, DOM events, strings, and Node.js stre
Integrates with asynchronous data sources by yielding values at timed intervals or from streams without blocking.
r4ds is a data science curriculum and educational resource designed for mastering the R programming language. It provides a structured learning path for the end-to-end process of importing, tidying, transforming, and visualizing data. The project centers on a reproducible data science guide and a comprehensive data-wrangling curriculum. It includes specialized tutorials on the grammar of graphics for layered data visualization, and technical publications built with Quarto that blend executable code with narrative prose. The material covers a wide range of analytical capabilities, including ingesting data from diverse sources, joining relational data, and managing categorical variables. It also addresses data cleaning, mathematical modeling, and producing professional multi-format reports and presentations. The curriculum emphasizes the practical application of functional programming and tidy data principles to create transparent, reproducible analyses.
Demonstrates how to apply a consistent set of actions across data collections using functional programming.
Toolz is a Python library that implements functional programming utilities for iterable transformation, dictionary manipulation, function composition, and lazy evaluation. It provides a set of pure functions designed to work with Python's built-in data structures, enabling concise and composable data processing workflows. What distinguishes toolz is its support for curried partial application, allowing functions to be incrementally applied and reused. It includes dictionary-centric operations that handle nested structures, and offers iterable chain transformers that combine mapping, filtering
Processes sequences on-demand using generators for memory-efficient handling of large data streams.
Slonik is a type-safe PostgreSQL client for Node.js that uses tagged template literals to ensure parameters are bound and protected against injection attacks. It provides a framework for connecting applications to PostgreSQL with automatic type checking for queries and database schemas. The project distinguishes itself through a specialized SQL query linter that detects invalid columns and type mismatches by verifying code against a live database schema during the development process. It also includes a high-performance binary bulk data inserter for loading large datasets using native binary
Provides memory-efficient processing of large database result sets using async iterable streams.
Ignite is a high-level training framework for PyTorch neural networks that serves as a training engine and deep learning lifecycle manager. It provides a structured system for organizing and automating training and evaluation loops, managing data iterators and triggering event handlers at specific milestones during the model training process. The project distinguishes itself through a comprehensive suite of tools for distributed training and model evaluation. It includes utilities for synchronizing gradients and coordinating collective communication across multiple GPUs or nodes, as well as a
Controls finite or infinite data streams by determining epoch lengths or restarting exhausted iterators.
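Fixed epoch lengths combined with restarting an exhausted iterator can be sketched as follows. This is not Ignite's engine: `run_epochs` is a hypothetical helper, and it assumes `data` is a non-empty, re-iterable sequence.

```python
def run_epochs(data, epoch_length, max_epochs):
    """Group items into epochs of fixed length, restarting the iterator when it runs out."""
    if len(data) == 0:
        raise ValueError("data must be non-empty, otherwise restarting would loop forever")
    iterator = iter(data)
    epochs = []
    for _ in range(max_epochs):
        batches = []
        while len(batches) < epoch_length:
            try:
                batches.append(next(iterator))
            except StopIteration:
                # The source is exhausted mid-epoch: start over from the beginning.
                iterator = iter(data)
        epochs.append(batches)
    return epochs


epochs = run_epochs([1, 2, 3], epoch_length=2, max_epochs=3)
```

Decoupling the epoch length from the size of the data lets the same loop drive both finite collections and streams with no natural end.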
This is a typed server-side library and payment gateway SDK for integrating Stripe into Node.js applications. It provides a typed client for managing payments, customers, and subscriptions, along with specialized tools for executing secure financial transactions and managing billing resources. The library features an idempotent API client that prevents duplicate operations using idempotency keys and exponential backoff retry logic. It includes a webhook signature verifier to confirm that incoming HTTPS event notifications are authentic, and an async-iterator-based pagination wrapper for traversing large datasets. The project covers a wide range of capabilities, including subscription billing management, payment platform orchestration for connected accounts, and resource search. It provides comprehensive response handling through object expansion and field selection, alongside security features for API request authentication and webhook verification. The library is written in TypeScript.
Uses JavaScript async iterators to stream paginated data from the API without buffering the entire payload.
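The pattern of streaming paginated results through an async iterator can be sketched with a Python async generator. Everything here is hypothetical: `fetch_page`, the `next_cursor` field, and the in-memory `FAKE_PAGES` stand in for a real HTTP API and are not the Stripe client's interface.

```python
import asyncio

# Stand-in for a paginated API, keyed by cursor (assumed response shape).
FAKE_PAGES = {
    None: {"data": [1, 2], "next_cursor": "p2"},
    "p2": {"data": [3, 4], "next_cursor": "p3"},
    "p3": {"data": [5], "next_cursor": None},
}
fetched = []  # records which pages were actually requested


async def fetch_page(cursor):
    fetched.append(cursor)
    await asyncio.sleep(0)  # placeholder for network latency
    return FAKE_PAGES[cursor]


async def iter_items():
    """Yield items one by one, fetching the next page only when it is needed."""
    cursor = None
    while True:
        page = await fetch_page(cursor)
        for item in page["data"]:
            yield item
        cursor = page["next_cursor"]
        if cursor is None:
            break


async def collect(limit):
    items = []
    async for item in iter_items():
        items.append(item)
        if len(items) == limit:
            break
    return items


first_three = asyncio.run(collect(3))
```

Stopping after three items means the third page is never requested, which is the practical benefit of iterating lazily instead of buffering the full result set.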
xtensor is a C++ multidimensional array library for numerical computing that provides N-dimensional containers with an interface mirroring the NumPy API. It utilizes a lazy evaluation expression engine to defer numerical computations until assignment, which minimizes memory allocations and intermediate copies. The library features a foreign memory array adaptor that allows it to wrap external buffers, such as NumPy arrays, to perform numerical operations in-place without duplicating data. It further optimizes performance through lazy broadcasting and a system that manages the lifetime of temp
Provides memory-efficient, STL-compatible forward and reverse iterators to process tensor data.
cuda-python provides low-level Python bindings for the CUDA Driver and Runtime APIs. It serves as a programmatic wrapper for controlling device memory, managing hardware toolchains, and orchestrating execution graphs on NVIDIA GPUs, allowing for the compilation and launching of parallel kernels directly from Python. The project enables the development of SIMT kernels and the execution of mathematical algorithms on device memory. It integrates pre-compiled bytecode as custom operators and interfaces with accelerated device libraries to access low-level hardware functions without leaving the la
Uses iterators to compute sequence elements on demand, minimizing the allocation of large intermediate arrays.