5 repositories
Libraries for constructing, filtering, and reshaping structured data.
Distinguishing note: Focuses on tabular data structures rather than general database management.
Five GitHub repositories in Data & Databases · Data Manipulation Frameworks.
Pandas is a high-performance data analysis library that provides a comprehensive framework for manipulating, cleaning, and transforming structured datasets. It centers on labeled one-dimensional and two-dimensional data structures, allowing users to construct, filter, and reshape tabular information while performing complex arithmetic and logical operations. The library distinguishes itself through a sophisticated indexing engine that enables automatic data alignment during calculations and relational merges. By utilizing a block-based memory layout, it optimizes cache locality for vectorized
Provides high-level structures for manipulating and transforming two-dimensional labeled datasets.
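The label-based alignment and relational merge described in this entry can be sketched briefly. The region names, figures, and manager names below are made up for illustration; only the pandas calls themselves (`Series.add`, `DataFrame.merge`) are real API.

```python
import pandas as pd

# Two labeled Series with overlapping but differently ordered labels (made-up data).
q1 = pd.Series({"north": 10, "south": 20, "east": 30})
q2 = pd.Series({"east": 1, "north": 2, "west": 3})

# Arithmetic aligns on labels, not positions. A label present on only
# one side yields NaN in the result.
total = q1 + q2

# fill_value treats a label missing from one side as 0 instead of NaN.
total_filled = q1.add(q2, fill_value=0)

# A relational merge: attach a (hypothetical) manager to each region.
regions = pd.DataFrame({"region": ["north", "south"], "manager": ["Ana", "Bo"]})
sales = total_filled.rename("units").rename_axis("region").reset_index()
merged = sales.merge(regions, on="region", how="left")
```

With `how="left"`, every region in `sales` is kept, and regions without a matching row in `regions` get a missing value in the `manager` column.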
Automa is a browser-based automation platform that enables users to build, schedule, and execute repetitive web tasks through a visual, no-code interface. By operating as a browser extension, it provides a canvas-based environment where users construct workflows by connecting functional blocks to interact with web elements, manage browser state, and process data. The platform distinguishes itself through its deep integration with the browser environment, allowing for complex orchestration such as event-driven triggers, cross-origin request handling, and the ability to package workflows as sta
Modifies and processes variables and tabular data to support dynamic automation logic.
Tushare is a financial data library for the Python programming environment that provides access to historical and real-time market information. It functions as a data interface for retrieving stock trading records, corporate financial statements, and macroeconomic indicators to support quantitative analysis and research. The library distinguishes itself by automatically transforming raw API responses into tabular data structures, allowing for direct integration with data analysis workflows. It manages access to these datasets through token-based authentication and utilizes schema-mapped parsi
Transforms API responses into tabular data structures for rapid analysis.
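A minimal sketch of what turning a raw API response into a tabular structure might look like. The payload shape (`fields` plus `items`), the column names, and the `to_frame` helper are assumptions made for illustration, not Tushare's documented response format or API.

```python
import pandas as pd

# Hypothetical response shape: column names plus row arrays. This is an
# assumption for illustration, not Tushare's documented wire format.
payload = {
    "fields": ["symbol", "trade_date", "close"],
    "items": [
        ["AAA", "20240102", 10.5],
        ["AAA", "20240103", 10.8],
        ["BBB", "20240102", 7.2],
    ],
}


def to_frame(payload):
    """Map a fields/items payload onto a DataFrame and parse the date column."""
    frame = pd.DataFrame(payload["items"], columns=payload["fields"])
    frame["trade_date"] = pd.to_datetime(frame["trade_date"], format="%Y%m%d")
    return frame


frame = to_frame(payload)

# Once the data is tabular, analysis steps apply directly.
mean_close = frame.groupby("symbol")["close"].mean()
```

Mapping field names to columns up front means later steps, such as the group-by above, never need to touch the raw response.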
This repository is a collection of structured coding challenges designed to build proficiency in data manipulation, cleaning, and transformation using the Python data analysis library. It functions as a hands-on tutorial for learning how to process and analyze tabular datasets through a series of practical, real-world exercises. The project utilizes interactive documents that combine live code cells with narrative text, allowing users to execute data manipulation logic in a persistent environment. The content is organized into modular, progressive units that increase in complexity, enabling u
Provides structured exercises for cleaning, filtering, and transforming tabular data using industry-standard frameworks.
DataFrame is a C++ tabular data library and manipulation engine designed for managing heterogeneous data in contiguous memory. It functions as a statistical analysis framework and time series analysis toolkit, providing the means to store, index, and transform multidimensional datasets. The project distinguishes itself through a high-performance execution model that utilizes column-major storage, SIMD-aligned memory allocation, and a thread-pool for parallel computations. It employs a visitor-based algorithm dispatch system and policy-driven transformations to decouple data processing logic f
Implements a comprehensive framework for slicing, joining, merging, grouping, sorting, and filtering structured tabular data.
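The idea of visitor-based dispatch over column-major storage can be illustrated with a toy sketch. It is written in Python rather than C++, and `ColumnTable` and `MeanVisitor` are hypothetical names invented here; nothing below is the DataFrame library's actual API.

```python
from typing import Dict, List


class MeanVisitor:
    """Accumulates a running mean; knows nothing about how the table stores data."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0

    def visit(self, value: float) -> None:
        self.count += 1
        self.total += value

    def result(self) -> float:
        return self.total / self.count if self.count else float("nan")


class ColumnTable:
    """Toy column-major table: each column is stored as one list of values."""

    def __init__(self, columns: Dict[str, List[float]]) -> None:
        self.columns = columns

    def visit(self, column: str, visitor):
        # The table only walks the column; the visitor decides what to compute.
        for value in self.columns[column]:
            visitor.visit(value)
        return visitor


table = ColumnTable({"price": [10.0, 12.0, 14.0], "volume": [100.0, 200.0, 300.0]})
mean_price = table.visit("price", MeanVisitor()).result()
mean_volume = table.visit("volume", MeanVisitor()).result()
```

Because the table exposes only a traversal, a new statistic can be added as another visitor class without changing the storage code.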