17 repositories
Mechanisms for appending computed results as new columns to tabular data structures.
Distinct from Distributed Dataframes: Existing candidates focus on disk storage or distributed dataframes, not the specific act of adding columns to an in-memory pandas DataFrame.
Perspective is a columnar data analytics engine and high-performance visualization component powered by WebAssembly. It provides a system for analyzing and visualizing large or streaming datasets through interactive data grids and charts, utilizing a compiled binary to achieve near-native performance within the browser. The project distinguishes itself through a WebSocket-based data streaming interface and deep Apache Arrow integration, which minimize memory overhead when synchronizing tables between servers and clients. It acts as a remote query proxy capable of translating visualization con
Converts pandas or polars DataFrame objects into internal high-performance tables while preserving indexing.
FastUI is a server-driven UI system and Pydantic UI framework that transforms backend data models into functional web interfaces. It operates as a model-based frontend generator where the server controls the layout and behavior of the user interface through structured data schemas, enabling a low-code approach to web development. The project allows for the definition of visual hierarchies and component properties on the backend, using a JSON-based protocol to communicate UI structure between the server and client. It utilizes schema-driven generation to automate the creation of interfaces, in
Displays tabular data from models with configurable columns, interactive links, and formatted fields.
Jeesite is a full-stack low-code development framework designed for building enterprise administrative portals using Spring Boot, MyBatis, and Vue. It functions as a comprehensive platform for creating administrative dashboards with integrated role-based access control and organizational data permission systems. The framework distinguishes itself through a combination of automated CRUD code generation and an integrated RAG platform that connects large language models to enterprise data via vector stores. It further incorporates a BPMN-based workflow engine to automate complex business process
Provides interactive data tables featuring sorting, pagination, and frozen columns for efficient administrative data management.
Mesop is a stateful, declarative Python web UI framework and component library designed for building interactive web applications and AI demos. It allows for the construction of data-driven interfaces and chat systems using only Python, removing the need to write separate HTML or CSS. The framework is specifically tailored for AI application development, offering dedicated tools for conversational UI design and the creation of dashboards for large language model applications. It distinguishes itself with a visual UI editor for real-time property adjustments and the ability to embed custom Jav
Renders data frames as interactive tables with sticky headers, columns, and clickable cells.
Pinot is a distributed, columnar analytical database designed for high-concurrency, low-latency query processing. It functions as a real-time OLAP datastore, enabling interactive, user-facing analytics by ingesting and querying massive datasets from both streaming and batch sources. The system architecture relies on a centralized controller for cluster coordination and a distributed segment-based storage model to ensure horizontal scalability. The platform distinguishes itself through a hybrid ingestion pipeline that unifies real-time event streams and historical batch data into a single quer
Converts Spark DataFrames into offline segment files and writes them to a specified filesystem path for ingestion.
dtale is a web-based interactive grid and visualizer for pandas dataframes, designed as an exploratory data analysis tool. It provides a browser-based interface for analyzing tabular data structures, allowing users to calculate statistics, detect outliers, and compute correlations without writing manual code. The project functions as an embedded data viewer that can be integrated into web applications via iframes or custom routes, with specific support for Django, Flask, and Streamlit. It enables the exploration of datasets through a combination of an interactive data grid and a data visualiz
Connects to high-performance ArcticDB datastores to load and filter large-scale dataframes.
This is a pandas-based technical analysis library and financial feature engineering tool. It serves as a vectorized indicator calculator that transforms raw price and volume data into derived metrics for time series analysis. The library uses a NumPy-based engine to perform mathematical operations across entire arrays, avoiding iterative loops to maintain high performance. It organizes technical indicators into a modular class hierarchy with a consistent interface, allowing for bulk feature generation and the direct appending of results as new columns to a pandas DataFrame. The system covers
Appends computed indicator results as new columns to a pandas DataFrame to maintain time series alignment.
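The column-appending pattern described above can be sketched with plain pandas. This is a minimal illustration, not the library's own API: the `add_sma` helper, the `close` column name, and the sample prices are all hypothetical. Because the computed Series shares the DataFrame's index, the new column stays aligned with the original timestamps.

```python
import pandas as pd

# Hypothetical daily closing prices; names and values are illustrative only.
prices = pd.DataFrame(
    {"close": [10.0, 11.0, 12.0, 13.0, 14.0]},
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)


def add_sma(df: pd.DataFrame, column: str = "close", window: int = 3) -> pd.DataFrame:
    """Return a copy of df with a simple moving average appended as a new column."""
    out = df.copy()
    # rolling(...).mean() is vectorized and returns a Series on the same index,
    # so assigning it as a column keeps every value on its original row.
    out[f"sma_{window}"] = out[column].rolling(window=window).mean()
    return out


result = add_sma(prices)
print(result)
```

The first `window - 1` rows of the new column are NaN because a full window is not yet available. Returning a copy leaves the input DataFrame unchanged, which is a design choice of this sketch rather than something the source states.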
Mimesis is a Python synthetic data generator used to create realistic fake datasets and mock data for software testing and development. It functions as a schema-based dataset generator capable of producing structured records and relational datasets, while also serving as a production data anonymizer to replace sensitive information with synthetic values. The library distinguishes itself through comprehensive multilingual support, allowing for the generation of locale-specific information to simulate regional user profiles. It ensures reproducibility through deterministic data generation using
Generates synthetic columns for use in tabular data structures like pandas DataFrames.
statsforecast is a high-performance statistical time series forecasting library designed to generate point forecasts and prediction intervals. It functions as a distributed time series framework that utilizes a C-based forecasting engine and an automated model selector to identify and fit the optimal statistical model for every unique series in a dataset. The system also includes a time series anomaly detector to identify unusual data points by comparing observed values against probabilistic forecast intervals. The project is distinguished by its ability to handle massive-scale parallel forec
Integrates with Polars data structures to accelerate memory management and processing during forecasting.
Plotnine is a data visualization library for Python based on the Grammar of Graphics. It functions as a declarative statistical plotting framework and multi-panel plotting engine, allowing users to build complex charts by mapping data variables to visual properties such as position, color, and size. The project is distinguished by its layered composition model and a statistical transformation engine that performs aggregations and calculations before rendering visuals. It includes a comprehensive system for multi-panel faceting, which splits a single visualization into a grid of subplots based on categorical variables. The library covers a wide range of capabilities, including diverse geometric representations for distribution, area, and scatter plots, and geospatial visualization for rendering geographic boundaries. It provides extensive tools for scale mapping, coordinate projection, and theme-based styling to separate data-driven elements from non-data aesthetic properties. The framework uses a Matplotlib backend for rendering and integrates with tabular dataframes through piping operations.
Integrates tabular dataframes via piping operations, converting external pandas or polars objects into internal plotting formats.
aws-sdk-pandas is a Python library that integrates pandas DataFrames with AWS services, functioning as a cloud data ETL tool and data lake connector. It provides a unified interface for moving and transforming data between in-memory DataFrames and cloud storage, databases, and data warehouses. The project distinguishes itself as a distributed compute orchestrator capable of submitting pandas-based workflows to EMR clusters and serverless processing environments to handle datasets that exceed a single machine's memory. It further specializes in coordinating distributed data processing through Ray cluster initialization for the same purpose. The library covers a wide range of capabilities, including object storage management for S3, SQL query execution for Athena and Redshift, and integration with NoSQL, graph, and time-series databases. It also includes utilities for metadata management through the Glue catalog, OpenSearch data indexing, and managing business intelligence assets in QuickSight. Additional functionality includes retrieving secrets, analyzing CloudWatch logs, and managing data quality rulesets.
Wraps multiple cloud service APIs to convert remote query results directly into Pandas dataframes.
dcat-admin is a Laravel admin panel framework used to rapidly build data-driven administration interfaces. It functions as a CRUD generator and backend scaffolding tool that automatically produces create, read, update, and delete interfaces based on database table schemas. The system distinguishes itself through a plugin-based extension architecture and the ability to run multiple independent administrative instances within a single installation. It provides specialized tools for mapping external APIs to forms and tables, as well as an event-driven form lifecycle for executing custom logic du
Renders database records in an expandable tree structure with lazy-loading for child nodes.
Vizro is a low-code Python framework for building production-ready data visualization applications. It functions as a UI orchestrator that allows users to define multi-page analytical dashboards through structured configurations in Python, YAML, or JSON, reducing the need for extensive frontend engineering. The project distinguishes itself through generative AI integration, utilizing a model context protocol server to translate natural language descriptions into validated dashboard configurations, charts, and layouts. It also features a decoupled data cataloging system that separates data sou
Displays dataframes in interactive tables with pre-configured sorting and pagination.
This project is a Python library that wraps official NBA endpoints to retrieve player, team, and game statistics as structured data. It serves as a programmatic interface for fetching professional basketball league records and real-time scoreboards via HTTP requests. The library integrates with Pandas to transform raw JSON responses from sports servers into DataFrames for statistical analysis and data science. It functions as a data retrieval utility for tracking league-wide performance trends and scouting professional basketball players. The tool covers a broad range of capabilities includi
Transforms raw JSON responses from sports servers into Pandas DataFrames for statistical analysis and data science.
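The JSON-to-DataFrame step can be sketched as follows. The payload shape, with a `headers` list and a `rows` list of lists, is an assumption made for illustration and is not taken from the library or from any real endpoint; the player names and numbers are invented.

```python
import json

import pandas as pd

# Hypothetical response body; the "headers"/"rows" layout is assumed, not documented.
raw = '{"headers": ["player", "points"], "rows": [["A", 30], ["B", 25]]}'

payload = json.loads(raw)

# Each inner list becomes one row; the headers list supplies the column names.
df = pd.DataFrame(payload["rows"], columns=payload["headers"])
print(df)
```

Once the data is in a DataFrame, the usual pandas operations apply, for example `df["points"].mean()`.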
This is a structured deep learning curriculum for programmers, delivered as a collection of Jupyter notebooks. It teaches the fundamentals of training neural networks for computer vision, natural language processing, tabular data analysis, and collaborative filtering using PyTorch and the fastai library. The course is designed to be hands-on, guiding learners from building a training loop from scratch to fine-tuning pretrained models for a variety of practical tasks. The curriculum distinguishes itself by covering the full lifecycle of a deep learning project, from data preparation and augmen
Reads column values from DataFrame rows as labels for supervised learning tasks.
This project is a collection of accessible, reusable interface components built for the Svelte framework. It functions as a comprehensive design system implementation, providing a standardized toolkit for constructing responsive and inclusive user interfaces that adhere to established design language and accessibility guidelines. The library distinguishes itself through a deep integration with the Svelte framework, utilizing compiler-based transformations to optimize component rendering and reactive state synchronization. It features a robust theme management system that applies visual styles
Renders structured datasets into sortable, interactive tables with defined headers and row identifiers.
React Base Table is a library of reusable interface components designed for building complex, responsive data grids within web applications. It provides a high-performance foundation for rendering large datasets by utilizing window-based row virtualization, which ensures the user interface remains responsive even when displaying extensive collections of data. The library distinguishes itself through flexible layout and navigation capabilities, including support for hierarchical data structures that can be rendered as expandable tree rows. It allows for precise control over table geometry thro
Organizes and renders nested data structures as expandable tree rows to allow exploration of parent-child relationships.