4 repositories
Using tabular data structures to perform numerical transformations and filtering for insights.
Distinct from Distributed Dataframe Analysis: this category covers general data analysis using DataFrames, whereas Distributed Dataframe Analysis focuses specifically on Spark/cluster environments.
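As a rough illustration of what "numerical transformations and filtering" on tabular data means, here is a minimal sketch that uses only the Python standard library as a stand-in for a DataFrame library. The sample data, the column names, and the helpers `with_column` and `filter_rows` are hypothetical and do not come from any of the listed repositories.

```python
# Hypothetical sample data: a column-oriented table (column name -> list of values).
table = {
    "repo": ["a", "b", "c", "d"],
    "stars": [1200, 85, 430, 9600],
    "forks": [150, 10, 40, 800],
}


def row_at(table, i):
    """Build a dict view of row i from a column-oriented table."""
    return {col: values[i] for col, values in table.items()}


def row_count(table):
    """Number of rows, taken from the first column."""
    return len(next(iter(table.values())))


def with_column(table, name, func):
    """Return a new table with an extra column computed row by row."""
    new_table = {col: list(values) for col, values in table.items()}
    new_table[name] = [func(row_at(table, i)) for i in range(row_count(table))]
    return new_table


def filter_rows(table, predicate):
    """Return a new table keeping only the rows where predicate(row) is true."""
    keep = [i for i in range(row_count(table)) if predicate(row_at(table, i))]
    return {col: [values[i] for i in keep] for col, values in table.items()}


# Numerical transformation: derive a forks-per-star ratio as a new column.
enriched = with_column(table, "fork_ratio", lambda r: r["forks"] / r["stars"])

# Filtering: keep only rows with at least 400 stars.
popular = filter_rows(enriched, lambda r: r["stars"] >= 400)

print(popular["repo"])
```

Both helpers return new tables rather than mutating their input, which loosely mirrors how DataFrame libraries typically chain transformations. A real DataFrame library would perform the same steps with vectorized column operations instead of per-row Python calls.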
Perspective is a columnar data analytics engine and high-performance visualization component powered by WebAssembly. It provides a system for analyzing and visualizing large or streaming datasets through interactive data grids and charts, utilizing a compiled binary to achieve near-native performance within the browser. The project distinguishes itself through a WebSocket-based data streaming interface and deep Apache Arrow integration, which minimize memory overhead when synchronizing tables between servers and clients. It acts as a remote query proxy capable of translating visualization con
Exposes in-memory Polars DataFrames to browser clients over a WebSocket connection for remote analysis.
Pixie is an open-source observability platform for Kubernetes that uses eBPF to automatically capture telemetry data from clusters without requiring any manual instrumentation or code changes. It functions as an eBPF telemetry collector, a continuous application profiler, a network traffic analyzer, and a scriptable telemetry query engine, all within a single Kubernetes-native tool. The platform distinguishes itself through several integrated capabilities. It continuously samples stack traces from compiled-language code to identify CPU performance bottlenecks, visualizing the results as inter
Transforms tabular telemetry data through immutable dataframe operations for observability analysis.
This project is a Python data analysis library and exploratory data analysis framework designed to process raw datasets. It provides a suite of tools for examining data, identifying anomalies, and applying statistical methods to uncover patterns. The repository functions as a machine learning modeling toolkit and a statistical data modeling suite. It includes predictive algorithms and mathematical models used to analyze relationships between data variables and draw insights from complex datasets. The project covers a broad range of capabilities, including data science, machine learning modeling, and exploratory data analysis. These are implemented through data manipulation, numerical computation, and data visualization.
Performs numerical transformations and filtering on tabular data structures to derive insights.
This is a comprehensive Python programming course and technical curriculum designed to take users from foundational syntax to advanced development patterns. It serves as a multi-disciplinary educational suite covering programming fundamentals, object-oriented design, and data analysis. The project provides specialized guides on professional development techniques, including the use of decorators, generators for memory management, and dunder-method operator overloading. It also includes instructional material on executing parallel tasks through concurrency and multiprocessing to reduce executi
Provides a suite for loading structured datasets and performing numerical transformations using DataFrames.