3 Repos
Performing large-scale data manipulation and analysis tasks on GPU hardware for increased processing speed.
Distinct from GPU Acceleration: The candidates focus on process analysis, communication, or streaming, not general dataframe-style analysis.
Explore 3 awesome GitHub repositories matching data & databases · GPU-Accelerated Data Analysis.
cuDF is a GPU-accelerated dataframe library and data processing engine designed for manipulating and analyzing large tabular datasets. It provides a high-level API for executing filtering, joining, and aggregating operations directly on GPU hardware. The project integrates the Apache Arrow memory format to enable zero-copy data transfers and includes a just-in-time compiler for executing custom user-defined functions on the GPU. The library features specialized acceleration for existing workflows by redirecting standard Pandas dataframe calls and Polars query plans to a GPU backend. It also p
Provides a high-level API for executing large-scale tabular filtering, joining, and aggregation directly on GPU hardware.
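As a rough illustration of the filter, join, and aggregate workflow described above, here is a minimal sketch written against the standard pandas API. The table contents and column names are hypothetical. The code itself runs on the CPU with plain pandas; with cuDF installed, the same unmodified script can be launched through the `cudf.pandas` accelerator (`python -m cudf.pandas script.py`) so that supported calls are redirected to the GPU.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
orders = pd.DataFrame({
    "customer_id": [1, 2, 1, 3, 2],
    "amount": [20.0, 5.0, 35.0, 50.0, 12.5],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "region": ["north", "south", "north"],
})

# Filter: keep only orders of at least 10.0.
large = orders[orders["amount"] >= 10.0]

# Join: attach each customer's region to the remaining orders.
joined = large.merge(customers, on="customer_id", how="inner")

# Aggregate: total order amount per region.
totals = joined.groupby("region")["amount"].sum().sort_index()
print(totals)
```

On data this small there is nothing to gain from a GPU; the point is only that the dataframe code does not change when the backend does.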
jetson-inference is a set of libraries and tools for executing optimized deep learning models on embedded GPU hardware. Its primary purpose is to enable real-time computer vision and AI inference at the edge with low latency and high throughput. The project distinguishes itself through high-performance streaming analytics and the ability to execute concurrent AI pipelines on auto-grade silicon. It provides specialized support for multi-sensor stream processing, utilizing zero-copy data transport to load camera frames directly into GPU memory. The codebase covers a broad surface of capabiliti
Accelerates large-scale data science workloads using GPU-to-GPU communication and shuffle operations.
Stumpy is a Python library for scalable time series analysis that focuses on implementing matrix profile algorithms. It provides a framework for computing distance profiles to identify recurring patterns and anomalies within time series data. The project stands out for its ability to scale compute-intensive tasks across GPU hardware and distributed clusters via Dask. It supports multidimensional analysis for discovering motifs across concurrent data streams and offers incremental computation for real-time streaming analysis. The library covers a broad range of time series mining techniques, including motif discovery, anomaly detection, and sequence pattern matching. It also provides tools for semantic segmentation to detect regime changes and for extracting time-ordered chains of similar subsequence patterns.
Offloads complex matrix calculations to GPU hardware to significantly reduce processing time for large datasets.
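To make the matrix profile idea concrete, the sketch below computes one by brute force in pure Python: for every subsequence of length `m`, it records the z-normalized Euclidean distance to its nearest non-overlapping neighbour. This is not Stumpy's implementation or API. The function names, the sample series, and the exclusion zone of `ceil(m / 4)` are assumptions chosen for illustration, and the quadratic loop is exactly the kind of work such libraries offload to faster algorithms and GPU hardware.

```python
import math

def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]

def naive_matrix_profile(series, m):
    """Brute-force matrix profile: O(n^2) window comparisons."""
    n_sub = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(n_sub)]
    excl = math.ceil(m / 4)  # assumed exclusion zone to skip trivial matches
    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) <= excl:
                continue  # ignore the window itself and its close neighbours
            d = math.dist(windows[i], windows[j])
            if d < profile[i]:
                profile[i] = d
                index[i] = j
    return profile, index

# Hypothetical series: the shape [0, 1, 3, 1, 0] occurs at positions 0 and 8.
series = [0, 1, 3, 1, 0, 5, 2, 7, 0, 1, 3, 1, 0, 4, 9, 2]
profile, index = naive_matrix_profile(series, 5)

# Low profile values mark repeated patterns (motifs); high values mark anomalies.
motif_start = min(range(len(profile)), key=profile.__getitem__)
print(motif_start, index[motif_start])
```

In this example the two occurrences of the repeated shape point at each other as nearest neighbours, which is how motif discovery falls out of the profile.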