4 repositories
Data streaming architectures that feed column-aware data directly into GPU buffers to avoid CPU-side exports.
Distinct from general data streaming: focuses specifically on the GPU-direct path for optimizing ML training pipelines.
Explore 4 GitHub repositories matching Data & Databases · GPU-Accelerated Data Streams.
Rerun is a multimodal data visualizer and robotics data logger designed for rendering synchronized streams of 3D spatial data, images, and time-series metrics. It functions as a tool for capturing high-frequency sensor data and AI outputs into a queryable columnar format, providing a dedicated interface for viewing MCAP recording files and analyzing physical environments. The project distinguishes itself as a machine learning dataset streamer, capable of feeding logged recordings directly into GPU buffers and PyTorch training pipelines without intermediate exports. It supports a high-performa
Feeds column-aware and video-codec-aware data from recordings directly to GPUs to eliminate intermediate export steps.
Hub is a multimodal AI data lake and vector database designed for storing and querying embeddings, text, audio, and images. It functions as a dataset version control system and a machine learning data streaming engine to support large-scale model training. The system utilizes a serverless PostgreSQL vector store to index high-dimensional embeddings for semantic search. It provides a visual interface for inspecting multimodal datasets and viewing annotations such as bounding boxes and masks. The platform handles cloud-agnostic storage synchronization and implements lazy, compressed data strea
Streams compressed data arrays directly from cloud storage into deep learning frameworks to optimize training.
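As a rough illustration of the lazy, compressed streaming idea described above, the following is a minimal sketch using only the Python standard library. It is not Hub's API: the function names `compress_column` and `stream_batches`, the float32 chunk layout, and the chunk and batch sizes are all assumptions made for the example. In a real training pipeline each yielded batch would then be handed to the deep learning framework and copied to the device.

```python
import struct
import zlib
from typing import Iterator, List, Sequence


def compress_column(values: Sequence[float], chunk_size: int) -> List[bytes]:
    """Split one column into fixed-size chunks and compress each one independently.

    Hypothetical helper: stands in for chunks stored as objects in cloud storage.
    """
    chunks = []
    for start in range(0, len(values), chunk_size):
        part = values[start:start + chunk_size]
        raw = struct.pack(f"<{len(part)}f", *part)  # little-endian float32
        chunks.append(zlib.compress(raw))
    return chunks


def stream_batches(chunks: List[bytes], batch_size: int) -> Iterator[List[float]]:
    """Decompress chunks one at a time and yield batches of at most batch_size values.

    Only one chunk is decompressed at a time, so the full column is never
    materialized in memory in uncompressed form.
    """
    pending: List[float] = []
    for blob in chunks:
        raw = zlib.decompress(blob)
        pending.extend(struct.unpack(f"<{len(raw) // 4}f", raw))
        while len(pending) >= batch_size:
            yield pending[:batch_size]
            pending = pending[batch_size:]
    if pending:
        yield pending  # final partial batch


column = [float(i) for i in range(10)]
chunks = compress_column(column, chunk_size=4)
batches = list(stream_batches(chunks, batch_size=3))
```

Because `stream_batches` is a generator, decompression happens only as the consumer asks for the next batch, which is the property that lets a training loop start before the whole dataset has been fetched.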
jetson-inference is a set of libraries and tools for executing optimized deep learning models on embedded GPU hardware. Its primary purpose is to enable real-time computer vision and AI inference at the edge with low latency and high throughput. The project distinguishes itself through high-performance streaming analytics and the ability to execute concurrent AI pipelines on auto-grade silicon. It provides specialized support for multi-sensor stream processing, utilizing zero-copy data transport to load camera frames directly into GPU memory. The codebase covers a broad surface of capabiliti
Implements high-throughput, low-latency data streaming to share GPU data between systems.
Filters, processes, and classifies large volumes of streaming cybersecurity data using a GPU-accelerated AI framework.