4 repositories
Operations to fill GPU buffers with constant values or zero them out after allocation.
Distinct from GPU Buffer Allocators: focuses on populating buffers with constant data after allocation, not the allocation itself.
4 GitHub repositories matching data & databases · Buffer Initialization Operations.
This project is a learning guide and collection of study notes designed to teach Node.js backend development. It provides a comprehensive core API reference and practical demonstrations for implementing server-side logic, network programming, and system APIs. The guide specifically covers advanced technical domains including process management for scaling applications via clusters and child processes, as well as network programming for building TCP, UDP, and HTTP services. It also includes detailed instructional material on security implementation, focusing on cryptographic hashing and encryp
Provides a utility to populate a buffer with a repeating value for initialization.
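As a rough illustration of what populating a buffer with a repeating value can look like, here is a minimal sketch using standard typed arrays. It is not taken from the guide itself: `fillRepeating` is a hypothetical helper named for this example, and only the built-in `fill`, `set`, and `subarray` typed-array methods are assumed.

```typescript
// Hypothetical helper (an assumption for illustration, not the guide's code):
// fills `target` by repeating the bytes of `pattern` until the end is reached.
function fillRepeating(target: Uint8Array, pattern: Uint8Array): Uint8Array {
  if (pattern.length === 0) {
    throw new RangeError("pattern must contain at least one byte");
  }
  for (let offset = 0; offset < target.length; offset += pattern.length) {
    // The last copy may be partial when the sizes do not divide evenly.
    const count = Math.min(pattern.length, target.length - offset);
    target.set(pattern.subarray(0, count), offset);
  }
  return target;
}

// Single-value fills can use the built-in typed-array method directly.
const zeroed = new Uint8Array(8).fill(0);
const allOnes = new Uint8Array(4).fill(0xff);

// Multi-byte pattern, truncated at the end of the buffer.
const striped = fillRepeating(new Uint8Array(7), new Uint8Array([1, 2, 3]));
```

Copying the pattern in chunks rather than byte by byte keeps the loop short, and the explicit empty-pattern check avoids an infinite loop.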
gfx is a hardware-agnostic graphics API abstraction that translates a unified set of graphics and compute commands into native instructions for multiple GPU drivers. It provides a common interface for cross-platform rendering and general-purpose GPU compute programming. The project features an intermediate-representation shader translation system that converts source code and SPIR-V into target-specific languages. It uses a data-driven reference testing framework to verify that graphics output stays consistent across different hardware platforms. Capabilities include parallel command buffer encoding across multiple threads and encapsulation of pipeline state in single objects to minimize redundant state changes. The system manages low-level GPU resources, including memory allocation, asynchronous buffer mapping, and explicit frame presentation through swapchains. The implementation targets native environments and web browsers through WebAssembly, providing translation layers for WebGL and WebGL2.
Automatically clears GPU buffer memory upon allocation to ensure consistent state across different hardware platforms.
TileLang is a Python-embedded domain-specific language compiler that JIT-compiles and autotunes GPU kernels. It uses a tile-based DSL, automatic software pipelining, and parallel autotuning to generate optimized GPU kernels at runtime. It supports tensor core operations with Pythonic syntax, automatic memory management, and thread mapping. The compiler searches over tile sizes, thread counts, and scheduling policies, compiling and benchmarking candidates in parallel to find the fastest kernel. It also caches compiled binaries and tuning results to disk for reuse across sessions. TileLang inc
TVM's feature to fill every element of a buffer with a specified constant value, including a dedicated operation to zero it out.
FlashInfer is a library of high-performance GPU kernels purpose-built for accelerating large language model inference. It provides optimized implementations for attention operations (including flash attention, page attention, multi-head latent attention, and cascade attention) using paged key-value caches, fused kernel composition, and just-in-time compilation. The library also includes specialized kernels for mixture-of-experts layers, block-scaled low-precision quantization (FP8, FP4), and distributed collective communication. What distinguishes FlashInfer is its fused all-reduce communicat
Initializes device buffers for distributed synchronization protocols in multi-GPU inference.