1 repository
Execution of mathematical operations and algorithms directly on GPU device memory.
Distinct from Pipeline GPU Execution: Existing candidates focus on synchronization or pipelines, not the execution of mathematical algorithms on device memory.
Explore 1 awesome GitHub repository matching scientific & mathematical computing · Device-Side Algorithm Execution. Refine with filters or upvote what's useful.
cuda-python provides low-level Python bindings for the CUDA Driver and Runtime APIs. It serves as a programmatic wrapper for controlling device memory, managing hardware toolchains, and orchestrating execution graphs on NVIDIA GPUs, allowing for the compilation and launching of parallel kernels directly from Python. The project enables the development of SIMT kernels and the execution of mathematical algorithms on device memory. It integrates pre-compiled bytecode as custom operators and interfaces with accelerated device libraries to access low-level hardware functions without leaving the la
Executes fast mathematical operations such as sorting and transforming directly on device memory.