This project is a C++ and CUDA neural network library designed for fast training and inference of small networks on NVIDIA GPUs. It serves as a specialized backend for neural radiance fields and other coordinate-based networks, providing a library of fused GPU kernels and a hash grid encoder that maps low-dimensional input coordinates to higher-dimensional feature vectors.
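To make the idea of a hash grid encoder concrete, here is a minimal CPU sketch in plain C++ for 2D inputs. It is not the library's implementation or API: `HashGridConfig`, `hash2d`, and `hash_grid_encode` are hypothetical names, the parameter values are arbitrary, and the sketch hashes at every level, which is a simplification. It assumes the commonly used XOR-of-primes spatial hash and bilinear interpolation of the features stored at the four surrounding grid vertices.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical configuration for this sketch (values are illustrative only).
struct HashGridConfig {
    int n_levels = 4;                     // number of resolution levels
    int features_per_level = 2;           // feature values stored per table entry
    std::uint32_t table_size = 1u << 10;  // entries per level
    float base_resolution = 4.0f;         // grid resolution of the coarsest level
    float per_level_scale = 2.0f;         // resolution growth factor between levels
};

// XOR-of-primes spatial hash over integer 2D grid coordinates.
inline std::uint32_t hash2d(std::uint32_t x, std::uint32_t y, std::uint32_t table_size) {
    return ((x * 1u) ^ (y * 2654435761u)) % table_size;
}

// Encodes a point (x, y) in [0, 1]^2 into n_levels * features_per_level values.
// `table` holds all levels back to back: [level][entry][feature].
inline std::vector<float> hash_grid_encode(const HashGridConfig& cfg,
                                           const std::vector<float>& table,
                                           float x, float y) {
    const std::size_t F = static_cast<std::size_t>(cfg.features_per_level);
    const std::size_t L = static_cast<std::size_t>(cfg.n_levels);
    assert(table.size() == L * cfg.table_size * F);
    assert(x >= 0.0f && x <= 1.0f && y >= 0.0f && y <= 1.0f);

    std::vector<float> out(L * F, 0.0f);
    float res = cfg.base_resolution;
    for (std::size_t level = 0; level < L; ++level, res *= cfg.per_level_scale) {
        // Scale the point to this level's grid and split into cell index + offset.
        const float px = x * res, py = y * res;
        const float fx = std::floor(px), fy = std::floor(py);
        const std::uint32_t x0 = static_cast<std::uint32_t>(fx);
        const std::uint32_t y0 = static_cast<std::uint32_t>(fy);
        const float wx = px - fx, wy = py - fy;

        const float* level_table = table.data() + level * cfg.table_size * F;
        // Bilinear interpolation over the four corners of the enclosing cell.
        for (std::uint32_t corner = 0; corner < 4; ++corner) {
            const std::uint32_t dx = corner & 1u, dy = corner >> 1;
            const float w = (dx ? wx : 1.0f - wx) * (dy ? wy : 1.0f - wy);
            const std::uint32_t idx = hash2d(x0 + dx, y0 + dy, cfg.table_size);
            for (std::size_t f = 0; f < F; ++f) {
                out[level * F + f] += w * level_table[idx * F + f];
            }
        }
    }
    return out;
}
```

With the default configuration above, a 2D point becomes an 8-value feature vector (4 levels times 2 features). In training, the table entries would be learned parameters; here they are just an input array.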
The library distinguishes itself through C++ template metaprogramming and fused-kernel execution: the layers of a network are merged into a single GPU kernel, so intermediate activations stay on-chip and global-memory traffic, the usual bottleneck for small networks, is greatly reduced. It uses tensor-core-accelerated GEMM for high-throughput linear algebra and implements multiresolution hash encoding and spherical harmonic encoding to capture fine spatial and angular detail.
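The combination of templates and fusion can be illustrated with a small CPU analogy in plain C++. This is a sketch under stated assumptions, not the library's code: `DenseLayer`, `fused_forward`, and `run_example` are hypothetical names. Layer widths are template parameters, so every loop bound is known at compile time, and both layers are evaluated inside one function with the hidden activations held in a local array. On a GPU, the corresponding structure would keep those activations in registers or shared memory within one kernel instead of writing them out between layers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One dense layer (no bias) whose sizes are compile-time constants.
template <std::size_t In, std::size_t Out>
struct DenseLayer {
    std::array<float, In * Out> weights{};  // row-major: weights[o * In + i]

    // Computes y = W x, optionally followed by ReLU.
    std::array<float, Out> forward(const std::array<float, In>& x, bool relu) const {
        std::array<float, Out> y{};
        for (std::size_t o = 0; o < Out; ++o) {
            float acc = 0.0f;
            for (std::size_t i = 0; i < In; ++i) {
                acc += weights[o * In + i] * x[i];
            }
            y[o] = (relu && acc < 0.0f) ? 0.0f : acc;
        }
        return y;
    }
};

// "Fused" two-layer forward pass: the hidden activations exist only as a
// local array inside this one function and are never stored elsewhere.
template <std::size_t In, std::size_t Hidden, std::size_t Out>
std::array<float, Out> fused_forward(const DenseLayer<In, Hidden>& l1,
                                     const DenseLayer<Hidden, Out>& l2,
                                     const std::array<float, In>& x) {
    const std::array<float, Hidden> hidden = l1.forward(x, /*relu=*/true);
    return l2.forward(hidden, /*relu=*/false);
}

// Usage example: a 2 -> 3 -> 1 network with hand-picked weights.
inline float run_example() {
    DenseLayer<2, 3> l1;
    l1.weights = {1.0f, 0.0f,    // hidden[0] = x[0]
                  0.0f, 1.0f,    // hidden[1] = x[1]
                  -1.0f, -1.0f}; // hidden[2] = relu(-x[0] - x[1])
    DenseLayer<3, 1> l2;
    l2.weights = {1.0f, 1.0f, 1.0f};  // output = sum of hidden units

    const std::array<float, 2> x = {2.0f, 3.0f};
    const std::array<float, 1> y = fused_forward(l1, l2, x);
    assert(y[0] == 5.0f);  // hidden = [2, 3, 0], so the sum is 5
    return y[0];
}
```

Because the sizes are part of the types, a mismatch between the output width of one layer and the input width of the next fails at compile time rather than at run time.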
Supported applications include 3D scene reconstruction, learned signed distance functions, and radiance caching for path tracing. The library includes training components for weight optimization and loss computation, as well as utilities for environment lighting approximation and material decomposition.
The low-level CUDA implementations, including the fast multilayer perceptrons, are exposed to Python through a PyTorch C++ extension.