1 repo
High-performance engines optimized for transformer model inference.
Nanochat is a lightweight execution environment for training and running language models on standard consumer hardware. It serves as both a neural network training framework and an inference engine, so users can run backpropagation-based training and model execution directly on general-purpose processors, with no dedicated graphics hardware required. The project distinguishes itself through a suite of optimization tools that prioritize efficiency on local machines. By using memory-mapped weight loading and CPU-optimized vector math, it maximizes throughput for interactive language model sessions.
Maximizes throughput for interactive language model sessions using memory-mapped loading and CPU-optimized math.
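To illustrate the two techniques named above, here is a minimal Python sketch of memory-mapped weight loading followed by a vectorized matrix-vector product. It is not taken from Nanochat: the flat float32 file layout and the helper names `save_weights`, `load_weights_mmap`, and `linear_forward` are assumptions made for illustration, and NumPy stands in for whatever vector math the project actually uses.

```python
import os
import tempfile

import numpy as np


def save_weights(path, weights):
    """Write a matrix to disk as raw float32 bytes (hypothetical flat format)."""
    np.asarray(weights, dtype=np.float32).tofile(path)


def load_weights_mmap(path, shape):
    """Map the weight file read-only; the OS pages data in lazily on access."""
    return np.memmap(path, dtype=np.float32, mode="r", shape=shape)


def linear_forward(weights, x):
    """Vectorized matrix-vector product, with no Python-level loops."""
    return weights @ x


def demo():
    rng = np.random.default_rng(0)
    w = rng.standard_normal((4, 8)).astype(np.float32)
    x = rng.standard_normal(8).astype(np.float32)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "weights.bin")
        save_weights(path, w)
        mapped = load_weights_mmap(path, w.shape)
        # Copy the result so it no longer depends on the mapped file.
        out = np.array(linear_forward(mapped, x))
        del mapped  # release the mapping before the directory is removed
    return w, x, out
```

Because the file is mapped rather than read, startup cost stays small even for large weight files, and only the pages actually touched during a forward pass are brought into memory.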