
Awesome GitHub Repositories: Inference Accelerators

Libraries and drivers that optimize machine learning model execution by utilizing specialized hardware.

Distinguishing note: Focuses on the hardware-level optimization of inference tasks rather than model training or general AI frameworks.

Explore 2 awesome GitHub repositories matching Artificial Intelligence & ML · Inference Accelerators.


ggml-org/whisper.cpp
46,843 · View on GitHub
Whisper.cpp is a high-performance, local-first speech recognition engine designed to run large-scale machine learning models on consumer hardware. It functions as a portable library that converts audio into text, supporting both static file transcription and real-time stream processing. By utilizing a lightweight inference engine and weight quantization, the project minimizes memory and compute overhead, allowing for efficient execution without reliance on external cloud APIs or internet connectivity. The project distinguishes itself through a hardware-agnostic compute abstraction that offloa
Optimizing machine learning model execution by offloading heavy mathematical computations to specialized graphics cards and neural processing units.
C++ · inference · openai · speech-recognition
karpathy/nanochat
43,699 · View on GitHub
Nanochat is a lightweight execution environment designed for training and running language models on standard consumer hardware. It functions as both a neural network training framework and an inference engine, enabling users to perform backpropagation-based training and model execution directly on general-purpose processors without the need for dedicated graphics hardware. The project distinguishes itself through a suite of optimization tools that prioritize efficiency on local machines. By utilizing memory-mapped weight loading and CPU-optimized vector math, it maximizes throughput for inte
Utilizes low-level processor instructions to perform high-speed matrix operations without dedicated graphics hardware.
Python