awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Inference Accelerators · Awesome GitHub Repositories

2 repos

Libraries and drivers that optimize machine learning model execution by utilizing specialized hardware.

Distinguishing note: Focuses on the hardware-level optimization of inference tasks rather than model training or general AI frameworks.

Explore 2 awesome GitHub repositories matching Artificial Intelligence & ML · Inference Accelerators.

Awesome Inference Accelerators GitHub Repositories

  • ggml-org/whisper.cpp

    46,843 · View on GitHub↗

    Whisper.cpp is a high-performance, local-first speech recognition engine designed to run large-scale machine learning models on consumer hardware. It functions as a portable library that converts audio into text, supporting both static file transcription and real-time stream processing. By utilizing a lightweight inference engine and weight quantization, the project minimizes memory and compute overhead, allowing for efficient execution without reliance on external cloud APIs or internet connectivity. The project distinguishes itself through a hardware-agnostic compute abstraction that offloa

    Optimizing machine learning model execution by offloading heavy mathematical computations to specialized graphics cards and neural processing units.

    C++ · inference · openai · speech-recognition
  • karpathy/nanochat

    43,699 · View on GitHub↗

    Nanochat is a lightweight execution environment designed for training and running language models on standard consumer hardware. It functions as both a neural network training framework and an inference engine, enabling users to perform backpropagation-based training and model execution directly on general-purpose processors without the need for dedicated graphics hardware. The project distinguishes itself through a suite of optimization tools that prioritize efficiency on local machines. By utilizing memory-mapped weight loading and CPU-optimized vector math, it maximizes throughput for inte

    Utilizes low-level processor instructions to perform high-speed matrix operations without dedicated graphics hardware.

    Python
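
The whisper.cpp entry above mentions weight quantization as a way to reduce memory and compute overhead. As a rough illustration of the general idea only, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. The function names `quantize_int8` and `dequantize_int8` are hypothetical and chosen for this example; they are not taken from either repository, and the formats those projects actually use may differ.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] so that w is roughly scale * q."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [scale * v for v in q]


weights = [0.4, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

print(q)         # integer codes, one small integer per weight
print(restored)  # approximations of the original weights
```

Each weight is stored as a small integer plus one shared float scale, so storage shrinks to roughly a quarter of 32-bit floats, at the cost of a rounding error of at most half a quantization step per weight.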