awesome-repositories.com
Efficient Inference Engines · Awesome GitHub Repositories

1 repo

Runtime environments and libraries optimized for executing large language models with minimal latency and memory usage.

Distinguishing note: Focuses on the engine-level execution of compressed models rather than model training or fine-tuning.

Explore 1 awesome GitHub repository matching Artificial Intelligence & ML · Efficient Inference Engines.

Awesome Efficient Inference Engines GitHub Repositories

  • microsoft/BitNet

    28,521 · View on GitHub↗

    BitNet is a quantized inference engine designed to execute highly compressed language models by performing arithmetic on low-precision, bit-level weight data. It functions as a model optimization toolkit and a high-performance kernel library, enabling the execution of large language models on consumer hardware by reducing memory footprints and increasing processing speeds. The project distinguishes itself through hardware-specific kernel optimizations that leverage native processor instructions to accelerate matrix multiplication. By utilizing packed integer arithmetic and memory-aligned weig…

    Running compressed language models on consumer hardware by reducing memory usage and increasing processing speed during text generation.

    Python
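The "packed integer arithmetic" mentioned in the description can be illustrated with a minimal sketch. It assumes ternary weights (-1, 0, +1) stored as 2-bit codes, four per byte, and integer activations. The function names `pack_ternary` and `packed_dot` and the bit layout are hypothetical, chosen for illustration only. They are not taken from microsoft/BitNet, whose real kernels rely on hardware-specific instructions rather than Python loops.

```python
# Illustrative sketch only: the names and the 2-bit layout are assumptions,
# not the actual storage format or API of microsoft/BitNet.

def pack_ternary(weights):
    """Pack ternary weights (-1, 0, +1) into bytes, four 2-bit codes per byte."""
    packed = bytearray((len(weights) + 3) // 4)
    for i, w in enumerate(weights):
        if w not in (-1, 0, 1):
            raise ValueError(f"weight {w} is not ternary")
        code = w + 1  # map -1, 0, +1 to the codes 0, 1, 2
        packed[i // 4] |= code << (2 * (i % 4))
    return bytes(packed)


def packed_dot(packed, activations):
    """Integer dot product of packed ternary weights with integer activations."""
    total = 0
    for i, a in enumerate(activations):
        # Extract the 2-bit code for weight i and decode it back to -1, 0, +1.
        code = (packed[i // 4] >> (2 * (i % 4))) & 0b11
        total += (code - 1) * a  # integer-only arithmetic, no floats
    return total


weights = [1, -1, 0, 1, -1]
activations = [3, 5, 7, -2, 4]
packed = pack_ternary(weights)
print(len(packed), packed_dot(packed, activations))  # 2 -8
```

In this sketch, five weights occupy two bytes instead of five bytes at one byte per weight, which is the kind of memory saving the description refers to. A production kernel would process many packed weights per instruction instead of decoding them one at a time.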