awesome-repositories.com
Inference Runtimes · Awesome GitHub Repositories

Execution environments designed to load and run machine learning models for real-time or high-performance inference tasks.

Six GitHub repositories tagged Artificial Intelligence & ML · Inference Runtimes.

Awesome Inference Runtimes GitHub Repositories

  • huggingface/transformers

    156,730 · GitHub

    Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering

    Python · audio · deep-learning · deepseek
  • hacksider/Deep-Live-Cam

    79,568 · GitHub

    Deep-Live-Cam is a generative video transformation tool designed for real-time facial manipulation and cinematic enhancement. It functions as a local-first AI runtime, performing all media processing directly on the user's hardware to ensure complete data privacy without external network dependencies. By utilizing a hi

    Python · ai · ai-deep-fake · ai-face
  • nomic-ai/gpt4all

    77,146 · GitHub

    GPT4All is a cross-platform runtime environment designed to execute large language models directly on local consumer hardware. By leveraging an optimized C++ inference backend, it enables private, offline AI interactions without requiring an internet connection or external cloud services. The project provides a compreh

    C++ · ai-chat · llm-inference
  • PaddlePaddle/PaddleOCR

    70,931 · GitHub

    PaddleOCR is a comprehensive optical character recognition framework designed for detecting and transcribing text from images and documents into structured, machine-readable formats. It provides a modular computer vision pipeline that decouples image preprocessing, text detection, and character recognition into indepen

    Python · ai4science · chineseocr · document-parsing
  • meta-llama/llama

    59,157 · GitHub

    Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on

    Python
  • ultralytics/yolov5

    56,830 · GitHub

    YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning

    Python · coreml · deep-learning · ios

Explore sub-tags

  • Edge Model Inference Runtimes: Lightweight runtimes optimized for edge device deployment.
  • High-Performance AI Inference: Optimized model execution for low-latency, real-time video manipulation on consumer hardware.
  • Inference Deployment Engines: Systems that facilitate the execution of trained models across diverse hardware backends including CPUs, GPUs, and mobile processors.
  • Local Inference Runners: Tools that execute model checkpoints on local hardware with configurable parameters.
  • Local-First AI Runtimes: Execution environments that enable machine learning models to run locally on consumer hardware.
  • Real-Time Inference Runtimes: High-speed execution environments designed for low-latency model inference and immediate data processing.