awesome-repositories.com
© 2026 Bringes Technology SRL·VAT RO45896025·hello@bringes.io
Model Inference Runtimes · Awesome GitHub Repositories

Software environments and engines optimized for executing machine learning models, distinct from general-purpose development frameworks.

Explore 9 awesome GitHub repositories in Artificial Intelligence & ML · Model Inference Runtimes.

Awesome Model Inference Runtimes GitHub Repositories

  • huggingface/transformers

    156,730 stars on GitHub

    Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering

    Python · audio · deep-learning · deepseek
  • Comfy-Org/ComfyUI

    103,654 stars on GitHub

    ComfyUI is a node-based generative AI orchestration engine designed for constructing, testing, and executing complex image and video synthesis pipelines. By utilizing a directed acyclic graph execution model, the platform allows users to build reproducible workflows through modular, interconnected processing blocks wit

    Python · ai · comfy · comfyui
  • ggml-org/llama.cpp

    95,400 stars on GitHub

    Llama.cpp is an inference engine designed for the local execution of text-based and multimodal language models on consumer hardware. It provides a core environment for running models that process both text and image inputs, utilizing hardware-accelerated backends to optimize performance across diverse CPU and GPU archi

    C++ · ggml
  • hacksider/Deep-Live-Cam

    79,568 stars on GitHub

    Deep-Live-Cam is a generative video transformation tool designed for real-time facial manipulation and cinematic enhancement. It functions as a local-first AI runtime, performing all media processing directly on the user's hardware to ensure complete data privacy without external network dependencies. By utilizing a hi

    Python · ai · ai-deep-fake · ai-face
  • nomic-ai/gpt4all

    77,146 stars on GitHub

    GPT4All is a cross-platform runtime environment designed to execute large language models directly on local consumer hardware. By leveraging an optimized C++ inference backend, it enables private, offline AI interactions without requiring an internet connection or external cloud services. The project provides a compreh

    C++ · ai-chat · llm-inference
  • infiniflow/ragflow

    73,425 stars on GitHub

    This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasonin

    Python · agent · agentic · agentic-ai
  • PaddlePaddle/PaddleOCR

    70,931 stars on GitHub

    PaddleOCR is a comprehensive optical character recognition framework designed for detecting and transcribing text from images and documents into structured, machine-readable formats. It provides a modular computer vision pipeline that decouples image preprocessing, text detection, and character recognition into indepen

    Python · ai4science · chineseocr · document-parsing
  • meta-llama/llama

    59,157 stars on GitHub

    Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on

    Python
  • ultralytics/yolov5

    56,830 stars on GitHub

    YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning

    Python · coreml · deep-learning · ios

Explore sub-tags

  • Command Line Inference Interfaces (1 sub-tag): Terminal-based interfaces that allow users to interact with and manage model inference servers directly from the command line.
  • Inference API Servers (1 sub-tag): Network services that expose model inference capabilities through standardized web APIs to support automated application workflows.
  • Inference Runtimes (6 sub-tags): Execution environments designed to load and run machine learning models for real-time or high-performance inference tasks.
  • Multimodal Inference Engines: Software engines capable of processing and generating outputs from multiple data types, such as text, images, and audio, simultaneously.
  • Serving Endpoints (1 sub-tag): Network access points configured to manage how models are loaded and made available for incoming inference requests.
  • Text-Only Inference Engines: Specialized engines optimized exclusively for processing and generating natural language text sequences.