awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Deployment & Serving · Awesome GitHub Repositories

25 repos

Explore 25 GitHub repositories tagged Artificial Intelligence & ML · Deployment & Serving.

Awesome Deployment & Serving GitHub Repositories

  • tensorflow/tensorflow

    193,864 · GitHub

    TensorFlow is a comprehensive machine learning framework designed for the construction, training, and deployment of complex mathematical models. It utilizes a graph-based execution model that represents operations as directed acyclic graphs, enabling automatic differentiation and efficient parallel processing. The syst

    Standardizes the toolchain for serializing, optimizing, and serving machine learning models within high-performance production environments.

    C++ · deep-learning · deep-neural-networks · distributed
  • AUTOMATIC1111/stable-diffusion-webui

    160,701 · GitHub

    Stable Diffusion Web UI is a browser-based interface designed for managing text-to-image generation tasks. It provides a centralized dashboard for controlling generative processes, including native support for multi-stage model architectures to facilitate high-quality image refinement. The platform distinguishes itsel

    Walks through the configuration steps required to run the application within the Windows Subsystem for Linux.

    Python · ai · ai-art · deep-learning
  • huggingface/transformers

    156,730 · GitHub

    Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering

    Exports models into a portable format with ahead-of-time memory planning and hardware-specific operation dispatch for edge device inference.

    Python · audio · deep-learning · deepseek
  • Comfy-Org/ComfyUI

    103,654 · GitHub

    ComfyUI is a node-based generative AI orchestration engine designed for constructing, testing, and executing complex image and video synthesis pipelines. By utilizing a directed acyclic graph execution model, the platform allows users to build reproducible workflows through modular, interconnected processing blocks wit

    Serves visual, node-based generative pipelines as programmable API endpoints for integration into external software.

    Python · ai · comfy · comfyui
  • deepseek-ai/DeepSeek-V3

    101,631 · GitHub

    DeepSeek-V3 is a large language model that provides comprehensive resources for model utilization, including technical specifications, pre-trained weights, and evaluation benchmarks. The project details the core transformer architecture, including parameter counts and multi-token prediction modules, while supporting na

    Downloadable parameter files and technical configurations enable direct integration of the pre-trained model into custom environments.

    Python
  • ggml-org/llama.cpp

    95,400 · GitHub

    Llama.cpp is an inference engine designed for the local execution of text-based and multimodal language models on consumer hardware. It provides a core environment for running models that process both text and image inputs, utilizing hardware-accelerated backends to optimize performance across diverse CPU and GPU archi

    Executes large language models locally on standard consumer hardware with high performance.

    C++ · ggml
  • hacksider/Deep-Live-Cam

    79,568 · GitHub

    Deep-Live-Cam is a generative video transformation tool designed for real-time facial manipulation and cinematic enhancement. It functions as a local-first AI runtime, performing all media processing directly on the user's hardware to ensure complete data privacy without external network dependencies. By utilizing a hi

    Optimizes generative models for low-latency, real-time inference on consumer-grade hardware.

    Python · ai · ai-deep-fake · ai-face
  • browser-use/browser-use

    78,576 · GitHub

    Browser-use is a framework for building autonomous agents that navigate, interact with, and extract data from web interfaces using natural language instructions. By acting as an orchestration layer between large language models and browser automation protocols, it enables the execution of complex, multi-step workflows

    Adjusts operational behavior and inference parameters for Llama models to optimize their performance in web-based reasoning tasks.

    Python · ai-agents · ai-tools · browser-automation
  • hoppscotch/hoppscotch

    77,888 · GitHub

    Hoppscotch is an open-source API development ecosystem designed for building, testing, and debugging REST, GraphQL, and real-time APIs. It provides a unified platform that functions across web browsers, desktop applications, and command-line interfaces, allowing developers to manage the entire API lifecycle from a sing

    Configures AI-driven assistance to generate payloads and automate test script creation.

    TypeScript · api · api-client · api-rest
  • nomic-ai/gpt4all

    77,146 · GitHub

    GPT4All is a cross-platform runtime environment designed to execute large language models directly on local consumer hardware. By leveraging an optimized C++ inference backend, it enables private, offline AI interactions without requiring an internet connection or external cloud services. The project provides a compreh

    Enables private, offline inference by running large language models directly on local hardware resources.

    C++ · ai-chat · llm-inference
  • zed-industries/zed

    75,634 · GitHub

    Zed is an AI-native, high-performance code editor designed for extreme responsiveness and keyboard-centric workflows. It functions as an extensible text processing workspace that integrates autonomous agents and predictive models directly into the development environment to automate complex engineering tasks, refactori

    Runs machine learning models on local hardware to ensure data privacy and reduce latency for AI-assisted coding tasks.

    Rust · gpui · rust-lang · text-editor
  • mlabonne/llm-course

    75,340 · GitHub

    This project is a comprehensive educational curriculum and engineering handbook focused on the lifecycle of large language models. It serves as a structured knowledge base for machine learning practitioners, covering the fundamental mathematical and architectural principles of transformer-based sequence modeling, as we

    Implements efficient attention mechanisms and optimization strategies to maximize inference throughput.

    course · large-language-models · llm
  • infiniflow/ragflow

    73,425 · GitHub

    This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasonin

    Processes unstructured data using deep document understanding to extract structured knowledge for high-quality information retrieval.

    Python · agent · agentic · agentic-ai
  • PaddlePaddle/PaddleOCR

    70,931 · GitHub

    PaddleOCR is a comprehensive optical character recognition framework designed for detecting and transcribing text from images and documents into structured, machine-readable formats. It provides a modular computer vision pipeline that decouples image preprocessing, text detection, and character recognition into indepen

    Facilitates the deployment of text extraction models as scalable services across various hardware environments.

    Python · ai4science · chineseocr · document-parsing
  • vllm-project/vllm

    70,745 · GitHub

    vLLM is a high-throughput inference engine designed for the efficient serving and execution of large language models. It functions as a production-ready distributed model server, providing standard API protocols for online serving while also supporting offline batch processing. The system is built to maximize token gen

    Enables execution of advanced generative models directly on local hardware for private and low-latency inference.

    Python · amd · blackwell · cuda
  • dair-ai/Prompt-Engineering-Guide

    70,526 · GitHub

    This project is a comprehensive educational resource and knowledge base dedicated to the development and application of large language models and autonomous agentic systems. It provides a structured framework for understanding prompt engineering, context management, and the architectural patterns required to build task

    Demonstrates essential setup procedures for connecting to and configuring external language model providers.

    MDX · agent · agents · ai-agents
  • hiyouga/LlamaFactory

    67,386 · GitHub

    LlamaFactory is a unified framework for fine-tuning and adapting large language models. It provides a comprehensive platform that standardizes training workflows across diverse machine learning architectures, allowing users to execute both full-tuning and parameter-efficient methods through a single interface. The pro

    Wraps model execution in a web-accessible interface to provide consistent endpoints for client-side requests.

    Python · agent · ai · deepseek
  • meta-llama/llama

    59,157 · GitHub

    Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on

    Executes model checkpoints locally with configurable parameters like sequence length and batch size to optimize performance.

    Python
  • zylon-ai/private-gpt

    57,116 · GitHub

    This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to prov

    Runs generative language models directly on local hardware for private, offline processing tasks.

    Python
  • ultralytics/yolov5

    56,830 · GitHub

    YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning

    Executes high-speed visual inference using hardware-accelerated processing and test-time augmentation.

    Python · coreml · deep-learning · ios
Explore sub-tags

  • Deployment Pipelines and Endpoints (5 sub-tags)
  • Inference Optimization and Tuning (6 sub-tags)
  • Inference Servers and Runtimes (14 sub-tags)
  • Knowledge Retrieval and Documents (3 sub-tags)
  • Licensing and Citations (2 sub-tags)
  • Local and On-Device Inference (8 sub-tags)
  • Model Hubs and Pre-made Models (4 sub-tags)
  • Serialization and Export Formats (6 sub-tags)