25 repos

Awesome GitHub Repositories: Deployment & Serving

Explore 25 awesome GitHub repositories matching Artificial Intelligence & ML · Deployment & Serving.

tensorflow/tensorflow
193,864 stars on GitHub
TensorFlow is a comprehensive machine learning framework designed for the construction, training, and deployment of complex mathematical models. It utilizes a graph-based execution model that represents operations as directed acyclic graphs, enabling automatic differentiation and efficient parallel processing. The syst
Standardizes the toolchain for serializing, optimizing, and serving machine learning models within high-performance production environments.
C++ · deep-learning · deep-neural-networks · distributed
AUTOMATIC1111/stable-diffusion-webui
160,701 stars on GitHub
Stable Diffusion Web UI is a browser-based interface designed for managing text-to-image generation tasks. It provides a centralized dashboard for controlling generative processes, including native support for multi-stage model architectures to facilitate high-quality image refinement. The platform distinguishes itsel
Walks through the configuration steps required to run the application within the Windows Subsystem for Linux.
Python · ai · ai-art · deep-learning
huggingface/transformers
156,730 stars on GitHub
Transformers is a comprehensive library for machine learning that provides a unified interface for training, fine-tuning, and deploying transformer-based models. It supports a wide range of tasks, including text classification, language modeling, question answering, and sequence-to-sequence translation, while offering
Exports models into a portable format with ahead-of-time memory planning and hardware-specific operation dispatch for edge device inference.
Python · audio · deep-learning · deepseek
Comfy-Org/ComfyUI
103,654 stars on GitHub
ComfyUI is a node-based generative AI orchestration engine designed for constructing, testing, and executing complex image and video synthesis pipelines. By utilizing a directed acyclic graph execution model, the platform allows users to build reproducible workflows through modular, interconnected processing blocks wit
Serves visual, node-based generative pipelines as programmable API endpoints for integration into external software.
Python · ai · comfy · comfyui
deepseek-ai/DeepSeek-V3
101,631 stars on GitHub
DeepSeek-V3 is a large language model that provides comprehensive resources for model utilization, including technical specifications, pre-trained weights, and evaluation benchmarks. The project details the core transformer architecture, including parameter counts and multi-token prediction modules, while supporting na
Downloadable parameter files and technical configurations enable direct integration of the pre-trained model into custom environments.
Python
ggml-org/llama.cpp
95,400 stars on GitHub
Llama.cpp is an inference engine designed for the local execution of text-based and multimodal language models on consumer hardware. It provides a core environment for running models that process both text and image inputs, utilizing hardware-accelerated backends to optimize performance across diverse CPU and GPU archi
Executes large language models locally on standard consumer hardware with high performance.
C++ · ggml
hacksider/Deep-Live-Cam
79,568 stars on GitHub
Deep-Live-Cam is a generative video transformation tool designed for real-time facial manipulation and cinematic enhancement. It functions as a local-first AI runtime, performing all media processing directly on the user's hardware to ensure complete data privacy without external network dependencies. By utilizing a hi
Optimizes generative models for low-latency, real-time inference on consumer-grade hardware.
Python · ai · ai-deep-fake · ai-face
browser-use/browser-use
78,576 stars on GitHub
Browser-use is a framework for building autonomous agents that navigate, interact with, and extract data from web interfaces using natural language instructions. By acting as an orchestration layer between large language models and browser automation protocols, it enables the execution of complex, multi-step workflows
Adjusts operational behavior and inference parameters for Llama models to optimize their performance in web-based reasoning tasks.
Python · ai-agents · ai-tools · browser-automation
hoppscotch/hoppscotch
77,888 stars on GitHub
Hoppscotch is an open-source API development ecosystem designed for building, testing, and debugging REST, GraphQL, and real-time APIs. It provides a unified platform that functions across web browsers, desktop applications, and command-line interfaces, allowing developers to manage the entire API lifecycle from a sing
Configures AI-driven assistance to generate payloads and automate test script creation.
TypeScript · api · api-client · api-rest
nomic-ai/gpt4all
77,146 stars on GitHub
GPT4All is a cross-platform runtime environment designed to execute large language models directly on local consumer hardware. By leveraging an optimized C++ inference backend, it enables private, offline AI interactions without requiring an internet connection or external cloud services. The project provides a compreh
Enables private, offline inference by running large language models directly on local hardware resources.
C++ · ai-chat · llm-inference
zed-industries/zed
75,634 stars on GitHub
Zed is an AI-native, high-performance code editor designed for extreme responsiveness and keyboard-centric workflows. It functions as an extensible text processing workspace that integrates autonomous agents and predictive models directly into the development environment to automate complex engineering tasks, refactori
Runs machine learning models on local hardware to ensure data privacy and reduce latency for AI-assisted coding tasks.
Rust · gpui · rust-lang · text-editor
mlabonne/llm-course
75,340 stars on GitHub
This project is a comprehensive educational curriculum and engineering handbook focused on the lifecycle of large language models. It serves as a structured knowledge base for machine learning practitioners, covering the fundamental mathematical and architectural principles of transformer-based sequence modeling, as we
Implements efficient attention mechanisms and optimization strategies to maximize inference throughput.
course · large-language-models · llm
infiniflow/ragflow
73,425 stars on GitHub
This project is a comprehensive retrieval-augmented generation platform designed for building, managing, and deploying knowledge-based AI applications. It provides a unified environment for organizing datasets, configuring conversational chat assistants, and developing autonomous agents that execute multi-step reasonin
Processes unstructured data using deep document understanding to extract structured knowledge for high-quality information retrieval.
Python · agent · agentic · agentic-ai
PaddlePaddle/PaddleOCR
70,931 stars on GitHub
PaddleOCR is a comprehensive optical character recognition framework designed for detecting and transcribing text from images and documents into structured, machine-readable formats. It provides a modular computer vision pipeline that decouples image preprocessing, text detection, and character recognition into indepen
Facilitates the deployment of text extraction models as scalable services across various hardware environments.
Python · ai4science · chineseocr · document-parsing
vllm-project/vllm
70,745 stars on GitHub
vLLM is a high-throughput inference engine designed for the efficient serving and execution of large language models. It functions as a production-ready distributed model server, providing standard API protocols for online serving while also supporting offline batch processing. The system is built to maximize token gen
Enables execution of advanced generative models directly on local hardware for private and low-latency inference.
Python · amd · blackwell · cuda
dair-ai/Prompt-Engineering-Guide
70,526 stars on GitHub
This project is a comprehensive educational resource and knowledge base dedicated to the development and application of large language models and autonomous agentic systems. It provides a structured framework for understanding prompt engineering, context management, and the architectural patterns required to build task
Demonstrates essential setup procedures for connecting to and configuring external language model providers.
MDX · agent · agents · ai-agents
hiyouga/LlamaFactory
67,386 stars on GitHub
LlamaFactory is a unified framework for fine-tuning and adapting large language models. It provides a comprehensive platform that standardizes training workflows across diverse machine learning architectures, allowing users to execute both full-tuning and parameter-efficient methods through a single interface. The pro
Wraps model execution in a web-accessible interface to provide consistent endpoints for client-side requests.
Python · agent · ai · deepseek
meta-llama/llama
59,157 stars on GitHub
Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on
Executes model checkpoints locally with configurable parameters like sequence length and batch size to optimize performance.
Python
zylon-ai/private-gpt
57,116 stars on GitHub
This project is a privacy-first backend service designed to facilitate retrieval-augmented generation by processing local documents into searchable vector representations. It provides a modular architecture that allows users to ingest diverse file formats, manage document metadata, and perform semantic searches to prov
Runs generative language models directly on local hardware for private, offline processing tasks.
Python
ultralytics/yolov5
56,830 stars on GitHub
YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning
Executes high-speed visual inference using hardware-accelerated processing and test-time augmentation.
Python · coreml · deep-learning · ios
