32 repos

Awesome GitHub Repositories: Model Inference and Serving

Platforms and techniques for deploying, optimizing, and serving machine learning models for production use.

Explore 32 awesome GitHub repositories matching artificial intelligence & ml · Model Inference and Serving.

keras-team/keras
63,858 · GitHub
Keras is a high-level deep learning framework designed for constructing and training neural networks through the composition of modular, functional layers. It serves as a comprehensive modeling toolkit that provides standardized procedures for defining, evaluating, and deploying complex architectures. By utilizing a di
Exposes unified interfaces to switch between various computational backends for consistent model execution.
Python · data-science · deep-learning · jax
traefik/traefik
61,814 · GitHub
Traefik is a cloud-native edge router and API gateway designed to manage service communication and traffic flow across distributed infrastructure. It functions as a dynamic service proxy that automatically discovers backend services and configures routing rules in real time, eliminating the need for manual restarts or
Caches model responses based on query semantics to minimize redundant computation and lower inference latency.
Go · consul · docker · etcd
meta-llama/llama
59,157 · GitHub
Llama is a computational framework and runtime environment designed for executing transformer-based neural networks locally. It functions as a generative AI inference engine, enabling the processing of input sequences through pre-trained model weights to produce text completions and structured data outputs directly on
Reduces numerical precision in model weights to lower memory footprint and accelerate inference on local devices.
Python
cline/cline
58,164 · GitHub
Cline is an extensible agent runtime and multi-agent orchestration engine designed to automate complex software engineering workflows. It functions as an integrated development environment extension that bridges strategic task planning with autonomous execution, allowing users to manage multi-step projects through huma
Connects various local and cloud-based language models to facilitate automated software engineering workflows.
TypeScript
ultralytics/yolov5
56,830 · GitHub
YOLOv5 is a comprehensive computer vision framework designed for end-to-end deep learning, specializing in real-time object detection, image classification, and instance segmentation. It provides a unified toolkit that manages the entire lifecycle of a model, from initial dataset configuration and hyperparameter tuning
Decreases model size and improves execution speed by setting a specific percentage of weights to zero.
Python · coreml · deep-learning · ios
AntonOsika/gpt-engineer
55,201 · GitHub
GPT-Engineer is an autonomous agent and framework designed for AI-assisted software development. It functions as a generative codebase architect that translates natural language requirements into complete, functional software projects by reading and writing files directly to the local file system. The platform disting
Supports the deployment and integration of various local and cloud-based language models for generative tasks.
Python · ai · autonomous-agent · code-generation
Mintplex-Labs/anything-llm
54,751 · GitHub
This platform serves as a comprehensive environment for managing private language models, document knowledge bases, and automated agent workflows within secure local infrastructure. It functions as a document-aware workspace that enables users to ingest diverse file formats into searchable repositories, ensuring that a
Deploys language model interfaces and data processing engines directly onto local hardware for private, self-hosted operations.
JavaScript · ai-agents · custom-ai-agents · deepseek
karpathy/nanoGPT
53,461 · GitHub
nanoGPT is a lightweight engine for training and fine-tuning transformer-based language models from scratch. It provides a minimalist codebase designed for educational exploration and rapid experimentation with neural network architectures, utilizing self-attention and feed-forward layers to process sequences and predi
Exposes a command-line interface for sampling text sequences with adjustable generation settings.
Python
ultralytics/ultralytics
53,426 · GitHub
Ultralytics is a comprehensive computer vision framework designed for training, validating, and deploying deep learning models across a wide range of visual recognition tasks. It provides a unified interface for core operations including object detection, instance segmentation, pose estimation, and image classification
Parses and structures raw model outputs into usable formats like bounding boxes, masks, and keypoint coordinates.
Python · cli · computer-vision · deep-learning
facebookresearch/segment-anything
53,431 · GitHub
This project provides a deep learning architecture designed to identify and isolate distinct objects within images by generating precise pixel-level masks. It functions as a browser-based inference engine, enabling the execution of complex machine learning models directly within web environments without requiring serve
Enables the execution of sophisticated deep learning models directly within the browser environment using hardware-accelerated runtimes.
Jupyter Notebook
unslothai/unsloth
52,461 · GitHub
Unsloth is a high-performance training and inference platform designed to optimize the lifecycle of large language and multimodal models. It provides a comprehensive engine for fine-tuning, executing, and managing models locally, with a focus on reducing memory consumption and increasing compute speed on consumer-grade
Reduces memory usage and increases processing speed during the fine-tuning of large models for specific applications.
Python · agent · deepseek · deepseek-r1
tensorflow/tfjs-examples
6,783 · GitHub
This repository provides a collection of practical demonstrations and implementation guides for machine learning tasks using TensorFlow.js. It serves as a resource for developers to explore model architectures, training workflows, and data manipulation techniques across domains such as computer vision, natural language
Backend-specific kernels register optimized logic for operations, enabling efficient memory access and dispatch during execution.
JavaScript

Awesome Model Inference and Serving GitHub Repositories

keras-team/keras

traefik/traefik

meta-llama/llama

cline/cline

ultralytics/yolov5

AntonOsika/gpt-engineer

Mintplex-Labs/anything-llm

karpathy/nanoGPT

ultralytics/ultralytics

facebookresearch/segment-anything

unslothai/unsloth

tensorflow/tfjs-examples
