What are the best open-source alternatives to Serving?

30 open-source projects similar to paddlepaddle/serving, ranked by shared features. Top picks: seldonio/seldon-core, zhaochenyang20/awesome-ml-sys-tutorial, kserve/kserve, openvinotoolkit/openvino, kubeflow/kfserving, triton-inference-server/server, intel-analytics/bigdl, pytorch/serve, nvidia/triton-inference-server, paddlepaddle/paddlex.

Is seldonio/seldon-core a good alternative to Serving?

Seldon Core is a Kubernetes-based machine learning model server and MLOps inference framework. It functions as a multi-model serving engine and pipeline orchestrator, packaging models as scalable microservices that are exposed via standardized REST and gRPC APIs. The project distinguishes itself t…

Is zhaochenyang20/awesome-ml-sys-tutorial a good alternative to Serving?

This project provides a comprehensive technical guide and framework for engineering large-scale machine learning systems. It covers the full lifecycle of model development, focusing on the infrastructure and computational principles required to build, train, and serve generative AI models across di…

Is kserve/kserve a good alternative to Serving?

KServe is a Kubernetes-native platform for deploying and serving machine learning models as scalable inference services. It supports both generative AI models, including large language models, and traditional predictive models from frameworks such as TensorFlow, PyTorch, Scikit-Learn, XGBoost, and…

Is openvinotoolkit/openvino a good alternative to Serving?

OpenVINO is an AI inference engine and model serving platform designed to execute optimized deep learning models across CPUs, GPUs, and NPUs through a unified API. It includes a model optimization toolkit for converting, quantizing, and compressing models from various frameworks, alongside a specia…

Is kubeflow/kfserving a good alternative to Serving?

KServe is an open platform for deploying and serving generative and predictive AI models on Kubernetes. It defines inference services as custom resources with declarative YAML specifications, enabling a Kubernetes-native approach to model deployment and lifecycle management. The platform leverages…

Is triton-inference-server/server a good alternative to Serving?

Triton Inference Server is a high-performance server designed to deploy machine learning models from multiple frameworks across GPUs and CPUs. It functions as a hardware-accelerated inference engine and a gRPC inference gateway, providing a standardized communication layer for transmitting binary t…

Is intel-analytics/bigdl a good alternative to Serving?

BigDL is a PyTorch acceleration framework and distributed inference engine designed for large language models. It provides a toolkit for running models on Intel hardware, integrating quantization tools and libraries for parameter-efficient fine-tuning. The project distinguishes itself through the…

Is pytorch/serve a good alternative to Serving?

This project is a PyTorch model serving framework designed to deploy and scale machine learning models in production via scalable network endpoints. It functions as a high-performance inference server, optimizer, and model lifecycle manager that handles model loading, request batching, and hardware…

Is nvidia/triton-inference-server a good alternative to Serving?

Triton Inference Server is a high-performance AI model inference server and multi-framework model runtime designed for deploying machine learning models across cloud, data center, and embedded edge infrastructure. It serves as an execution engine that allows for the concurrent running of models fro…

Is paddlepaddle/paddlex a good alternative to Serving?

PaddleX is a PaddlePaddle-based framework for building, deploying, and fine-tuning AI model pipelines, with pre-built support for computer vision, OCR, document analysis, and time series tasks. It offers a toolkit of ready-to-use pipelines for image classification, object detection, segmentation, a…


Open-source alternatives to Serving

30 open-source projects similar to paddlepaddle/serving, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Serving alternative.

SeldonIO/seldon-core
4,752 stars
Seldon Core is a Kubernetes-based machine learning model server and MLOps inference framework. It functions as a multi-model serving engine and pipeline orchestrator, packaging models as scalable microservices that are exposed via standardized REST and gRPC APIs. The project distinguishes itself through graph-based inference pipelines that chain models and data transformers into sequential workflows. It optimizes hardware utilization via multi-model shared serving and dynamic memory overcommit strategies, while supporting production experimentation through weighted traffic routing, A/B testin…
Go, aiops, deployment, kubernetes
zhaochenyang20/Awesome-ML-SYS-Tutorial
5,371 stars
This project provides a comprehensive technical guide and framework for engineering large-scale machine learning systems. It covers the full lifecycle of model development, focusing on the infrastructure and computational principles required to build, train, and serve generative AI models across distributed GPU clusters. The repository distinguishes itself by offering deep-dive tutorials and implementation strategies for complex system challenges. It emphasizes high-performance architectural primitives, such as collective communication orchestration, distributed tensor sharding, and static gr…
Python
kserve/kserve
5,576 stars
KServe is a Kubernetes-native platform for deploying and serving machine learning models as scalable inference services. It supports both generative AI models, including large language models, and traditional predictive models from frameworks such as TensorFlow, PyTorch, Scikit-Learn, XGBoost, and ONNX. The platform manages the full lifecycle of model deployments, including revision tracking, canary rollouts, A/B testing, and automatic rollbacks, and provides serverless scale-to-zero capabilities for cost-efficient resource management. KServe distinguishes itself through a standardized infere…
Go

Open-source alternatives to Serving

SeldonIO/seldon-core

zhaochenyang20/Awesome-ML-SYS-Tutorial

kserve/kserve

openvinotoolkit/openvino

kubeflow/kfserving

triton-inference-server/server

intel-analytics/BigDL

pytorch/serve

NVIDIA/triton-inference-server

PaddlePaddle/PaddleX

sgl-project/sglang

gpustack/gpustack

hazelcast/hazelcast

tensorflow/serving

huggingface/text-generation-inference

bentoml/BentoML

kubeflow/kubeflow

OpenRLHF/OpenRLHF

intel-analytics/ipex-llm

openmlsys/openmlsys

DataTalksClub/machine-learning-zoomcamp

flyteorg/flyte

Azure/mmlspark

apache/beam

xorbitsai/inference

skyzh/tiny-llm

maiot-io/zenml

facebookresearch/metaseq

alibaba/x-deeplearning

microsoft/SynapseML