KServe is a Kubernetes-native platform for deploying and serving machine learning models as scalable inference services. It supports both generative AI models, including large language models, and traditional predictive models from frameworks such as TensorFlow, PyTorch, Scikit-Learn, XGBoost, and ONNX. The platform manages the full lifecycle of model deployments, including revision tracking, canary rollouts, A/B testing, and automatic rollbacks, and provides serverless scale-to-zero capabilities for cost-efficient resource management. KServe distinguishes itself through a standardized infere
Kubeless is a Kubernetes-native serverless framework that deploys and runs stateless functions as custom resources managed by an in-cluster controller. It acts as a Function-as-a-Service platform, launching function runtime pods on demand and scaling them to zero when idle to optimise resource usage. Functions are invoked automatically through HTTP requests or a publish-subscribe messaging bus, enabling event-driven execution for workloads and microservices. The platform supports running functions written in Golang, Python, Node.js, Ruby, PHP, .NET, and Ballerina, with the ability to add
KServe is an open platform for deploying and serving generative and predictive AI models on Kubernetes. It defines inference services as custom resources with declarative YAML specifications, enabling a Kubernetes-native approach to model deployment and lifecycle management. The platform leverages Knative-based serverless scaling for automatic scale-to-zero and revision management, and supports a pluggable serving runtime architecture that maps model formats to containerized execution environments. KServe distinguishes itself through model-aware autoscaling that scales replicas based on token
This project is a Kubernetes serverless framework and OCI container function platform. It provides a system for deploying event-driven functions and microservices as OCI-compatible container images onto a Kubernetes cluster. The platform includes an event-driven function orchestrator that triggers executions via HTTP requests or message streams. It features an auto-scaling function manager that adjusts the number of active instances based on real-time demand and scales down to zero during inactivity. A background queuing system is included to process asynchronous tasks and maintain application re
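The demand-based scaling with scale-to-zero described above can be illustrated with a minimal sketch. It is not the algorithm used by any of these projects; the function name `desired_replicas` and its parameters (`in_flight_requests`, `target_per_replica`, `max_replicas`) are hypothetical and chosen only for illustration.

```python
import math


def desired_replicas(in_flight_requests, target_per_replica, max_replicas):
    """Return how many instances to run for the current demand.

    Hypothetical helper: assumes demand is measured as the number of
    in-flight requests and that each instance should handle at most
    `target_per_replica` of them.
    """
    if target_per_replica <= 0 or max_replicas < 0:
        raise ValueError("target_per_replica must be > 0 and max_replicas >= 0")
    if in_flight_requests <= 0:
        # No demand: scale down to zero instances.
        return 0
    # Enough instances to keep each one at or below its target load,
    # capped at the configured maximum.
    needed = math.ceil(in_flight_requests / target_per_replica)
    return min(needed, max_replicas)


if __name__ == "__main__":
    for load in (0, 1, 25, 500):
        print(load, desired_replicas(load, target_per_replica=10, max_replicas=20))
```

With a target of 10 requests per instance and a cap of 20, an idle function gets 0 instances, a single request starts 1, 25 requests need 3, and 500 requests are capped at 20.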