Features

Inference Gateways - Accepts OpenAI-style chat and responses API requests and dispatches them to the appropriate backend model.

Intent Classification Pipelines - Classifies each request's intent, domain, and safety profile with purpose-built encoders before selecting a model to handle it.

Model Request Routing - Sends each request to the model that best balances quality, cost, latency, and privacy.

Cost-Aware Model Routers - Routes routine traffic to cheaper models and reserves expensive frontier models for requests that need them.
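
The classify-then-route idea described in the features above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the model names, quality tiers, prices, and the keyword-based `required_quality` function are all hypothetical, and the keyword check stands in for the purpose-built encoders the project uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str           # hypothetical backend name
    quality: int        # 1 (basic) to 3 (frontier); made-up scale
    cost_per_1k: float  # illustrative price per 1k tokens, not real pricing


# Hypothetical model pool, ordered only for readability.
MODELS = [
    Model("small-local", 1, 0.0002),
    Model("mid-hosted", 2, 0.002),
    Model("frontier", 3, 0.03),
]

# Keyword lists standing in for a learned intent classifier (assumption).
HARD_HINTS = ("prove", "derive", "debug", "refactor")
MEDIUM_HINTS = ("summarize", "translate", "explain")


def required_quality(prompt: str) -> int:
    """Estimate the minimum quality tier a prompt needs."""
    text = prompt.lower()
    if any(hint in text for hint in HARD_HINTS):
        return 3
    if any(hint in text for hint in MEDIUM_HINTS):
        return 2
    return 1


def route(prompt: str) -> Model:
    """Pick the cheapest model whose quality tier meets the prompt's need."""
    need = required_quality(prompt)
    candidates = [m for m in MODELS if m.quality >= need]
    return min(candidates, key=lambda m: m.cost_per_1k)


if __name__ == "__main__":
    for prompt in (
        "What time zone is Berlin in?",
        "Summarize this memo",
        "Prove that sqrt(2) is irrational",
    ):
        print(f"{prompt!r} -> {route(prompt).name}")
```

Filtering by required quality first and then minimizing cost keeps routine traffic on the cheapest adequate model. A fuller router would also weigh latency and privacy, which this sketch leaves out.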

vllm-project/semantic-router


3,205 stars, 536 forks, Go, Apache-2.0, vllm-semantic-router.com
