LMDeploy

Inference and Serving - High-throughput and low-latency serving framework for LLMs.

Inference Engines - Toolkit for compressing, deploying, and serving large language models.

Inference Frameworks - Toolkit for compressing, deploying, and serving language models.

Model Serving & Deployment - Compresses and deploys LLMs for production.

Inference Frameworks - Framework for quantization, inference, and serving of LLMs and VLMs.

InternLM/lmdeploy