InternLM/lmdeploy

7,903 stars · 701 forks · Python · Apache-2.0
Documentation: lmdeploy.readthedocs.io/en/latest

Lmdeploy

Features

Inference and Serving - High-throughput and low-latency serving framework for LLMs.
Inference Engines - Toolkit for compressing, deploying, and serving large language models.
Inference Frameworks - Toolkit for compressing, deploying, and serving language models.
Model Serving & Deployment - Compresses and deploys LLMs for production.
Inference Frameworks - Framework for quantization, inference, and serving of LLMs and VLMs.