# internlm/lmdeploy

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/internlm-lmdeploy).**

7,903 stars · 701 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/InternLM/lmdeploy
- Homepage: https://lmdeploy.readthedocs.io/en/latest
- awesome-repositories: https://awesome-repositories.com/repository/internlm-lmdeploy.md

## Topics

`codellama` `cuda-kernels` `deepspeed` `fastertransformer` `internlm` `llama` `llama2` `llama3` `llm` `llm-inference` `turbomind`

## Description

LMDeploy is a toolkit for compressing, deploying, and serving large language models (LLMs).

## Tags

### Part of an Awesome List

- [Inference and Serving](https://awesome-repositories.com/f/awesome-lists/ai/inference-and-serving.md) — High-throughput and low-latency serving framework for LLMs.
- [Inference Engines](https://awesome-repositories.com/f/awesome-lists/ai/inference-engines.md) — Toolkit for compressing, deploying, and serving large language models.
- [Inference Frameworks](https://awesome-repositories.com/f/awesome-lists/ai/inference-frameworks.md) — Toolkit for compressing, deploying, and serving language models.
- [Model Serving & Deployment](https://awesome-repositories.com/f/awesome-lists/ai/model-serving-deployment.md) — Compresses and deploys LLMs for production.
- [Inference Frameworks](https://awesome-repositories.com/f/awesome-lists/devtools/inference-frameworks.md) — Framework for quantization, inference, and serving of LLMs and VLMs.