# flashinfer-ai/flashinfer

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/flashinfer-ai-flashinfer).**

4,996 stars · 725 forks · Python · apache-2.0

## Links

- GitHub: https://github.com/flashinfer-ai/flashinfer
- Homepage: https://flashinfer.ai
- awesome-repositories: https://awesome-repositories.com/repository/flashinfer-ai-flashinfer.md

## Topics

`attention` `cuda` `distributed-inference` `gpu` `jit` `large-language-models` `llm-inference` `moe` `nvidia` `pytorch`

## Tags

### Part of an Awesome List

- [Inference Engines](https://awesome-repositories.com/f/awesome-lists/ai/inference-engines.md) — Kernel library optimized for LLM serving performance.
- [Inference Frameworks](https://awesome-repositories.com/f/awesome-lists/ai/inference-frameworks.md) — Kernel library specifically optimized for LLM serving workloads.
