# efeslab/nanoflow

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/efeslab-nanoflow).**

965 stars · 49 forks · Jupyter Notebook

## Links

- GitHub: https://github.com/efeslab/Nanoflow
- Homepage: https://arxiv.org/abs/2408.12757
- awesome-repositories: https://awesome-repositories.com/repository/efeslab-nanoflow.md

## Topics

`cuda` `inference` `llama2` `llm` `llm-serving` `model-serving`

## Description

A throughput-oriented, high-performance serving framework for LLMs.

## Tags

### Part of an Awesome List

- [Inference Frameworks](https://awesome-repositories.com/f/awesome-lists/ai/inference-frameworks.md) — Framework focused on maximizing serving throughput.
