# mit-han-lab/Quest

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/mit-han-lab-quest).**

394 stars · 50 forks · CUDA · MIT

## Links

- GitHub: https://github.com/mit-han-lab/Quest
- awesome-repositories: https://awesome-repositories.com/repository/mit-han-lab-quest.md

## Description

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

## Tags

### Part of an Awesome List

- [Attention Optimization](https://awesome-repositories.com/f/awesome-lists/ai/attention-optimization.md): Applies query-aware sparsity to make long-context LLM inference more efficient.
