turboderp/exllamav2
4,553 stars · 337 forks · Python · MIT

Exllamav2

Features

- Inference Engines - High-speed inference library for modern consumer-grade GPUs.
- Model Quantization - Fast inference library optimized for low-bitwidth model quantization.