FMInference/FlexGen

Archived · 9,366 stars · 591 forks · Python · Apache-2.0

FlexGen

Features

Hardware Optimized Inference - High-throughput generative inference optimized for single-GPU environments.
Inference Frameworks - High-throughput generative inference on single-GPU systems.
Model Serving Engines - Throughput-oriented inference engine for running models on single GPUs.