0 repos
Optimizes sequential, token-by-token generation in language models to reduce latency.
Distinguishing note: Focuses on the autoregressive decoding phase specifically, rather than general model inference.
No awesome GitHub repositories for Artificial Intelligence & ML · Decoding Accelerators yet. Submit a GitHub URL or browse the filters below.