awesome-repositories.com
© 2026 Bringes Technology SRL · VAT RO45896025 · hello@bringes.io
Inference Latency Optimizers · Awesome GitHub Repositories

1 repo


Tools for tuning inference performance and response times.

Distinguishing note: Focuses on latency tuning, distinct from general throughput optimization.

Explore 1 awesome GitHub repository matching Artificial Intelligence & ML · Inference Latency Optimizers.


Awesome Inference Latency Optimizers GitHub Repositories

  • hpcaitech/ColossalAI

    41,349 · View on GitHub↗

    ColossalAI is a distributed deep learning framework designed for training and deploying massive artificial intelligence models across clusters of hardware accelerators. It functions as a parallel computing engine that partitions model workloads and data across multiple processors to maximize memory efficiency and throughput. The platform distinguishes itself through a comprehensive suite of parallelization strategies, including multi-dimensional tensor parallelism and pipeline-based model parallelism, which segment neural network layers and stages across devices. To support large-scale genera

    Improves response times for generation tasks by configuring request grouping and memory caching.

    Python · ai · big-model · data-parallelism