1 repo
Methods for splitting neural network layers across multiple hardware devices.
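Distinguishing note: Focuses on intra-layer splitting, where the weight tensors of a single layer are sharded across devices. This is distinct from pipeline parallelism, which assigns whole layers to different devices, and from data parallelism, which replicates the full model and splits the input batch.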
Category: Artificial Intelligence & ML · Tensor Parallelism Frameworks
ColossalAI is a distributed deep learning framework designed for training and deploying massive artificial intelligence models across clusters of hardware accelerators. It functions as a parallel computing engine that partitions model workloads and data across multiple processors to maximize memory efficiency and throughput. The platform distinguishes itself through a comprehensive suite of parallelization strategies, including multi-dimensional tensor parallelism and pipeline-based model parallelism, which segment neural network layers and stages across devices. To support large-scale genera
Splits individual model layers across multiple hardware accelerators to reduce the memory footprint of massive neural network parameters.
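To make the idea of splitting a single layer concrete, here is a minimal sketch of a column-parallel linear layer in plain Python. It is an illustration only and does not use ColossalAI's API: the "devices" are simulated as list entries, and the helper names (`matmul`, `split_columns`, `column_parallel_linear`) are hypothetical. Each simulated device stores only its slice of the weight matrix, computes a partial output, and the partial outputs are concatenated, which plays the role of an all-gather.

```python
# Simulated column-parallel linear layer (illustrative; not a real framework API).
# Each "device" holds a column slice of the weight matrix W.

def matmul(x, w):
    """Multiply an (n x k) matrix by a (k x m) matrix, both as nested lists."""
    return [[sum(xi * w[j][c] for j, xi in enumerate(row))
             for c in range(len(w[0]))]
            for row in x]

def split_columns(w, num_devices):
    """Shard W column-wise into num_devices equal slices (hypothetical helper)."""
    cols = len(w[0])
    assert cols % num_devices == 0, "column count must divide evenly"
    step = cols // num_devices
    return [[row[d * step:(d + 1) * step] for row in w]
            for d in range(num_devices)]

def column_parallel_linear(x, shards):
    """Each device computes x @ W_shard; results are concatenated per row."""
    partials = [matmul(x, shard) for shard in shards]
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1.0, 2.0], [3.0, 4.0]]           # batch of 2 inputs, 2 features each
w = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]             # full 2 x 4 weight matrix
shards = split_columns(w, num_devices=2)

# The sharded computation reproduces the unsharded result.
assert column_parallel_linear(x, shards) == matmul(x, w)

# Each simulated device stores half of the layer's parameters.
params_per_device = [sum(len(row) for row in shard) for shard in shards]
```

In this sketch each device holds 4 of the layer's 8 weights, which is the memory saving the description refers to. A real framework would place the shards on separate accelerators and replace the list concatenation with a collective communication step.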