What are the best memory offloading framework repositories on GitHub?

Question 1

Accepted Answer

Systems that manage the movement of tensors between different memory tiers to run oversized models.

**Distinct from GPU Memory Optimizations:** Unlike general GPU memory optimizations, these frameworks focus specifically on offloading tensors to CPU memory or disk.

Explore 8 awesome GitHub repositories matching operating systems & systems programming · Memory Offloading Frameworks. Top picks: openrlhf/openrlhf, fminference/flexllmgen, fminference/flexgen,…

Question 2

Why is openrlhf/openrlhf a recommended memory offloading framework repository on GitHub?

Accepted Answer

Reduces GPU memory footprint through gradient checkpointing and offloading optimizer states to secondary storage.
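To give a sense of why offloading optimizer states helps, here is a rough back-of-the-envelope sketch. It assumes mixed-precision Adam with the commonly used accounting of 2 bytes per parameter for fp16 weights, 2 bytes for fp16 gradients, and 12 bytes for optimizer state (fp32 master weights, momentum, variance). The function name and the 7B parameter count are illustrative and not taken from OpenRLHF; activations (which gradient checkpointing targets) are ignored.

```python
def training_memory_gib(num_params: int, offload_optimizer: bool = False) -> float:
    """Rough GPU memory for model states under mixed-precision Adam.

    Assumed bytes per parameter (a common accounting, not a measurement):
      2 (fp16 weights) + 2 (fp16 gradients)
      + 12 (fp32 master weights, momentum, variance).
    Activations and temporary buffers are ignored.
    """
    weights_and_grads = 4 * num_params
    # When offloaded, optimizer state lives outside GPU memory.
    optimizer_state = 0 if offload_optimizer else 12 * num_params
    return (weights_and_grads + optimizer_state) / 2**30


params = 7_000_000_000  # hypothetical 7B-parameter model
on_gpu = training_memory_gib(params)
offloaded = training_memory_gib(params, offload_optimizer=True)
print(f"all on GPU: {on_gpu:.1f} GiB, optimizer offloaded: {offloaded:.1f} GiB")
```

Under these assumptions the optimizer state accounts for three quarters of the model-state memory, which is why moving it off the GPU has such a large effect.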

Question 3

Why is fminference/flexllmgen a recommended memory offloading framework repository on GitHub?

Accepted Answer

Stores model parameters, attention cache, and hidden states across GPU, CPU, and disk to fit models larger than available GPU memory.
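One simple way to picture this kind of placement is a percentage split across the three tiers. The sketch below is a hypothetical policy that works at whole-layer granularity; `split_by_percent` and its parameters are made up for illustration and are not FlexLLMGen's API, which may place tensors differently.

```python
from typing import Dict


def split_by_percent(num_layers: int, gpu_pct: int, cpu_pct: int) -> Dict[str, range]:
    """Assign contiguous layer ranges to the gpu, cpu, and disk tiers.

    The disk tier receives whatever remains after gpu_pct and cpu_pct.
    """
    if gpu_pct < 0 or cpu_pct < 0 or gpu_pct + cpu_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    gpu_end = num_layers * gpu_pct // 100
    cpu_end = num_layers * (gpu_pct + cpu_pct) // 100
    return {
        "gpu": range(0, gpu_end),
        "cpu": range(gpu_end, cpu_end),
        "disk": range(cpu_end, num_layers),
    }


# Hypothetical 40-layer model: 20% on GPU, 50% in CPU RAM, the rest on disk.
placement = split_by_percent(num_layers=40, gpu_pct=20, cpu_pct=50)
for tier, layers in placement.items():
    print(tier, len(layers), "layers")
```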

Question 4

Why is fminference/flexgen a recommended memory offloading framework repository on GitHub?

Accepted Answer

Implements a mechanism to move model tensors between GPU memory, system RAM, and disk.

Question 5

Why is tiiny-ai/powerinfer a recommended memory offloading framework repository on GitHub?

Accepted Answer

Offloads model tensors and dense layers to GPU video memory (VRAM) to speed up computation.

Question 6

Why is predibase/lorax a recommended memory offloading framework repository on GitHub?

Accepted Answer

Optimizes throughput by asynchronously prefetching and offloading adapters between GPU and CPU memory.
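The general idea of asynchronous prefetching can be sketched with a background thread that loads the next adapter while the current request is being computed. Everything here is a stand-in: `load_adapter`, `run_batch`, and `serve` are hypothetical names, the "transfer" is a short sleep, and none of it reflects LoRAX's actual implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def load_adapter(name: str) -> dict:
    """Stand-in for copying an adapter from CPU to GPU memory (hypothetical)."""
    time.sleep(0.01)  # simulated transfer latency
    return {"name": name, "scale": len(name)}


def run_batch(adapter: dict, x: int) -> int:
    """Stand-in for a forward pass that uses the loaded adapter."""
    return x * adapter["scale"]


def serve(requests: list) -> list:
    """Process (adapter_name, input) pairs, prefetching the next adapter
    on a background thread while the current one is in use."""
    results = []
    if not requests:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(load_adapter, requests[0][0])
        for i, (_, x) in enumerate(requests):
            adapter = pending.result()  # wait for the prefetched adapter
            if i + 1 < len(requests):   # start loading the next one now
                pending = pool.submit(load_adapter, requests[i + 1][0])
            results.append(run_batch(adapter, x))
    return results


outputs = serve([("sql", 2), ("chat", 3), ("code", 5)])
print(outputs)
```

The load for request `i + 1` overlaps with the compute for request `i`, so transfer latency is hidden whenever compute takes at least as long as the transfer.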

Question 7

Why is vllm-project/llm-compressor a recommended memory offloading framework repository on GitHub?

Accepted Answer

Utilizes sequential onloading and disk offloading to quantize models that exceed available system memory.
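As a toy illustration of sequential onloading, the sketch below keeps every layer on disk and brings in one layer at a time, quantizes it, and writes it back before touching the next. The pickle files, the rounding "quantizer", and all function names are assumptions made for this example and are not llm-compressor's API.

```python
import pickle
import tempfile
from pathlib import Path


def quantize(weights: list) -> list:
    """Toy quantizer: round each weight to the nearest integer."""
    return [round(w) for w in weights]


def quantize_sequentially(layer_files: list) -> int:
    """Load one layer at a time, quantize it, write it back, then free it.

    Returns the peak number of layers held in memory at once.
    """
    resident, peak = 0, 0
    for path in layer_files:
        with open(path, "rb") as f:
            weights = pickle.load(f)  # onload a single layer
        resident += 1
        peak = max(peak, resident)
        with open(path, "wb") as f:
            pickle.dump(quantize(weights), f)  # offload the result to disk
        del weights  # release before loading the next layer
        resident -= 1
    return peak


with tempfile.TemporaryDirectory() as tmp:
    files = []
    for i in range(4):  # four hypothetical layers
        p = Path(tmp) / f"layer_{i}.pkl"
        p.write_bytes(pickle.dumps([i + 0.4, i + 0.6]))
        files.append(p)
    peak_layers = quantize_sequentially(files)
    first_layer = pickle.loads(files[0].read_bytes())

print("peak layers in memory:", peak_layers)
```

Because only one layer is resident at a time, peak memory scales with the largest layer rather than with the whole model.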

Question 8

Why is llm-d/llm-d a recommended memory offloading framework repository on GitHub?

Accepted Answer

Implements tiered cache offloading by moving memory blocks between GPU memory, host RAM, and shared storage for long-context workloads.
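A minimal sketch of tiered cache offloading, assuming a least-recently-used demotion policy: blocks start in the fastest tier, spill to the next tier when capacity is exceeded, and are promoted back on access. The class, tier names, and capacities are hypothetical and do not describe llm-d's internals.

```python
from collections import OrderedDict


class TieredCache:
    """Toy three-tier block cache with LRU demotion.

    Tier names and capacities are hypothetical; the slowest tier is unbounded.
    """

    def __init__(self, gpu_blocks: int, host_blocks: int):
        self.names = ["gpu", "host", "storage"]
        self.tiers = [OrderedDict(), OrderedDict(), OrderedDict()]
        self.capacity = [gpu_blocks, host_blocks, None]

    def _insert(self, level: int, key, block) -> None:
        tier = self.tiers[level]
        tier[key] = block
        tier.move_to_end(key)  # mark as most recently used
        cap = self.capacity[level]
        if cap is not None and len(tier) > cap:
            old_key, old_block = tier.popitem(last=False)  # evict the LRU block
            self._insert(level + 1, old_key, old_block)    # demote one tier down

    def put(self, key, block) -> None:
        for tier in self.tiers:
            tier.pop(key, None)  # drop any stale copy
        self._insert(0, key, block)

    def get(self, key):
        for tier in self.tiers:
            if key in tier:
                block = tier.pop(key)
                self._insert(0, key, block)  # promote to the fastest tier
                return block
        return None

    def where(self, key):
        for name, tier in zip(self.names, self.tiers):
            if key in tier:
                return name
        return None


cache = TieredCache(gpu_blocks=2, host_blocks=2)
for i in range(5):
    cache.put(f"block{i}", i)
print({f"block{i}": cache.where(f"block{i}") for i in range(5)})
```

Reading an old block with `cache.get("block0")` moves it back to the `gpu` tier and pushes the least recently used blocks down a level.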

Question 9

Why is rlinf/rlinf a recommended memory offloading framework repository on GitHub?

Accepted Answer

Manages the movement of weights, gradients, and optimizer states between memory tiers to prevent out-of-memory errors.

Awesome GitHub Repositories · Memory Offloading Frameworks

OpenRLHF/OpenRLHF

FMInference/FlexLLMGen

FMInference/FlexGen

Tiiny-AI/PowerInfer

predibase/lorax

vllm-project/llm-compressor

llm-d/llm-d

RLinf/RLinf
