# city96/comfyui-gguf

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/city96-comfyui-gguf).**

3,291 stars · 257 forks · Python · apache-2.0

## Links

- GitHub: https://github.com/city96/ComfyUI-GGUF
- awesome-repositories: https://awesome-repositories.com/repository/city96-comfyui-gguf.md

## Description

ComfyUI-GGUF is a memory optimizer and model loader for ComfyUI that runs large transformer-based generative models from quantized weights. It loads GGUF-formatted weights inside ComfyUI's node-based diffusion interface to reduce GPU memory consumption.

The project includes a quantization tool for converting standard model checkpoints into compressed binary formats and a tensor fixer to restore missing keys and correct architectures in binary model files. These utilities ensure that compressed models remain functional during inference on hardware with limited VRAM.
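As a rough illustration of what converting a checkpoint into a compressed format involves, the sketch below quantizes a flat list of weights in blocks of 32 values, storing one float scale and 32 signed 8-bit integers per block. This mirrors the general idea behind GGUF's `Q8_0` type but is not the project's conversion tool; the function names `quantize_blocks` and `dequantize_blocks` are hypothetical, and the sketch uses plain Python lists rather than a real tensor or file format.

```python
from typing import List, Tuple

BLOCK_SIZE = 32  # values per block; chosen to match Q8_0-style blocking (assumption)

Block = Tuple[float, List[int]]  # (scale, int8 values)


def quantize_blocks(weights: List[float]) -> List[Block]:
    """Quantize floats into blocks of (scale, int8 values) using absmax scaling."""
    if len(weights) % BLOCK_SIZE != 0:
        raise ValueError("weight count must be a multiple of BLOCK_SIZE")
    blocks: List[Block] = []
    for start in range(0, len(weights), BLOCK_SIZE):
        chunk = weights[start:start + BLOCK_SIZE]
        amax = max(abs(w) for w in chunk)
        # One scale per block maps the largest magnitude onto 127.
        scale = amax / 127.0 if amax > 0 else 1.0
        quants = [max(-127, min(127, round(w / scale))) for w in chunk]
        blocks.append((scale, quants))
    return blocks


def dequantize_blocks(blocks: List[Block]) -> List[float]:
    """Recover approximate float weights from quantized blocks."""
    return [scale * q for scale, quants in blocks for q in quants]


if __name__ == "__main__":
    original = [i / 10 - 3.2 for i in range(64)]
    restored = dequantize_blocks(quantize_blocks(original))
    worst = max(abs(a - b) for a, b in zip(original, restored))
    print(f"max absolute error: {worst:.5f}")
```

Each value is stored in one byte plus a small per-block overhead for the scale, at the cost of a rounding error of at most half a scale step per weight.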

The project targets low-memory inference by loading quantized diffusion models and text encoders. It recovers weight precision on the fly and maps the compressed tensors to the corresponding model layers, so the total memory footprint shrinks while the model stays usable for inference.
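The idea of on-the-fly precision recovery can be sketched as a layer that keeps its weights in 8-bit form and only reconstructs float values for the row it is currently multiplying. This is a minimal, framework-free illustration under stated assumptions: `QuantizedLinear` is a hypothetical class, the per-row scaling scheme is chosen for brevity, and the real project works with tensors inside ComfyUI rather than Python lists.

```python
from typing import List


class QuantizedLinear:
    """Hypothetical linear layer storing int8 weights and one scale per output row."""

    def __init__(self, weight: List[List[float]]) -> None:
        self.scales: List[float] = []
        self.qweight: List[List[int]] = []
        for row in weight:
            amax = max(abs(w) for w in row)
            scale = amax / 127.0 if amax > 0 else 1.0
            self.scales.append(scale)
            self.qweight.append([round(w / scale) for w in row])

    def forward(self, x: List[float]) -> List[float]:
        out: List[float] = []
        for scale, qrow in zip(self.scales, self.qweight):
            # Dequantize this row only for the duration of the dot product;
            # the float copy is discarded afterwards, so the full-precision
            # matrix never exists in memory all at once.
            row = [scale * q for q in qrow]
            out.append(sum(w * xi for w, xi in zip(row, x)))
        return out


if __name__ == "__main__":
    layer = QuantizedLinear([[0.5, -1.0], [2.0, 0.25]])
    print(layer.forward([1.0, 2.0]))  # close to [-1.5, 2.5], up to rounding error
```

The design choice shown here is to pay a small amount of extra compute on every forward pass in exchange for holding only the compact representation between calls.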

## Tags

### Development Tools & Productivity

- [ComfyUI Custom Node Suites](https://awesome-repositories.com/f/development-tools-productivity/comfyui-custom-node-suites.md) — Integrates compressed GGUF diffusion models and text encoders as custom nodes within ComfyUI workflows.

### Artificial Intelligence & ML

- [Diffusion Model Memory Optimizers](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-models/diffusion-model-memory-optimizers.md) — Provides a framework for running large transformer-based generative models using quantized weights.
- [Diffusion Models](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-models/diffusion-models.md) — Initializes and runs generative image synthesis models based on quantized diffusion architectures. ([source](https://github.com/city96/ComfyUI-GGUF/blob/main/README.md))
- [GGUF Execution](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-optimization-and-inference/serving-and-runtime/large-language-model-optimization/model-inference-optimizations/gguf-execution.md) — Implements optimized runtime execution and loading for models using the GGUF quantization format.
- [Model Quantization Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-optimization-and-inference/serving-and-runtime/model-quantization-tools.md) — Handles on-the-fly precision recovery of quantized weights to reduce total memory usage during inference. ([source](https://github.com/city96/ComfyUI-GGUF#readme))
- [Model Loading](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/data-and-checkpointing/model-loading.md) — Provides mechanisms for efficiently loading model weights from portable binary formats into memory. ([source](https://github.com/city96/ComfyUI-GGUF/blob/main/.gitignore))
- [GGUF Weight Quantization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-weight-converters/gguf-weight-quantization.md) — Provides a utility to convert standard model checkpoints into quantized GGUF artifacts for lower memory usage.
- [Quantized Model Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/quantized-inference-runtimes/weight-quantization/quantized-model-implementations.md) — Implements low-precision weight formats to reduce inference memory requirements on limited hardware. ([source](https://github.com/city96/ComfyUI-GGUF/blob/main/pyproject.toml))
- [Tensor Shape Inferences](https://awesome-repositories.com/f/artificial-intelligence-ml/dynamic-tensor-shapes/tensor-shape-inferences.md) — Corrects tensor shapes and restores missing keys in model architectures to ensure functional inference. ([source](https://github.com/city96/ComfyUI-GGUF/tree/main/tools))
- [Transformer Weight Loading](https://awesome-repositories.com/f/artificial-intelligence-ml/large-scale-model-training/vision-transformer-pre-training/pre-trained-model-checkpoints/vision-model-weight-loading/transformer-weight-loading.md) — Optimizes the loading of compressed weights specifically for transformer architectures to reduce initialization overhead.
- [Ternary Weight Optimizations](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/model-fine-tuning-adaptation/language-model-training/ternary-weight-optimizations.md) — Reduces the memory footprint of transformer architectures to enable faster operation on consumer GPUs.
- [Model Architecture Key Restoration](https://awesome-repositories.com/f/artificial-intelligence-ml/model-architecture-key-restoration.md) — Restores missing architectural keys in model binaries to ensure they remain functional during inference.
- [GGUF Format Conversions](https://awesome-repositories.com/f/artificial-intelligence-ml/model-format-converters/gguf-format-conversions.md) — Transforms standard model checkpoints into the GGUF format for quantized local inference. ([source](https://github.com/city96/ComfyUI-GGUF/tree/main/tools))
- [Weight Tensor Mapping](https://awesome-repositories.com/f/artificial-intelligence-ml/model-optimization/inference-deployment/model-deployment-toolkits/hardware-agnostic-deployment/weight-tensor-mapping.md) — Maps compressed tensors to corresponding model layers to ensure hardware compatibility across various precision levels.
- [Model Quantization](https://awesome-repositories.com/f/artificial-intelligence-ml/model-quantization.md) — Reduces model weight precision to decrease the memory footprint and accelerate inference performance. ([source](https://github.com/city96/ComfyUI-GGUF/tree/main/tools))
- [Checkpoint Format Transpilations](https://awesome-repositories.com/f/artificial-intelligence-ml/model-weight-export-formats/checkpoint-format-transpilations.md) — Provides utilities to convert standard model checkpoints into unified binary structures for portable distribution.
- [Model Weight Repair Tools](https://awesome-repositories.com/f/artificial-intelligence-ml/model-weight-repair-tools.md) — Includes a tensor fixer to restore missing architectural keys and correct tensor structures in binary model files.

### Data & Databases

- [Weight Dequantization](https://awesome-repositories.com/f/data-databases/in-memory-data-stores/memory-isolated-decryption/on-the-fly-decryption/weight-dequantization.md) — Enables high-performance inference by recovering weight precision on-the-fly from compressed memory footprints.
- [Transformer Text Encoders](https://awesome-repositories.com/f/data-databases/data-categorization/categorical-encoders/cardinality-based-text-encoders/transformer-text-encoders.md) — Loads compressed transformer-based text encoders to decrease the total memory required during initialization. ([source](https://github.com/city96/ComfyUI-GGUF#readme))

### Software Engineering & Architecture

- [Low-VRAM Inference](https://awesome-repositories.com/f/software-engineering-architecture/memory-layout-optimizations/bit-packed-storage/low-bit-inference-engines/quantized-model-persistence/low-vram-inference.md) — Runs large generative models on hardware with limited VRAM by loading quantized GGUF weights.