This is a classifier-guided diffusion framework for high-fidelity image generation. It implements a cascaded pipeline in which a base diffusion model generates an image and a dedicated upsampler then increases its resolution in stages. During sampling, gradients from a classifier steer the reverse diffusion process toward higher-quality outputs.
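To make the guidance step concrete, here is a minimal sketch of one common formulation: the mean of each reverse-step Gaussian is shifted by the classifier gradient, scaled by the step variance and a guidance weight. Whether this framework uses exactly this form is an assumption. The function name `guided_mean`, the `guidance_scale` parameter, and the toy values are hypothetical and stand in for real model and classifier outputs.

```python
def guided_mean(mean, variance, classifier_grad, guidance_scale=1.0):
    """Shift a reverse-step Gaussian mean along the classifier gradient.

    mean, variance: per-pixel parameters of p(x_{t-1} | x_t) from the diffusion model.
    classifier_grad: gradient of log p(y | x_t) with respect to x_t.
    guidance_scale: hypothetical weight; 0 disables guidance.
    """
    return [
        m + guidance_scale * v * g
        for m, v, g in zip(mean, variance, classifier_grad)
    ]


# Toy values (assumptions) standing in for real model and classifier outputs.
mean = [0.0, 1.0]
variance = [0.5, 0.5]
classifier_grad = [2.0, -2.0]

shifted = guided_mean(mean, variance, classifier_grad, guidance_scale=1.0)
print(shifted)
```

In this formulation, a larger `guidance_scale` pushes samples further toward regions the classifier favors, typically at the cost of sample diversity.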
The framework provides tools for training diffusion models from scratch across multiple distributed processes with gradient accumulation, and for training the classifier models that supply gradient-based guidance during sampling. It supports both unconditional generation and classifier-guided synthesis, and includes a diffusion-based upsampling module for increasing image resolution.
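Gradient accumulation can be illustrated with a toy example. The sketch below is not the framework's training loop: it uses a hypothetical one-parameter model `w * x` with a squared-error loss, and it shows that summing suitably scaled microbatch gradients reproduces the full-batch gradient. All function names and data are assumptions made for illustration.

```python
def microbatch_grad_sum(w, xs, ys):
    """Sum of per-example gradients of (w * x - y)^2 with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


def accumulated_gradient(w, xs, ys, microbatch_size):
    """Accumulate the mean-loss gradient over microbatches before any update."""
    n = len(xs)
    total = 0.0
    for start in range(0, n, microbatch_size):
        stop = start + microbatch_size
        # Divide by the full batch size so the accumulated sum is the batch mean.
        total += microbatch_grad_sum(w, xs[start:stop], ys[start:stop]) / n
    return total


# Toy data (assumption): targets follow y = 2x, current weight is 1.0.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = 1.0

full_batch = accumulated_gradient(w, xs, ys, microbatch_size=len(xs))
accumulated = accumulated_gradient(w, xs, ys, microbatch_size=2)
print(full_batch, accumulated)
```

In a distributed setting, each process would typically run this accumulation on its own shard of the batch, and the gradients would be averaged across processes before the optimizer step.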
The system is built around a noise-prediction denoising objective with a timestep-embedded U-Net backbone, modeling the diffusion process as a discrete-time Markov chain of Gaussian transitions. Documentation covers model training, classifier training, and sampling from both unconditional and guided models.
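As a rough illustration of the noise-prediction objective, the sketch below builds a discrete noise schedule, draws a noised sample from the closed-form Gaussian marginal of the forward chain, and scores a predictor with mean squared error against the true noise. The linear schedule and its endpoint values are assumptions, not settings taken from this framework. The zero predictor is a placeholder for the timestep-embedded U-Net, and all function names are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances beta_1..beta_T (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, alpha_bar_t, noise):
    """Sample x_t from N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    signal = math.sqrt(alpha_bar_t)
    sigma = math.sqrt(1.0 - alpha_bar_t)
    return [signal * x + sigma * e for x, e in zip(x0, noise)]


def noise_prediction_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true noise."""
    squared = [(p - e) ** 2 for p, e in zip(predicted_noise, true_noise)]
    return sum(squared) / len(squared)


rng = random.Random(0)
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

x0 = [0.5, -0.25, 1.0]                      # toy "image" of three pixels
t = rng.randrange(len(betas))               # random timestep index
noise = [rng.gauss(0.0, 1.0) for _ in x0]
x_t = q_sample(x0, alpha_bars[t], noise)

# A real model would predict the noise from (x_t, t); a zero predictor stands in.
predicted = [0.0] * len(x_t)
loss = noise_prediction_loss(predicted, noise)
print(t, loss)
```

Because `alpha_bars` shrinks toward zero as the timestep grows, late timesteps yield samples dominated by noise, which is what lets sampling start from pure Gaussian noise and run the chain in reverse.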