2 repositories
Tools for defining regional bounding boxes and attention weights to guide the spatial output of diffusion models.
Distinct from Diffusion Models: entries in that category focus on the models themselves or on model management, not on the logic that controls spatial layout.
Explore 2 GitHub repositories matching Artificial Intelligence & ML · Diffusion Layout Controllers.
Omost is a system of software components designed for iterative image refinement, regional layout control, and the optimization of text-to-image embedding processes. It functions as a diffusion model layout controller and an engine that uses large language models to generate executable code for precise control over image composition. The project features a conversational image editor that allows for the refinement of visual content through natural language instructions and automated code execution. It distinguishes itself through a text embedding optimizer that organizes sub-prompts into tree
Defines regional bounding boxes and attention scores to guide the generation process of diffusion models.
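To make the idea of "regional bounding boxes and attention scores" concrete, here is a minimal sketch of how boxes and weights could be turned into per-region weight maps over a latent grid. It is an illustration under stated assumptions, not Omost's actual API: the `Region` class, the `region_weight_maps` function, the normalized coordinate convention, and the per-cell normalization are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical region spec: a box in normalized [0, 1) coordinates plus a weight."""
    prompt: str
    x0: float
    y0: float
    x1: float
    y1: float
    weight: float = 1.0


def region_weight_maps(regions, height, width):
    """Return one height x width grid per region.

    A cell gets the region's weight if its center falls inside the box.
    Where boxes overlap, weights are normalized to sum to 1 per cell
    (an assumption of this sketch; other blending rules are possible).
    """
    maps = []
    for r in regions:
        grid = [[0.0] * width for _ in range(height)]
        for row in range(height):
            cy = (row + 0.5) / height  # cell center, normalized
            for col in range(width):
                cx = (col + 0.5) / width
                if r.x0 <= cx < r.x1 and r.y0 <= cy < r.y1:
                    grid[row][col] = r.weight
        maps.append(grid)

    # Normalize across regions so overlapping boxes share each cell.
    for row in range(height):
        for col in range(width):
            total = sum(m[row][col] for m in maps)
            if total > 0:
                for m in maps:
                    m[row][col] /= total
    return maps


if __name__ == "__main__":
    regions = [
        Region("blue sky", 0.0, 0.0, 1.0, 0.5, weight=1.0),
        Region("bright sun", 0.5, 0.0, 1.0, 0.5, weight=3.0),
    ]
    sky, sun = region_weight_maps(regions, height=4, width=4)
    for row in sun:
        print(row)
```

In a real controller, maps like these would typically be used to scale how strongly each sub-prompt influences each spatial location during generation; how that is wired into a given model is specific to the project and not shown here.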
IP-Adapter is a framework for conditioning pretrained text-to-image diffusion models to use image prompts as visual guides. It serves as a text-to-image model extension that transforms a text-based diffusion model to accept and process image inputs as primary generation sources. The system implements identity preservation to maintain consistent facial features across multiple outputs using a reference photo. It also enables style transfer workflows to produce image variations that preserve the artistic characteristics of a source image. Capabilities cover multi-modal prompting, including the
Combines image prompts with structural constraints to control the spatial composition of generated art.