6 Repos
Using spatial constraints like ControlNet, depth maps, and LoRAs to guide diffusion model output.
Distinct from Diffusion Models: existing candidates are either generic models or specialized robotics planning, not spatial image guidance.
Explore 6 awesome GitHub repositories matching Artificial Intelligence & ML · Diffusion Structural Control.
ComfyUI is a modular generative AI workflow orchestrator and node-based GUI for designing and executing complex diffusion model pipelines. It functions as both a visual interface for building generative logic graphs and a programmable backend API that exposes diffusion model operations for external integration. The system distinguishes itself through a graph-based execution model that supports differential workflow execution, re-running only modified nodes to reduce computation. It features dynamic model offloading to manage memory between system RAM and GPU VRAM and utilizes metadata-embedde
Guides the composition of generated media using ControlNet, LoRAs, and depth maps for precise spatial layouts.
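The differential workflow execution mentioned in the ComfyUI description (re-running only modified nodes) can be pictured with a small caching graph. This is a minimal sketch under stated assumptions, not ComfyUI's actual implementation: the `Graph` class, the signature-hashing scheme, and the node functions `load_prompt`, `encode`, and `sample` are all hypothetical names invented for illustration.

```python
import hashlib
import json


class Graph:
    """Toy node graph (hypothetical) that caches outputs and re-runs only changed nodes."""

    def __init__(self):
        self.nodes = {}     # name -> (function, params, input node names)
        self.cache = {}     # name -> (signature, output)
        self.executed = []  # nodes actually run during the last execute() call

    def add(self, name, fn, params=None, inputs=()):
        self.nodes[name] = (fn, params or {}, tuple(inputs))

    def _signature(self, name):
        # A signature covers the node's own params plus its upstream signatures,
        # so editing a node invalidates it and everything downstream of it.
        fn, params, inputs = self.nodes[name]
        payload = json.dumps(
            [fn.__name__, params, [self._signature(i) for i in inputs]],
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode()).hexdigest()

    def _run(self, name):
        sig = self._signature(name)
        cached = self.cache.get(name)
        if cached is not None and cached[0] == sig:
            return cached[1]  # unchanged node: reuse the cached output
        fn, params, inputs = self.nodes[name]
        args = [self._run(i) for i in inputs]
        out = fn(*args, **params)
        self.cache[name] = (sig, out)
        self.executed.append(name)
        return out

    def execute(self, target):
        self.executed = []
        return self._run(target)


# Stand-in node functions; real nodes would load models, encode text, sample, etc.
def load_prompt(text):
    return text


def encode(prompt, scale):
    return f"{prompt}|scale={scale}"


def sample(cond, steps):
    return f"image({cond}, steps={steps})"


graph = Graph()
graph.add("prompt", load_prompt, {"text": "a red barn"})
graph.add("encode", encode, {"scale": 7.5}, inputs=["prompt"])
graph.add("sample", sample, {"steps": 20}, inputs=["encode"])

first = graph.execute("sample")
first_run = list(graph.executed)   # every node runs on the first pass

graph.add("sample", sample, {"steps": 30}, inputs=["encode"])  # edit one node
second = graph.execute("sample")
second_run = list(graph.executed)  # only the edited node runs again
```

In this sketch, changing the sampler's `steps` leaves the cached `prompt` and `encode` outputs untouched, whereas editing `prompt` would change every downstream signature and force all three nodes to re-run.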
Flux is a diffusion model inference engine designed for text-to-image generation and image-to-image manipulation. It provides a system for executing open-weight models to transform natural language descriptions into visual imagery or to modify existing images. The project distinguishes itself through a flow-matching framework for image generation and a structural image controller. This controller allows for guided synthesis by using depth maps and Canny edge detection to constrain the geometry and composition of the output. The toolkit covers a broad range of image editing capabilities, incl
Uses spatial constraints like depth maps and Canny edges to guide diffusion model outputs.
This project is an extension for Stable Diffusion that provides an image-to-image control framework. It serves as a multi-control constraint manager and structural data preprocessor, allowing users to guide the layout and composition of generated images through spatial maps and structural constraints. The system enables multi-constraint image generation by combining several different control inputs to enforce multiple stylistic or spatial rules within a single generation pass. It provides tools for visual image referencing and precise geometric or anatomical templating to ensure generated ima
Extracts structural data and creates control maps to guide the spatial layout of AI-generated images.
DiffusionBee is a Stable Diffusion desktop client for macOS that functions as an AI image generator and editor. It allows for the local generation of images from text prompts and the management of diffusion models without requiring external cloud services or technical setup. The application includes a local diffusion model manager for importing and switching between custom-trained model files to achieve specific artistic styles. It also features a system for tracking generation history and uploading assets to a public gallery. The software covers several image synthesis and manipulation work
Provides structural guidance for image generation using auxiliary spatial data like depth maps and ControlNet.
imaginAIry is a system for generating and refining images and videos using diffusion models. It operates as a web-based server that triggers generation requests through standard API calls, allowing for the creation of visuals and video sequences from text prompts or existing files. The project provides a suite for AI image editing and upscaling, enabling the modification of visuals through natural language instructions and super-resolution tools to increase detail and image size. The system includes capabilities for structural image control using depth maps, edge maps, and body poses to main
Injects spatial information like depth and edge maps into the diffusion process to maintain precise geometric layouts.
This project is a neural network extension for Stable Diffusion that provides spatial control and geometric consistency for text-to-image generation. It functions as an image structure controller and conditioning tool, enabling the use of external inputs to guide the layout and geometry of generated imagery. The framework is distinguished by its ability to transform input images into structural guides through various preprocessors. These include the extraction of depth maps, normal maps, and human pose landmarks, as well as the detection of Canny edges, anime lineart, and straight architectur
Provides a framework for guiding diffusion model output using spatial constraints like depth maps and semantic segmentation.
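The preprocessing step described above, turning an input image into a structural guide, can be illustrated with a toy edge map. This is a simplified sketch, assuming a grayscale image stored as a nested list: it thresholds central-difference gradients and is only a stand-in for Canny edge detection (no smoothing, non-maximum suppression, or hysteresis). The function name `edge_map` and the default `threshold` are hypothetical and do not come from any of the listed projects.

```python
def edge_map(gray, threshold=50):
    """Return a binary edge map (0 or 255) from a 2D list of grayscale values.

    Simplified gradient thresholding, not full Canny edge detection.
    Border pixels are left at 0 because they lack a full neighbourhood.
    """
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gray[y][x + 1] - gray[y][x - 1]  # horizontal central difference
            gy = gray[y + 1][x] - gray[y - 1][x]  # vertical central difference
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                edges[y][x] = 255
    return edges


# 5x5 test image: dark on the left, bright on the right.
image = [[0, 0, 0, 255, 255] for _ in range(5)]
control = edge_map(image)
```

The resulting `control` map marks the vertical boundary between the dark and bright regions. A map of this kind is the sort of auxiliary spatial input that a structural controller would take alongside the text prompt.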