30 open-source projects similar to divamgupta/diffusionbee-stable-diffusion-ui, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Diffusionbee Stable Diffusion Ui alternative.
This project is a plugin for Krita that integrates Stable Diffusion image generation and editing tools directly into the painting interface. It functions as a remote diffusion backend client, bridging the digital canvas to local or remote servers to handle the computation required for AI image generation. The system distinguishes itself through a real-time painting interface that translates brushstrokes into generated imagery as the artist works. It acts as a structural orchestrator, using sketches, depth maps, and poses to maintain precise composition, and provides a generative inpainting to
ComfyUI is a modular generative AI workflow orchestrator and node-based GUI for designing and executing complex diffusion model pipelines. It functions as both a visual interface for building generative logic graphs and a programmable backend API that exposes diffusion model operations for external integration. The system distinguishes itself through a graph-based execution model that supports differential workflow execution, re-running only modified nodes to reduce computation. It features dynamic model offloading to manage memory between system RAM and GPU VRAM and utilizes metadata-embedde
MochiDiffusion is a local client for Stable Diffusion that functions as an AI image generation studio. It provides a workspace for performing text-to-image, image-to-image, and inpainting tasks, enabling the production of high-resolution images offline using local hardware and neural engine acceleration. The project includes a local model manager for importing, organizing, and converting machine learning models into compatible formats for offline execution. It features a ControlNet integration tool to guide structural composition and spatial layout, alongside a dedicated image upscaler that u
This project is a neural network extension for Stable Diffusion that provides spatial control and geometric consistency for text-to-image generation. It functions as an image structure controller and conditioning tool, enabling the use of external inputs to guide the layout and geometry of generated imagery. The framework is distinguished by its ability to transform input images into structural guides through various preprocessors. These include the extraction of depth maps, normal maps, and human pose landmarks, as well as the detection of Canny edges, anime lineart, and straight architectur
Flux is a diffusion model inference engine designed for text-to-image generation and image-to-image manipulation. It provides a system for executing open-weight models to transform natural language descriptions into visual imagery or to modify existing images. The project distinguishes itself through a flow-matching framework for image generation and a structural image controller. This controller allows for guided synthesis by using depth maps and Canny edge detection to constrain the geometry and composition of the output. The toolkit covers a broad range of image editing capabilities, incl
Stable Diffusion Web UI is a browser-based interface for generating, editing, and upscaling images and videos using latent diffusion models. It functions as a text-to-image generator, an AI image editor, and a tool for increasing image resolution and clarity. The system includes capabilities for custom model training, specifically allowing the creation of textual inversion embeddings to teach a model new concepts and visual styles from user photos. It also provides tools for AI video production, generating short clips from text prompts. The software covers image-to-image transformation, imag
sd-scripts is a suite of utilities designed for fine-tuning generative models, preprocessing datasets, and converting model weights. It provides a collection of scripts for executing Stable Diffusion training through methods such as DreamBooth, textual inversion, and full fine-tuning, alongside a framework for creating and managing Low-Rank Adaptation weights. The project features specialized capabilities for model weight conversion between different architectures and precision formats. It includes tools for merging adaptation weights into base models, extracting weights from trained models,
Sygil-webui is a web interface for Stable Diffusion latent diffusion models, providing a creative suite for text-to-image and text-to-video synthesis. It functions as an image generation tool and a latent diffusion image editor, allowing users to create visuals and video sequences from textual descriptions. The project includes a dedicated model training interface for creating custom textual inversion embeddings, which introduces specific new concepts or styles into the diffusion models. It also features specialized tools for generative image editing, including mask-based inpainting, image-to
Diffusers is a PyTorch-based library and generative AI framework used to build, train, and deploy diffusion pipelines for producing multi-modal media. It provides a suite of tools for generating images, video, and audio from natural language descriptions, as well as specialized systems for text-to-image generation. The project differentiates itself through a modular architecture that separates noise schedulers, pretrained model blocks, and pipeline compositions. This structure allows for the construction of custom generation workflows and the ability to swap individual components of the diffu
Lama Cleaner is an AI-powered image editing application focused on inpainting, object removal, and generative filling. It provides a suite of tools for erasing unwanted elements from photos and filling the resulting gaps using generative artificial intelligence. The project includes specialized capabilities for image outpainting to extend borders, background removal through object segmentation, and face restoration to fix visual defects. It also features an image upscaler to increase resolution and clarity via super-resolution AI, as well as a Stable Diffusion-based editor for replacing speci
ComfyUI-nunchaku is a 4-bit diffusion inference engine and a set of nodes for running low-precision quantized diffusion models within ComfyUI visual workflows. It provides a backend that reduces memory overhead and increases generation speed for transformer models. The project includes specialized tools for identity-preserving generation and an image-to-image guidance toolkit that uses depth maps and reference images. It also features a multimodal visual question answering implementation and a utility for merging multiple quantized model files into single unified files. The engine covers a b
SD.Next is an all-in-one web interface and multi-backend inference engine for generating, editing, and processing images and videos using diffusion models. It functions as a comprehensive tool for diffusion model management and an automated image processing pipeline for bulk operations. The project is distinguished by its hardware-backend abstraction layer, which provides automatic detection and acceleration for NVIDIA CUDA, AMD ROCm, Intel OpenVINO, and DirectML. It features a headless generative API and a programmatic command interface, allowing users to trigger tasks via REST API or CLI wi
This project is an extension for Stable Diffusion that provides an image-to-image control framework. It serves as a multi-control constraint manager and structural data preprocessor, allowing users to guide the layout and composition of generated images through spatial maps and structural constraints. The system enables multi-constraint image generation by combining several different control inputs to enforce multiple stylistic or spatial rules within a single generation pass. It provides tools for visual image referencing and precise geometric or anatomical templating to ensure generated ima
This project provides a cloud-based notebook configuration for deploying a Stable Diffusion web interface. It functions as a specialized environment for image generation, incorporating a model trainer for fine-tuning weights and creating training datasets. The system emphasizes infrastructure persistence by saving software installations and model files to cloud storage, avoiding repetitive setups between sessions. It uses a tunnel-based interface to expose the web dashboard to a public URL for remote interaction. The project covers end-to-end AI workflows, including dataset preparation and t
IF is a text-to-image diffusion system that translates natural language descriptions into visual imagery. The project provides a generative pipeline for creating images, an inpainting tool for modifying specific image sections, and a super-resolution upscaler to increase pixel density and clarity. The system includes a concept fine-tuning framework that allows for the teaching of new visual concepts by updating a small set of parameters. It also supports image style transfer to apply the aesthetic characteristics of a reference image to a new output.
Dream Textures is a Stable Diffusion integration for Blender that provides tools for text-to-image generation, depth projection, and node-based processing within a 3D environment. It functions as an AI texture generator capable of producing image textures and concept art from text prompts and scene renders. The system features a depth-to-image projection tool that maps generated imagery onto 3D models using depth data for spatial alignment. It also includes a node-based AI image processor for creating procedural visual effects and a dedicated toolset for AI-assisted inpainting and outpainting
Easy Diffusion is a desktop application that generates images from text descriptions using AI. It provides a straightforward interface for creating visuals by simply typing what you want to see, with the ability to preview images as they are being generated. The application supports loading custom AI models, allowing users to switch between different artistic capabilities and styles. It includes tools for editing existing images through text prompts or masks, applying predefined artistic styles like "Realistic" or "Pencil Sketch", and upscaling or correcting facial details after generation. F
This project is a comprehensive generative AI prompt library and image generation toolkit designed to streamline the creation of professional visual assets. It provides a curated collection of structured text instructions and templates that guide generative models to produce specific creative outputs, ranging from marketing materials to complex infographics. The toolkit distinguishes itself through specialized capabilities for maintaining visual continuity and applying consistent aesthetic transformations. It features reference-based identity preservation to anchor facial features across mult
Neural Compressor is a deep learning model compression toolkit and AI inference acceleration engine. It functions as an automated model quantization tool and hardware-aware model compiler designed to reduce the memory footprint of neural networks and decrease execution latency. The project provides specialized frameworks for optimizing large language models, utilizing weight-only quantization and hardware-specific kernels to improve the operational efficiency of generative AI workloads. It maps neural network operators to specialized CPU and GPU vector instructions to accelerate model executi
This project is a plugin for Photoshop that integrates Stable Diffusion backends, allowing users to generate and edit AI images directly within the graphic design workspace. It serves as an interface bridge between the image editor and remote GPU workers to perform generative tasks without requiring local hardware power. The plugin specifically provides connection layers for Automatic1111 and ComfyUI backends. This enables the execution of text-to-image generation, inpainting, and outpainting operations on the design canvas by communicating with these external engines via an API. The system
ComfyUI-Easy-Use is a custom node suite and workflow optimizer designed to simplify Stable Diffusion generation pipelines. It provides a set of integrated tools to reduce visual clutter and streamline the process of creating images from text and existing image references. The project distinguishes itself through a pipeline manager that consolidates models, conditioning, and latents into unified data pipes, eliminating complex wiring in the node graph. It also introduces a logical operator set that enables conditional if-else branching and for-loop structures directly within the visual program
This repository is a collection of node-based pipeline configurations, examples, and templates for generating AI media. It provides a workflow library and a curated gallery of blueprints designed for creating images, videos, and 3D assets using diffusion models. The project specifically offers a set of pre-configured node graphs for implementing advanced image generation and refinement techniques, with a focus on Stable Diffusion workflows. These examples demonstrate how to interconnect processing nodes to define complex generative logic without writing code. The available templates cover a
This is a PyTorch-based implementation of diffusion models for synthesizing photorealistic images and video. It provides a framework for text-to-image and text-to-video generation, as well as unconditional image synthesis. The system utilizes a cascading diffusion pipeline to produce high-resolution imagery by passing low-resolution outputs through a sequence of super-resolution models. It also includes capabilities for image inpainting, allowing the reconstruction of masked or missing regions of visual media guided by surrounding context and text prompts. The project includes tools for diff
Nebullvm is an AI inference accelerator, GPU resource orchestrator, and performance optimization library for large language models. It functions as an optimization layer designed to lower operational costs by aligning model execution with underlying hardware architectures. The system maximizes cluster efficiency through real-time dynamic partitioning and elastic quotas for shared hardware resources. It employs alignment methods and techniques to reduce the hardware and data requirements necessary for tuning large language models. The project covers broad capability areas including AI infrast
StableCascade is a generative AI system and latent diffusion framework designed for text-to-image synthesis and image-to-image transformations. It utilizes a multi-stage cascade architecture that encodes and decodes images via a latent space to produce high-fidelity visual imagery. The system includes a cascade diffusion pipeline for controlling image structure through inpainting, outpainting, and super-resolution. It also provides a toolkit for image-to-image generation and the creation of image variations using embeddings. The framework supports model optimization through low-rank adaptati
Learning-Prompt is a collection of educational resources and step-by-step guides designed for mastering large language model interaction and text-to-image tools. It provides a guided course on prompt engineering for large language models alongside tutorials for creating visual content with generative AI. The project utilizes a curated curriculum that organizes material into sequential lessons and modular tracks. Instruction is delivered through step-by-step tutorials and an iterative framework of drafting, testing, and refining prompts, using side-by-side comparisons of raw and optimized exam
This project is a containerized deployment for running Stable Diffusion web interfaces. It provides a portable runtime for generative AI that manages dependencies and hardware acceleration to enable text-to-image generation and image-to-image transformations via a browser-based interface. The system uses hardware-specific image tags to support both GPU-accelerated synthesis and CPU-only execution. It ensures environment isolation across different operating systems while utilizing bind-mount data persistence to keep heavy model weights and generated outputs on the host machine. The deployment
mmagic is a multimodal training pipeline and framework for generative AI, focusing on visual synthesis and restoration. It provides the infrastructure to build and train models for tasks such as text-to-image and text-to-video generation, 3D-aware content synthesis, and high-fidelity image translation using diffusion models and generative adversarial networks. The project distinguishes itself through specialized capabilities for generative model personalization, including techniques for fine-tuning subjects and styles. It also supports advanced visual manipulations such as latent space interp
TaskMatrix is a visual language model orchestration framework and modular visual pipeline designed to coordinate disparate foundation models. It functions as a multi-model workflow coordinator that sequences visual and textual models through logic paths to handle image processing tasks without requiring additional training. The system integrates large language models with visual foundation models to enable the exchange of image data during interactive chat sessions. It utilizes template-based orchestration to chain specialized models together for complex visual tasks. The framework supports