30 open-source projects similar to carson-katri/dream-textures, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Dream Textures alternative.
This project is a plugin for Krita that integrates Stable Diffusion image generation and editing tools directly into the painting interface. It functions as a remote diffusion backend client, bridging the digital canvas to local or remote servers to handle the computation required for AI image generation. The system distinguishes itself through a real-time painting interface that translates brushstrokes into generated imagery as the artist works. It acts as a structural orchestrator, using sketches, depth maps, and poses to maintain precise composition, and provides a generative inpainting to
ComfyUI is a modular generative AI workflow orchestrator and node-based GUI for designing and executing complex diffusion model pipelines. It functions as both a visual interface for building generative logic graphs and a programmable backend API that exposes diffusion model operations for external integration. The system distinguishes itself through a graph-based execution model that supports differential workflow execution, re-running only modified nodes to reduce computation. It features dynamic model offloading to manage memory between system RAM and GPU VRAM and utilizes metadata-embedde
ComfyUI-3D-Pack is a suite of custom nodes for ComfyUI that enables 3D asset generation and rendering within a node-based workflow. It provides a set of tools for reconstructing textured three-dimensional meshes and volumetric scenes from single images, multi-view images, or text prompts. The system includes a Gaussian splatting generator for creating high-fidelity volumetric 3D scene representations and a multi-view image generator to produce consistent image sets for reconstruction. It also features a single image 3D mesh tool to build geometry from a single 2D source. The toolset covers 3
ComfyUI-nunchaku is a 4-bit diffusion inference engine and a set of nodes for running low-precision quantized diffusion models within ComfyUI visual workflows. It provides a backend that reduces memory overhead and increases generation speed for transformer models. The project includes specialized tools for identity-preserving generation and an image-to-image guidance toolkit that uses depth maps and reference images. It also features a multimodal visual question answering implementation and a utility for merging multiple quantized model files into single unified files. The engine covers a b
sd-scripts is a suite of utilities designed for fine-tuning generative models, preprocessing datasets, and converting model weights. It provides a collection of scripts for executing Stable Diffusion training through methods such as DreamBooth, textual inversion, and full fine-tuning, alongside a framework for creating and managing Low-Rank Adaptation weights. The project features specialized capabilities for model weight conversion between different architectures and precision formats. It includes tools for merging adaptation weights into base models, extracting weights from trained models,
Open-Higgsfield-AI is a generative AI content studio and visual workflow orchestrator. It provides a unified interface for creating photorealistic images and videos, utilizing a node-based editor to chain multiple image, video, and audio models into automated content pipelines. The system functions as an AI video animation tool and local GPU inference engine, allowing users to run generative models on local hardware or remote servers. It includes specialized capabilities for audio-driven lip synchronization and cinematic camera controls to adjust virtual lens and focal settings. The platform
BasicSR is a PyTorch-based image restoration toolbox and framework designed for training and deploying deep learning models to upscale, denoise, and deblur images and videos. It serves as a comprehensive system for image super-resolution and video quality restoration, providing the necessary infrastructure to recover fine visual details and increase pixel density. The project distinguishes itself through specialized toolkits for facial image enhancement and high-fidelity face synthesis, as well as a dedicated video quality restoration suite that utilizes deformable convolutions and generative
ComfyUI is a node-based generative AI orchestration engine designed for constructing, testing, and executing complex image and video synthesis pipelines. By utilizing a directed acyclic graph execution model, the platform allows users to build reproducible workflows through modular, interconnected processing blocks without requiring manual code implementation. It serves as both a local environment for high-performance model inference and a production-ready server for deploying generative capabilities. The platform distinguishes itself through its focus on workflow portability and extensibilit
This repository is a collection of node-based pipeline configurations, examples, and templates for generating AI media. It provides a workflow library and a curated gallery of blueprints designed for creating images, videos, and 3D assets using diffusion models. The project specifically offers a set of pre-configured node graphs for implementing advanced image generation and refinement techniques, with a focus on Stable Diffusion workflows. These examples demonstrate how to interconnect processing nodes to define complex generative logic without writing code. The available templates cover a
DiffusionBee is a Stable Diffusion desktop client for macOS that functions as an AI image generator and editor. It allows for the local generation of images from text prompts and the management of diffusion models without requiring external cloud services or technical setup. The application includes a local diffusion model manager for importing and switching between custom trained model files to achieve specific artistic styles. It also features a system for tracking generation history and uploading assets to a public gallery. The software covers several image synthesis and manipulation work
IF is a text-to-image diffusion system that translates natural language descriptions into visual imagery. The project provides a generative pipeline for creating images, an inpainting tool for modifying specific image sections, and a super-resolution upscaler to increase pixel density and clarity. The system includes a concept fine-tuning framework that allows for the teaching of new visual concepts by updating a small set of parameters. It also supports image style transfer to apply the aesthetic characteristics of a reference image to a new output.
This is a PyTorch-based implementation of diffusion models for synthesizing photorealistic images and video. It provides a framework for text-to-image and text-to-video generation, as well as unconditional image synthesis. The system utilizes a cascading diffusion pipeline to produce high-resolution imagery by passing low-resolution outputs through a sequence of super-resolution models. It also includes capabilities for image inpainting, allowing the reconstruction of masked or missing regions of visual media guided by surrounding context and text prompts. The project includes tools for diff
Diffusers is a PyTorch-based library and generative AI framework used to build, train, and deploy diffusion pipelines for producing multi-modal media. It provides a suite of tools for generating images, video, and audio from natural language descriptions, as well as specialized systems for text-to-image generation. The project differentiates itself through a modular architecture that separates noise schedulers, pretrained model blocks, and pipeline compositions. This structure allows for the construction of custom generation workflows and the ability to swap individual components of the diffu
Flux is a diffusion model inference engine designed for text-to-image generation and image-to-image manipulation. It provides a system for executing open-weight models to transform natural language descriptions into visual imagery or to modify existing images. The project distinguishes itself through a flow-matching framework for image generation and a structural image controller. This controller allows for guided synthesis by using depth maps and Canny edge detection to constrain the geometry and composition of the output. The toolkit covers a broad range of image editing capabilities, incl
Neural Doodle is a collection of neural network tools designed for image upscaling, texture synthesis, and semantic-guided style transfer between visual inputs. It provides a semantic style transfer engine and an example-based image upscaler that increases image resolution by referencing visual details from a target style example. The project includes a neural texture synthesizer for creating seamless bitmap textures and repeating patterns from a single input style image. It also functions as an image generation tool capable of transforming simple sketches and photos into detailed artwork. Th
This project is a comprehensive generative AI prompt library and image generation toolkit designed to streamline the creation of professional visual assets. It provides a curated collection of structured text instructions and templates that guide generative models to produce specific creative outputs, ranging from marketing materials to complex infographics. The toolkit distinguishes itself through specialized capabilities for maintaining visual continuity and applying consistent aesthetic transformations. It features reference-based identity preservation to anchor facial features across mult
This project is a containerized deployment for running Stable Diffusion web interfaces. It provides a portable runtime for generative AI that manages dependencies and hardware acceleration to enable text-to-image generation and image-to-image transformations via a browser-based interface. The system uses hardware-specific image tags to support both GPU-accelerated synthesis and CPU-only execution. It ensures environment isolation across different operating systems while utilizing bind-mount data persistence to keep heavy model weights and generated outputs on the host machine. The deployment
mmagic is a multimodal training pipeline and framework for generative AI, focusing on visual synthesis and restoration. It provides the infrastructure to build and train models for tasks such as text-to-image and text-to-video generation, 3D-aware content synthesis, and high-fidelity image translation using diffusion models and generative adversarial networks. The project distinguishes itself through specialized capabilities for generative model personalization, including techniques for fine-tuning subjects and styles. It also supports advanced visual manipulations such as latent space interp
This project is a plugin for Photoshop that integrates Stable Diffusion backends, allowing users to generate and edit AI images directly within the graphic design workspace. It serves as an interface bridge between the image editor and remote GPU workers to perform generative tasks without requiring local hardware power. The plugin specifically provides connection layers for Automatic1111 and ComfyUI backends. This enables the execution of text-to-image generation, inpainting, and outpainting operations on the design canvas by communicating with these external engines via an API. The system
TaskMatrix is a visual language model orchestration framework and modular visual pipeline designed to coordinate disparate foundation models. It functions as a multi-model workflow coordinator that sequences visual and textual models through logic paths to handle image processing tasks without requiring additional training. The system integrates large language models with visual foundation models to enable the exchange of image data during interactive chat sessions. It utilizes template-based orchestration to chain specialized models together for complex visual tasks. The framework supports
A free and open-source inpainting and image-upscaling tool powered by WebGPU and wasm, implemented entirely in the browser.
StableCascade is a generative AI system and latent diffusion framework designed for text-to-image synthesis and image-to-image transformations. It utilizes a multi-stage cascade architecture that encodes and decodes images via a latent space to produce high-fidelity visual imagery. The system includes a cascade diffusion pipeline for controlling image structure through inpainting, outpainting, and super-resolution. It also provides a toolkit for image-to-image generation and the creation of image variations using embeddings. The framework supports model optimization through low-rank adaptati
Genkit is an open-source framework for building AI-powered applications. It provides a unified interface for connecting to hundreds of generative AI models from multiple providers, enabling text, image, audio, and video generation through a single API. The framework structures multi-step AI interactions—including chat, retrieval-augmented generation, tool use, and agentic workflows—as composable, traceable flows with built-in streaming and state management. The framework distinguishes itself through a comprehensive developer toolkit that includes a command-line interface and a local developer
InvokeAI is a self-hosted, professional-grade platform designed for managing generative models and performing complex image synthesis. It provides a local application environment that allows users to execute diffusion models directly on their own hardware, ensuring data privacy and complete ownership of all generated assets. The platform distinguishes itself through a node-based workflow system that enables the construction of reproducible and automated image generation pipelines. By chaining modular functional units into directed acyclic graphs, users can automate intricate production tasks
Material Maker is a node-based material editor, procedural texture generator, and 3D painting software. Built on the Godot engine, it provides a visual graph interface for authoring complex surface properties and textures through the connection of functional operation nodes. The tool allows for painting colors and materials directly onto the surfaces of three-dimensional geometry. It features a procedural material exporter that converts authored data into file formats compatible with external 3D modeling software and game engines. The system supports physically based rendering material creat
stable-diffusion.cpp is a high-performance C++ inference engine designed for generating images and video from text prompts using Stable Diffusion models. It functions as a latent diffusion model runtime and a lightweight machine learning framework that enables local diffusion model execution on consumer hardware. The project distinguishes itself as a CPU-based image generator capable of running without a dedicated GPU. It employs a specialized C++ tensor backend and cross-backend hardware abstraction to dispatch compute tasks across different processor instruction sets and graphics APIs. The
SillyTavern is a comprehensive interface and orchestration platform designed for immersive AI roleplay and interactive chat experiences. It functions as a unified gateway that connects users to a wide array of local and cloud-based large language models, providing a centralized environment to manage complex character personas, narrative context, and model-driven interactions. The platform distinguishes itself through its advanced prompt engineering and automation capabilities. It utilizes a sophisticated macro-based templating engine and vector-database retrieval to dynamically inject lore, c
SD.Next is an all-in-one web interface and multi-backend inference engine for generating, editing, and processing images and videos using diffusion models. It functions as a comprehensive tool for diffusion model management and an automated image processing pipeline for bulk operations. The project is distinguished by its hardware-backend abstraction layer, which provides automatic detection and acceleration for NVIDIA CUDA, AMD ROCm, Intel OpenVINO, and DirectML. It features a headless generative API and a programmatic command interface, allowing users to trigger tasks via REST API or CLI wi
This project is a toolkit for fine-tuning and managing text-to-image diffusion models. It focuses on low-rank adaptation to create small, portable weight files that customize model styles and behaviors without modifying the entire base model. The project provides specialized utilities for model distillation using singular value decomposition to extract adapters from fully trained models, as well as tools for blending and merging multiple adapters through weight interpolation. It includes capabilities for subject inversion and pivotal tuning to increase the visual fidelity of specific identiti