# Generative media and diffusion

> Search results for `Generative media and diffusion` on awesome-repositories.com. 113 total matches; showing the first 50.

Explore on the web: https://awesome-repositories.com/q/generative-media-and-diffusion

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [this search on awesome-repositories.com](https://awesome-repositories.com/q/generative-media-and-diffusion).**

## Results

- [huggingface/diffusers](https://awesome-repositories.com/repository/huggingface-diffusers.md) (33,872 ⭐) — Diffusers is a PyTorch-based library and generative AI framework used to build, train, and deploy diffusion pipelines for producing multi-modal media. It provides a suite of tools for generating images, video, and audio from natural language descriptions, as well as specialized systems for text-to-image generation.

The project differentiates itself through a modular architecture that separates noise schedulers, pretrained model blocks, and pipeline compositions. This structure allows for the construction of custom generation workflows and the ability to swap individual components of the diffu
- [androidx/media](https://awesome-repositories.com/repository/androidx-media.md) (2,680 ⭐) — Android Media is a framework library providing the primary system components for audio and video playback, session management, and media routing on Android. It includes a multimedia API for processing raw media streams, managing MIDI devices, and rendering video frames.

The project features a dedicated metadata manager for organizing descriptive labels, content channels, and DRM configurations, alongside a session controller that synchronizes playback state with external controllers and manages media shortcuts for wearable devices.

The library covers a broad range of capabilities including a
- [keras-team/keras](https://awesome-repositories.com/repository/keras-team-keras.md) (64,094 ⭐) — Keras is a high-level deep learning framework designed for constructing and training neural networks through the composition of modular, functional layers. It serves as a comprehensive modeling toolkit that provides standardized procedures for defining, evaluating, and deploying complex architectures. By utilizing a directed acyclic graph approach, the framework allows users to build intricate models with multiple inputs, outputs, and shared layers, ensuring consistent numerical execution through functional state management.

The project distinguishes itself as a multi-backend machine learning
- [compvis/latent-diffusion](https://awesome-repositories.com/repository/compvis-latent-diffusion.md) (14,072 ⭐) — Latent Diffusion is a framework for high-resolution image synthesis that performs the denoising process within a compressed latent space. It uses variational autoencoders to encode images into a lower-dimensional representation, reducing the computational cost of noise prediction compared to operating on raw pixels.

The project enables text-to-image generation by integrating natural language descriptions through cross-attention conditioning. It also supports image inpainting and restoration, filling masked or missing image areas with generated content, and example-based synthesis using retrie
- [compvis/stable-diffusion](https://awesome-repositories.com/repository/compvis-stable-diffusion.md) (73,125 ⭐) — Stable Diffusion is a generative machine learning pipeline that synthesizes high-resolution visual content by performing iterative denoising within a compressed latent space. By mapping natural language embeddings into pixel outputs through conditioned probabilistic processes, the framework enables the generation of images from text prompts and the transformation of existing visual inputs based on semantic instructions.

The architecture utilizes a modular execution environment that decouples model loading, scheduler logic, and inference components to support diverse hardware configurations. I
- [kohya-ss/sd-scripts](https://awesome-repositories.com/repository/kohya-ss-sd-scripts.md) (7,133 ⭐) — sd-scripts is a suite of utilities designed for fine-tuning generative models, preprocessing datasets, and converting model weights. It provides a collection of scripts for executing Stable Diffusion training through methods such as DreamBooth, textual inversion, and full fine-tuning, alongside a framework for creating and managing Low-Rank Adaptation weights.

The project features specialized capabilities for model weight conversion between different architectures and precision formats. It includes tools for merging adaptation weights into base models, extracting weights from trained models,
- [openai/gpt-2](https://awesome-repositories.com/repository/openai-gpt-2.md) (24,967 ⭐) — This project is a transformer-based language model and autoregressive text generator designed to predict the next token in a sequence to produce human-like prose and synthetic text. It functions as a large language model that utilizes a transformer architecture to learn linguistic patterns from large datasets for unsupervised multitask learning.

The repository provides a distribution of pre-trained weights, enabling natural language processing tasks without requiring additional training. This allows the model to perform zero-shot task generalization by applying learned patterns to new tasks.
- [automatic1111/stable-diffusion-webui](https://awesome-repositories.com/repository/automatic1111-stable-diffusion-webui.md) (163,743 ⭐) — Stable Diffusion Web UI is a browser-based interface designed for managing text-to-image generation tasks. It provides a centralized dashboard for controlling generative processes, including native support for multi-stage model architectures to facilitate high-quality image refinement.

The platform distinguishes itself through granular control over the generation process, offering tools for precise parameter management and advanced prompt engineering. Users can customize generation styles and capabilities by integrating external model-extension formats, such as textual inversions, low-rank ad
- [openai/gpt-3](https://awesome-repositories.com/repository/openai-gpt-3.md) (15,740 ⭐) — This project is a large language model and general purpose natural language processing engine designed for text generation and linguistic analysis. It functions as a few-shot learning framework capable of solving diverse reasoning and language tasks using a small number of provided examples without requiring additional training.

The system specializes in generating human-like synthetic text and long-form content, including news articles. It also provides capabilities for automated text reasoning to solve logic and arithmetic problems through direct interaction.

The project includes tools for
- [nirdiamant/genai_agents](https://awesome-repositories.com/repository/nirdiamant-genai-agents.md) (20,047 ⭐) — GenAI_Agents is a development framework and orchestration engine designed for building autonomous, multi-agent systems. It provides the infrastructure to construct complex, state-managed workflows where specialized agents collaborate to execute multi-step tasks, manage long-term memory, and perform iterative reasoning.

The platform distinguishes itself through its graph-based orchestration model, which allows developers to define intricate agentic processes with explicit state transitions. It supports advanced control mechanisms such as human-in-the-loop intervention for manual oversight and
- [researchmm/mm-diffusion](https://awesome-repositories.com/repository/researchmm-mm-diffusion.md) (453 ⭐) — [CVPR'23] MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation
- [hojonathanho/diffusion](https://awesome-repositories.com/repository/hojonathanho-diffusion.md) (5,053 ⭐) — This project is a diffusion model training framework and image synthesis pipeline. It provides the tools necessary to train generative models to learn image data distributions through an iterative denoising process.

The framework includes a generative model evaluation tool consisting of automated scripts used to measure the quality and accuracy of produced samples.

The system covers model training pipelines and performance evaluation for generative diffusion models.
- [pytorch/vision](https://awesome-repositories.com/repository/pytorch-vision.md) (17,743 ⭐) — This project is a comprehensive computer vision library for the PyTorch ecosystem, providing a standardized collection of neural network architectures, datasets, and high-performance transformation utilities. It serves as a foundational framework for building, training, and deploying deep learning models, offering a centralized model registry that allows developers to instantiate architectures with pre-trained weights for tasks such as image classification, object detection, and semantic segmentation.

The library distinguishes itself through its modular approach to data and compute management
- [abhinavxd/libredesk](https://awesome-repositories.com/repository/abhinavxd-libredesk.md) (2,571 ⭐) — Libredesk is an omnichannel support management system designed to unify live chat and email communications into a single dashboard. It provides a comprehensive environment for managing customer interactions, agent roles, and team assignments to organize support workloads.

The project distinguishes itself through AI customer support automation, which includes generating automated responses and refining message tones. It also supports the development and integration of custom chat widgets using WebSockets and JavaScript APIs.

The system covers a broad set of capabilities, including customer re
- [bjing2016/subspace-diffusion](https://awesome-repositories.com/repository/bjing2016-subspace-diffusion.md) (135 ⭐) — Improved diffusion generative models with subspaces
- [google/trax](https://awesome-repositories.com/repository/google-trax.md) (8,304 ⭐) — Trax is a deep learning framework and hardware-agnostic tensor engine for designing and training neural networks. It serves as a research tool providing high-level combinators for composing complex architectures, alongside a dedicated library for building transformer models and a toolkit for reinforcement learning.

The framework is distinguished by its support for reversible and sparse transformer architectures, which reduce memory and computational overhead. It enables a single set of model instructions to execute across different hardware backends without changing the underlying co
- [cockroachdb/cockroach](https://awesome-repositories.com/repository/cockroachdb-cockroach.md) (32,207 ⭐) — Cockroach is a distributed SQL database designed to scale horizontally across multiple nodes while maintaining strict ACID compliance and global data consistency. It functions as a relational database engine that automatically partitions data into ranges, rebalancing them across a cluster to accommodate growing storage and throughput requirements. By utilizing a distributed consensus protocol, the system ensures that all nodes agree on the order of operations, providing fault tolerance and continuous availability even in the event of hardware failures.

The system distinguishes itself through
- [capacitor-community/media](https://awesome-repositories.com/repository/capacitor-community-media.md) (133 ⭐) — Capacitor plugin for saving and retrieving photos and videos, and managing photo albums.
- [paddlepaddle/paddlegan](https://awesome-repositories.com/repository/paddlepaddle-paddlegan.md) (8,043 ⭐) — PaddleGAN is a generative AI framework and deep learning computer vision library built on the PaddlePaddle framework. It serves as a toolkit for image and video synthesis, providing a collection of generative adversarial network implementations for creating synthetic visual content.

The library focuses on advanced synthesis capabilities, including the generation of talking heads through lip motion synchronization and the creation of synthetic videos via motion transfer from driving sequences. It provides tools for domain-to-domain translation, allowing for image style transfer and the transfo
- [atlassian/react-beautiful-dnd](https://awesome-repositories.com/repository/atlassian-react-beautiful-dnd.md) (34,049 ⭐) — This project is a declarative drag-and-drop library designed for building accessible and fluid interface interactions within web applications. It provides a component-based interface for managing complex list reordering and spatial relationships between elements, utilizing a specialized state container to coordinate movement logic.

The library distinguishes itself through a focus on accessibility, maintaining a live connection between visual drag states and the browser accessibility tree to support screen readers and keyboard navigation. It optimizes performance by bypassing standard componen
- [machelreid/diffuser](https://awesome-repositories.com/repository/machelreid-diffuser.md) (55 ⭐) — DiffusER: Discrete Diffusion via Edit-based Reconstruction (Reid, Hellendoorn & Neubig, 2022)
- [sillytavern/sillytavern](https://awesome-repositories.com/repository/sillytavern-sillytavern.md) (29,463 ⭐) — SillyTavern is a comprehensive interface and orchestration platform designed for immersive AI roleplay and interactive chat experiences. It functions as a unified gateway that connects users to a wide array of local and cloud-based large language models, providing a centralized environment to manage complex character personas, narrative context, and model-driven interactions.

The platform distinguishes itself through its advanced prompt engineering and automation capabilities. It utilizes a sophisticated macro-based templating engine and vector-database retrieval to dynamically inject lore, c
- [ersatztv/ersatztv](https://awesome-repositories.com/repository/ersatztv-ersatztv.md) (2,539 ⭐) — ErsatzTV is an IPTV channel simulator and linear media scheduler that transforms personal media libraries into simulated live television channels. It acts as a bridge and transcoding gateway, importing content from external media servers to create virtual broadcast channels delivered via M3U playlists and XMLTV program guides.

The system distinguishes itself through a complex automation engine for linear playback, using collection-based sequencing and dynamic schedule resolution to emulate a traditional broadcast experience. It supports advanced playback logic, including the grouping of multi
- [scenediffuser/scene-diffuser](https://awesome-repositories.com/repository/scenediffuser-scene-diffuser.md) (407 ⭐) — Official implementation of CVPR23 paper "Diffusion-based Generation, Optimization, and Planning in 3D Scenes"
- [nvlabs/stylegan](https://awesome-repositories.com/repository/nvlabs-stylegan.md) (14,412 ⭐) — StyleGAN is a TensorFlow-based generative adversarial network framework designed for synthesizing high-resolution imagery. It utilizes a style-based generator architecture to create realistic visual assets from latent vectors, focusing on the production of high-fidelity images.

The system incorporates style mixing and stochastic noise injection to control visual attributes and fine-grained details. It uses adaptive instance normalization and progressive resolution upsampling to manage image quality and variety across different resolutions.

The framework covers the full lifecycl
- [lemmynet/lemmy](https://awesome-repositories.com/repository/lemmynet-lemmy.md) (14,454 ⭐) — Lemmy is a self-hosted, federated discussion platform that enables the operation of independent, decentralized social networking servers. By implementing the ActivityPub protocol, it allows autonomous instances to exchange content, synchronize user interactions, and participate in a global, distributed network without centralized control.

The platform distinguishes itself through a decoupled architecture that separates the backend API from the frontend, facilitating the development of custom interfaces while maintaining unified user handles and cross-platform communication. It provides granul
- [cs231n/cs231n.github.io](https://awesome-repositories.com/repository/cs231n-cs231n-github-io.md) (10,923 ⭐) — This project is a static educational website and comprehensive curriculum focused on computer vision and deep learning. It serves as a public repository of instructional materials, lecture notes, and technical guides specifically detailing convolutional neural networks and visual recognition.

The site is developed using static-site generation to host course documentation and student project directories. It provides structured academic resources that guide learners through image classification, generative modeling, and the implementation of various neural network architectures.

The curriculum
- [jeremiastraub/diffusion](https://awesome-repositories.com/repository/jeremiastraub-diffusion.md) (33 ⭐) — Representation Learning with Diffusion Models
- [sunlin-ai/diffusion_tutorial](https://awesome-repositories.com/repository/sunlin-ai-diffusion-tutorial.md) (211 ⭐) — diffusion generative model
- [nielsrogge/transformers-tutorials](https://awesome-repositories.com/repository/nielsrogge-transformers-tutorials.md) (11,641 ⭐) — This is a collection of tutorials and practical demonstrations for implementing machine learning tasks using the HuggingFace Transformers library. It serves as a guide for applying transformer architectures across computer vision, natural language processing, and audio analysis.

The repository provides implementation examples for multimodal model deployment, including the combination of text, image, and audio inputs. It includes resources for optimizing pre-trained models through fine-tuning on custom datasets and provides examples for preparing PyTorch datasets by converting raw files into t
- [drizzle-team/drizzle-orm](https://awesome-repositories.com/repository/drizzle-team-drizzle-orm.md) (34,835 ⭐) — Drizzle ORM is a TypeScript-native database toolkit providing type-safe SQL query building, schema management, and automated migrations across PostgreSQL, MySQL, SQLite, and SingleStore.
- [nativeinstruments/ni-media](https://awesome-repositories.com/repository/nativeinstruments-ni-media.md) (254 ⭐) — NI Media is a C++ library for reading and writing audio streams.
- [fetchai/innovation-lab-examples](https://awesome-repositories.com/repository/fetchai-innovation-lab-examples.md) (1,028 ⭐) — This project provides a comprehensive framework for building, deploying, and orchestrating autonomous agents within a decentralized network. It serves as a collection of patterns and examples for developing intelligent software entities capable of performing complex tasks, making decisions, and interacting with other agents to achieve shared goals.

The framework distinguishes itself through its focus on multi-agent orchestration and decentralized communication. It enables the coordination of specialized agent teams that collaborate on workflows through structured messaging protocols, allowing
- [hzfinfdu/diffusion-bert](https://awesome-repositories.com/repository/hzfinfdu-diffusion-bert.md) (343 ⭐) — Official implementation of DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models.
- [aliaksandrsiarohin/first-order-model](https://awesome-repositories.com/repository/aliaksandrsiarohin-first-order-model.md) (15,003 ⭐) — This project is a generative adversarial network designed for image animation and motion transfer. It functions as a computer vision framework that synthesizes video sequences by applying motion patterns extracted from a driving video onto a static source image.

The model distinguishes itself by using a keypoint-based representation to decouple object appearance from temporal movement. By tracking structural deformations through learned latent coordinates, it performs motion retargeting and synthetic media production without requiring manual annotations or object-specific training data.

The
- [gohugoio/hugo](https://awesome-repositories.com/repository/gohugoio-hugo.md) (88,701 ⭐) — Hugo is a high-performance static site generator that transforms source content and templates into optimized web assets. Built with a focus on speed and scalability, it provides a comprehensive framework for managing large-scale documentation and editorial projects through structured content organization, taxonomies, and a flexible template-driven rendering engine.

The project distinguishes itself through a sophisticated build system that utilizes incremental caching to minimize redundant processing during site updates. It supports complex content requirements by enabling multidimensional mod
- [ankitects/anki](https://awesome-repositories.com/repository/ankitects-anki.md) (28,571 ⭐) — Anki is a cross-platform flashcard management system designed to optimize long-term memory retention through spaced-repetition learning. It functions as a digital learning assistant that uses active recall practice and automated scheduling algorithms to determine the ideal timing for card reviews based on individual performance history. The core system relies on a local relational database to ensure data persistence and portability, while supporting complex study workflows through flexible note-type schema modeling and template-driven content rendering.

The platform distinguishes itself throu
- [sndnyang/diffusion_vit](https://awesome-repositories.com/repository/sndnyang-diffusion-vit.md) (48 ⭐) — PyTorch Implementation of "Your ViT is Secretly a Hybrid Discriminative-Generative Diffusion Model"
- [steven2358/awesome-generative-ai](https://awesome-repositories.com/repository/steven2358-awesome-generative-ai.md) (12,151 ⭐) — This project serves as a comprehensive, curated directory of resources, tools, and platforms dedicated to the generative artificial intelligence ecosystem. It functions as a central hub for developers and researchers to discover the frameworks, models, and services necessary for building, deploying, and managing intelligent software applications.

The directory distinguishes itself by providing a structured index of specialized tooling across several technical domains. It covers the full lifecycle of generative AI, including the development of autonomous agent systems, the implementation of re
- [jannerm/diffuser](https://awesome-repositories.com/repository/jannerm-diffuser.md) (1,284 ⭐) — Code for the paper "Planning with Diffusion for Flexible Behavior Synthesis"
- [lazyprogrammer/machine_learning_examples](https://awesome-repositories.com/repository/lazyprogrammer-machine-learning-examples.md) (8,823 ⭐) — This project is a comprehensive collection of practical code examples and implementation libraries for machine learning. It provides a wide array of reference materials for building supervised, unsupervised, and reinforcement learning algorithms.

The repository serves as a multi-domain resource, featuring specific implementation suites for financial AI, Bayesian statistical modeling, and deep learning architectures. It includes a framework for training intelligent agents using policy gradients and actor-critic models, as well as practical guides for fine-tuning transformers and utilizing larg
- [fmhy/fmhy](https://awesome-repositories.com/repository/fmhy-fmhy.md) (13,150 ⭐) — FMHY is a community-driven index designed to organize and distribute decentralized digital content through standardized metadata and protocol-agnostic linking. It functions as a resilient, distributed map of internet resources, providing a structured directory that facilitates the discovery of media, software, and educational tools without reliance on centralized control.

The project distinguishes itself by maintaining a massive, human-verified repository of external links that span diverse digital ecosystems, including peer-to-peer networks, Usenet, and direct download servers. By utilizing
- [getgrav/grav](https://awesome-repositories.com/repository/getgrav-grav.md) (15,395 ⭐) — Grav is a flat-file content management system that eliminates the need for a traditional database by storing site content and configuration in human-readable Markdown and YAML files. Built as a modular PHP web framework, it uses a hierarchical page routing system where the physical directory structure directly determines the site's URL paths.

The platform is distinguished by its event-driven plugin architecture and a command-line interface that prioritizes system administration, deployment, and maintenance tasks. It utilizes a blueprint-driven system to generate administrative forms from stru
- [openai/guided-diffusion](https://awesome-repositories.com/repository/openai-guided-diffusion.md) (7,395 ⭐) — This is a classifier-guided diffusion framework for high-fidelity image generation. It implements a cascaded diffusion pipeline that chains a base diffusion model with a dedicated upsampler to progressively increase image resolution in stages, and uses classifier-guided diffusion sampling to steer the reverse diffusion process toward higher-quality outputs.

The framework provides tools for training diffusion models from scratch using distributed processes with gradient accumulation, as well as training classifier models that provide gradient-based guidance during sampling. It supports both un
- [morvanzhou/pytorch-tutorial](https://awesome-repositories.com/repository/morvanzhou-pytorch-tutorial.md) (8,458 ⭐) — This project is a collection of PyTorch learning resources and educational guides designed to teach the construction and training of neural networks. It serves as a comprehensive deep learning tutorial covering various model architectures and practical implementation strategies.

The resources provide specific guidance on implementing computer vision tasks, such as image classification and synthetic imagery generation, as well as reinforcement learning agents using value networks and experience replay. It also covers sequential data modeling through recurrent networks and generative modeling u
- [jujumilk3/leaked-system-prompts](https://awesome-repositories.com/repository/jujumilk3-leaked-system-prompts.md) (14,134 ⭐) — This project is a research-oriented repository that serves as a centralized database for system-level prompts and internal behavioral instructions extracted from various large language models. Its primary purpose is to provide a transparent, accessible reference for researchers and developers to study how artificial intelligence models are configured, constrained, and governed.

The repository distinguishes itself by cataloging the hidden directives and operational guidelines that define model personas and safety boundaries. By archiving these instruction sets, it enables comparative analysis
- [archinetai/audio-diffusion-pytorch](https://awesome-repositories.com/repository/archinetai-audio-diffusion-pytorch.md) (2,100 ⭐) — Audio generation using diffusion models, in PyTorch.
- [pytorch/examples](https://awesome-repositories.com/repository/pytorch-examples.md) (23,752 ⭐) — This repository serves as a comprehensive collection of reference implementations for the PyTorch machine learning library. It provides practical examples for building, training, and deploying deep learning models, functioning as a toolkit for developers to explore neural network architectures and training workflows.

The project distinguishes itself by offering concrete demonstrations of complex machine learning operations, ranging from computer vision tasks like object detection and depth estimation to the training of large-scale transformer models. These examples illustrate how to implement
- [hakimel/reveal.js](https://awesome-repositories.com/repository/hakimel-reveal-js.md) (71,731 ⭐) — This project is a web-native presentation framework that renders slide decks from standard HTML or Markdown. It functions as a declarative slide engine, managing navigation, state persistence, and lifecycle events through a configuration-driven interface. By leveraging standard web technologies, it enables the creation of responsive, browser-based presentations that support complex layouts, nested transitions, and interactive content.

The framework distinguishes itself through a modular, plugin-based architecture that allows developers to extend core functionality using custom hooks and event
- [xiangli1999/diffusion-lm](https://awesome-repositories.com/repository/xiangli1999-diffusion-lm.md) (1,241 ⭐) — Diffusion-LM
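
Several entries above mention noise schedulers and an iterative denoising process. As a rough illustration only, the forward (noising) half of a DDPM-style diffusion process can be sketched in plain Python. This is not code from any listed repository: the function names, the linear schedule, and the default values are assumptions chosen for the example.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances on a linear ramp (illustrative defaults)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out = []
    prod = 1.0
    for b in betas:
        prod *= 1.0 - b
        out.append(prod)
    return out


def add_noise(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in closed form; t is a 0-based step index.

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, eps ~ N(0, 1)
    """
    a = alpha_bars[t]
    signal_scale = math.sqrt(a)
    noise_scale = math.sqrt(1.0 - a)
    return [signal_scale * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


# Hypothetical toy "image" of three values, noised at an early and a late step.
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]
x_early = add_noise(x0, 0, alpha_bars, rng)    # still close to x0
x_late = add_noise(x0, 999, alpha_bars, rng)   # almost pure Gaussian noise
```

A denoising model is trained to reverse this corruption step by step. Libraries such as those listed above typically wrap the schedule in a dedicated scheduler object rather than bare functions like these.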