30 open-source projects similar to bytedance-seed/stable-diffcoder, ranked by how many features they have in common. Compare stars, activity and what each one does to find the best Stable DiffCoder alternative.
This thesis aims to investigate the potential of discrete diffusion models in the context of natural language generation.
Dream-Coder 7B is a diffusion LLM for code trained exclusively on open-source data across its development stages—adaptation, supervised fine-tuning, and reinforcement learning. It achieves an impressive 21.4% pass@1 on LiveCodeBench (2410-2505), outperforming other open-source diffusion LLMs by…
By Dimitri von Rütte, Janis Fluri, Yuhui Ding, Antonio Orvieto, Bernhard Schölkopf, Thomas Hofmann
Please use requirementsgpu.txt if your accelerator is a GPU, or requirementstpu.txt when using Google Cloud TPUs.
Current Diffusion Language Models (DLMs) have been studied at a smaller scale compared to their autoregressive (AR) counterparts and lack fair comparison on language modeling benchmarks. Additionally, training diffusion models from scratch at scale remains challenging. We propose adapting…
This repository contains code for training and evaluating the models in the paper Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning.
https://huggingface.co/Dream-org/Dream-v0-Base-7B
This repository contains the official implementation of the paper A Reparameterized Discrete Diffusion Model for Text Generation.
Official implementation of DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models.
This repository contains code for training and evaluating the models in the paper Likelihood-Based Diffusion Language Models.
SparseD is a novel sparse attention method for diffusion language models (DLMs), delivering near-lossless acceleration. It uses full attention and computes sparse patterns during early denoising steps, then reuses these patterns in later steps to restrict computation and improve…
We introduce SDAR (Synergy of Diffusion and AutoRegression), a large-scale diffusion language model that unites the complementary strengths of autoregressive and discrete diffusion modeling. By merging the training efficiency of autoregressive methods with the highly parallel decoding ability of…
Diffusion Language Models are Super Data Learners
Training Optimal Large Diffusion Language Models. By Jinjie Ni, Qian Liu, Chao Du, Longxu Dou, Hang Yan, Zili Wang, Tianyu Pang, Michael Qizhe Shieh
This repository contains the official implementation for the paper "C²DLM: Causal Concept-Guided Diffusion Large Language Models".
By Marianne Arriola, Aaron Gokaslan, Justin T Chiu, Zhihan Yang, Zhixuan Qi, Jiaqi Han, Subham Sekhar Sahoo, Volodymyr Kuleshov
By Subham Sekhar Sahoo, Marianne Arriola, Yair Schiff, Aaron Gokaslan, Edgar Marroquin, Justin T Chiu, Alexander Rush, Volodymyr Kuleshov
1. We propose a new solution for integrating a pre-trained LM into a diffusion model to perform discrete diffusion for text-to-text generation. It requires only low-cost fine-tuning and can perform better than vanilla fine-tuning;
Official PyTorch implementation of the paper "Accelerating Diffusion Large Language Models with SlowFast Sampling: The Three Golden Principles" (Slow Fast Sampling).
The official implementation of "Discrete Copula Diffusion", which was published at ICLR 2025.
Code repository for the paper Think While You Generate: Discrete Diffusion with Planned Denoising, by Sulin Liu, Juno Nam, Andrew Campbell, Hannes Stärk, Yilun Xu, Tommi Jaakkola, Rafael Gómez-Bombarelli.
This repo contains a PyTorch implementation for the paper Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution by Aaron Lou, Chenlin Meng and Stefano Ermon.
2026-04-07: Our paper was accepted to ACL 2026 (main)! 2026-01-13: Code of EvoToken-DLM released! 2026-01-12: Paper released!