# ml-gsai/llada

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/ml-gsai-llada).**

3,580 stars · 240 forks · Python · MIT

## Links

- GitHub: https://github.com/ML-GSAI/LLaDA
- awesome-repositories: https://awesome-repositories.com/repository/ml-gsai-llada.md

## Description

LLaDA is a masked diffusion language model and conditional text generator. It generates text by iteratively refining masked tokens through a diffusion process rather than predicting the next token in a sequence.
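The iterative refinement described above can be illustrated with a toy loop. This is a minimal sketch, not the repository's sampler: `toy_predict`, `generate`, and the use of a known `target` are hypothetical stand-ins for a real mask predictor, and keeping only the most confident predictions at each step is an assumed unmasking strategy.

```python
import random

MASK = None  # placeholder for a masked position


def toy_predict(seq, target, rng):
    """Hypothetical stand-in for a mask predictor.

    Returns {position: (token, confidence)} for every masked position.
    A real model would derive both from its logits and would condition
    on the prompt; here the target is simply known in advance.
    """
    return {i: (target[i], rng.random()) for i, tok in enumerate(seq) if tok is MASK}


def generate(prompt, target, steps, seed=0):
    """Start from a fully masked response and fill it in over `steps` rounds."""
    rng = random.Random(seed)
    response = [MASK] * len(target)
    for step in range(steps):
        masked = [i for i, tok in enumerate(response) if tok is MASK]
        if not masked:
            break
        preds = toy_predict(response, target, rng)
        # Reveal an even share of the remaining masks at each step.
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)  # ceiling division
        # Keep the k most confident predictions; the rest stay masked.
        keep = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in keep:
            response[i] = preds[i][0]
    return prompt + response
```

Calling `generate(["hi"], ["a", "b", "c"], steps=2)` fills all three response positions in two rounds, in whatever order the toy confidences dictate, rather than left to right.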

Beyond text-only generation, the project covers vision-language use: visual inputs condition the diffusion process, which then produces a text response. It also includes preference optimization, where sequence log-likelihoods are estimated through evidence lower bounds (ELBOs) and used to tune model responses.
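To make the evidence lower bound concrete, here is a minimal Monte Carlo sketch of one common form of the bound for masked diffusion models: mask `l` of the `L` tokens uniformly at random, sum the model's log-probabilities of the true tokens at the masked positions, and weight by `L / l`. The function names and the uniform toy predictor are assumptions made for illustration, not the project's API.

```python
import math
import random


def uniform_log_prob(masked_seq, position, token, vocab_size=4):
    """Hypothetical mask predictor: every token is equally likely."""
    return -math.log(vocab_size)


def elbo_log_likelihood(tokens, log_prob_fn, n_samples=32, seed=0):
    """Monte Carlo estimate of a lower bound on log p(tokens)."""
    rng = random.Random(seed)
    L = len(tokens)
    total = 0.0
    for _ in range(n_samples):
        l = rng.randint(1, L)                       # number of tokens to mask
        masked_positions = rng.sample(range(L), l)  # mask without replacement
        masked_set = set(masked_positions)
        masked_seq = [None if i in masked_set else tok for i, tok in enumerate(tokens)]
        # Sum log-probabilities of the true tokens at the masked positions.
        term = sum(log_prob_fn(masked_seq, i, tokens[i]) for i in masked_positions)
        total += (L / l) * term
    return total / n_samples
```

With the uniform toy predictor every sample evaluates to `-L * log(vocab_size)`, so the estimate has no variance. With a real model the estimate varies from sample to sample, and comparing such estimates for a preferred and a rejected response is one way a preference objective could use them.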

The system supports multi-round conversation and likelihood-based evaluation of text sequences. For cross-modal conditioning, visual features are embedded alongside the text so that generation can depend on both.

## Tags

### Part of an Awesome List

- [Masked Text Diffusion](https://awesome-repositories.com/f/awesome-lists/ai/diffusion-models/masked-text-diffusion.md) — Implements a masked diffusion architecture that iteratively refines tokens into final text.
- [Multimodal Diffusion Models](https://awesome-repositories.com/f/awesome-lists/ai/multimodal-diffusion-models.md) — A system that converts visual inputs into text responses using a diffusion process for complex multimodal tasks.
- [Discrete Diffusion Models](https://awesome-repositories.com/f/awesome-lists/ai/discrete-diffusion-models.md) — Utilizes a discrete state space and categorical distributions for token transitions during diffusion.
- [Conditional Masked Generators](https://awesome-repositories.com/f/awesome-lists/ai/sequence-to-sequence-models/text-sequence-generators/conditional-masked-generators.md) — Produces conditional text sequences by iteratively filling in missing tokens through masking.
- [Language Diffusion Models](https://awesome-repositories.com/f/awesome-lists/ai/language-diffusion-models.md) — Large language diffusion models for text generation.

### Artificial Intelligence & ML

- [Vision-Language Cross-Attention Fusions](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-architectures/cross-attention-mechanisms/vision-language-cross-attention-fusions.md) — Fuses visual encoder features with text embeddings to enable image-conditioned text generation.
- [Cross-Attention Conditioning](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-pipelines/text-to-video-generators/cross-attention-conditioning.md) — Implements cross-attention mechanisms to inject visual and textual context into the diffusion denoising process.
- [Iterative Refinement Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/generative-ai/generative-text-inference/iterative-refinement-generation.md) — Generates text by repeatedly updating and refining masked sequences until convergence.
- [Vision-Language Models](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/multimodal-processing-tools/vision-language-models.md) — Integrates visual and linguistic processing to describe images and answer questions.
- [Diffusion Masked Language Models](https://awesome-repositories.com/f/artificial-intelligence-ml/masked-language-modeling/diffusion-masked-language-models.md) — Generates text by iteratively refining masked tokens via a diffusion process.
- [Preference Optimization](https://awesome-repositories.com/f/artificial-intelligence-ml/preference-optimization.md) — Refines model responses based on human preferences using log-likelihood and evidence lower bounds.
- [Visual-to-Text Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/visual-to-text-generation.md) — Converts visual inputs into text responses using a diffusion process for multimodal tasks. ([source](https://ml-gsai.github.io/LLaDA-V-demo/))
- [Variational Lower Bound Estimations](https://awesome-repositories.com/f/artificial-intelligence-ml/variational-lower-bound-estimations.md) — Calculates likelihoods via a variational lower bound to optimize model responses for specific preferences.

### Scientific & Mathematical Computing

- [Sequence Likelihood Estimators](https://awesome-repositories.com/f/scientific-mathematical-computing/numerical-mathematical-foundations/statistics-probability/probability-distributions/joint-probability-calculators/sequence-likelihood-estimators.md) — Calculates probability estimates of sequences using evidence lower bounds for preference optimization. ([source](https://ml-gsai.github.io/LLaDA-1.5-Demo/))

### User Interface & Experience

- [Sequence Likelihood Scores](https://awesome-repositories.com/f/user-interface-experience/visitor-identification/confidence-scoring/sequence-likelihood-scores.md) — Measures the log-likelihood of text sequences to evaluate model predictive accuracy.
