# datawhalechina/tiny-universe

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/datawhalechina-tiny-universe).**

4,505 stars · 441 forks · Jupyter Notebook

## Links

- GitHub: https://github.com/datawhalechina/tiny-universe
- awesome-repositories: https://awesome-repositories.com/repository/datawhalechina-tiny-universe.md

## Topics

`agent` `diffusion` `evaluation-metrics` `llama` `qwen` `rag` `transformers`

## Description

Tiny Universe is an educational monorepo that delivers multiple independent implementations of core AI subsystems as self-contained Jupyter notebooks. It provides from-scratch constructions of foundational architectures including a complete Transformer model built from the original paper specification, a denoising diffusion probabilistic model for image generation, and a ReAct-style autonomous agent framework that equips an LLM with tools for planning and multi-step task execution.
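
The reason-act cycle behind a ReAct-style agent can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `scripted_llm`, the `Action: tool[input]` text format, and the `add` tool are all hypothetical stand-ins for a real model call and a real tool set.

```python
import re
from typing import Callable, Dict

# Hypothetical tool registry: tool name -> function from string input to string output.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(float(x) for x in arg.split(","))),
}

# Assumed reply format: "Action: tool[input]" or "Final Answer: text".
ACTION_RE = re.compile(r"Action:\s*(\w+)\[(.*?)\]")
FINAL_RE = re.compile(r"Final Answer:\s*(.*)")


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real LLM call (assumption): replies from a fixed script."""
    if "Observation:" not in transcript:
        return "Thought: I should add the numbers.\nAction: add[2,3]"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"


def react(question: str, llm: Callable[[str], str], max_steps: int = 5) -> str:
    """Alternate between model reasoning and tool calls until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action is None:
            # No tool call and no final answer: return the raw reply.
            return reply.strip()
        name, arg = action.groups()
        tool = TOOLS.get(name)
        observation = tool(arg) if tool else f"unknown tool {name}"
        # Feed the tool result back so the next step can reason over it.
        transcript += f"Observation: {observation}\n"
    return "no answer within step limit"


answer = react("What is 2 + 3?", scripted_llm)
```

Swapping `scripted_llm` for a function that calls an actual model is the only change the loop itself would need; the step limit guards against a model that never emits a final answer.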

The project covers much of the modern AI stack, from retrieval to evaluation to fine-tuning, through hands-on implementations. It includes retrieval-augmented generation pipelines that combine vector databases with knowledge graphs, a GraphRAG system that constructs knowledge graphs from text and generates hierarchical community summaries, and a two-stage evaluation pipeline that first runs inference and then scores model outputs against reference answers using metrics such as F1, ROUGE, and accuracy. The repository also demonstrates reinforcement learning fine-tuning, automated document review workflows that detect deviations and generate revision suggestions, and iterative image optimization that evaluates and improves generated images against text prompts.
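
As an illustration of the scoring stage, a token-overlap F1 between a generated answer and a reference answer might look like the sketch below. Lowercasing and whitespace tokenization are assumptions made for the example; Chinese text, for instance, would need a word segmenter, and the repository's own scorer may normalize text differently.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer."""
    # Assumption: lowercase and split on whitespace to get tokens.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; one empty answer scores zero.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = token_f1("the cat sat", "the cat")
```

Here the prediction shares two of its three tokens with the reference, so precision is 2/3, recall is 1, and the F1 comes out at 0.8.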

Beyond these core areas, Tiny Universe explores the internal mechanisms of large language models with walkthroughs of grouped query attention, rotary position embeddings, and causal masking. It covers data processing techniques such as semantic chunking that splits text where the meaning shifts between sentences, vector embedding pipelines for similarity-based retrieval, and hybrid search strategies that fuse sentence-level similarity with domain-specific term importance. The project also includes image quality evaluation using Inception Score and Fréchet Inception Distance, as well as image-text consistency checking with vision-language models.
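
Of the mechanisms named above, causal masking is the simplest to show in isolation. The sketch below uses plain Python lists rather than PyTorch tensors so that it runs without dependencies; the function names are illustrative and are not taken from the notebooks.

```python
import math
from typing import List


def causal_mask(seq_len: int) -> List[List[bool]]:
    """mask[i][j] is True when query position i may attend to key position j."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def masked_softmax(scores: List[List[float]]) -> List[List[float]]:
    """Row-wise softmax over a square score matrix with future positions masked."""
    mask = causal_mask(len(scores))
    weights = []
    for row, allowed in zip(scores, mask):
        # Future positions get -inf so they receive exactly zero weight.
        masked = [s if ok else -math.inf for s, ok in zip(row, allowed)]
        peak = max(masked)  # the diagonal is always allowed, so peak is finite
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# With uniform scores, each token spreads attention evenly over itself
# and the tokens before it.
weights = masked_softmax([[0.0] * 3 for _ in range(3)])
```

The lower-triangular pattern means the first token attends only to itself, the second splits its weight between two positions, and so on.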

All implementations are delivered as self-contained Jupyter notebooks within a single repository, making the code directly runnable and inspectable for educational purposes.

## Tags

### Artificial Intelligence & ML

- [Transformer Architecture Implementation](https://awesome-repositories.com/f/artificial-intelligence-ml/transformer-architecture-implementation.md) — Recreates the full Transformer architecture from the original paper using PyTorch. ([source](https://cdn.jsdelivr.net/gh/datawhalechina/tiny-universe@main/README.md))
- [Gated Activation Computations](https://awesome-repositories.com/f/artificial-intelligence-ml/activation-functions/gated-linear-units/gated-activation-computations.md) — Provides a from-scratch implementation of gated MLP activation computation for transformer models. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/Qwen-blog))
- [Agentic Reasoning Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-reasoning-frameworks.md) — Provides a ReAct agent framework for planning multi-step tasks and executing external tool APIs.
- [Agentic Reasoning Loops](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-reasoning-loops.md) — Implements a reasoning-acting cycle where the model iteratively generates thoughts and selects tools.
- [Autonomous Agent Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/autonomous-agent-frameworks.md) — Implements a ReAct-style agent that plans tasks and executes external tool APIs. ([source](https://cdn.jsdelivr.net/gh/datawhalechina/tiny-universe@main/README.md))
- [Decoder Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/decoder-architectures.md) — Constructs a decoder-only transformer from scratch with stacked self-attention and feed-forward layers. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/Qwen-blog))
- [Diffusion Model Training](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-models/diffusion-models/diffusion-model-training.md) — Provides a minimal DDPM implementation translating mathematical formulas into training and sampling code.
- [Text-to-Image Generators](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-pipelines/text-to-image-generators.md) — Generates images from text descriptions using a Stable Diffusion XL model. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyIMGRAG))
- [Graph Retrieval Augmented Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/graph-retrieval-augmented-generation.md) — Provides a retrieval-augmented generation system combining vector databases with knowledge graphs for associative retrieval.
- [Hyperparameter Configurations](https://awesome-repositories.com/f/artificial-intelligence-ml/hyperparameter-configurations.md) — Configures model hyperparameters like vocabulary size, hidden dimensions, and layer counts. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/Qwen-blog))
- [RAG Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/language-model-orchestration/retrieval-augmented-generation/rag-pipelines.md) — Constructs a RAG system from the ground up performing document retrieval and knowledge-grounded text generation. ([source](https://cdn.jsdelivr.net/gh/datawhalechina/tiny-universe@main/README.md))
- [LLM Architecture Explainers](https://awesome-repositories.com/f/artificial-intelligence-ml/llm-architecture-explainers.md) — Provides a detailed walkthrough of Qwen2 internals including GQA, RoPE, and attention masks. ([source](https://cdn.jsdelivr.net/gh/datawhalechina/tiny-universe@main/README.md))
- [Normalization Layers](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/frameworks/model-construction/neural-network-layers/normalization-layers.md) — Implements layer normalization from scratch to stabilize training in deep neural networks. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [From-Scratch](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/model-fine-tuning-adaptation/language-model-training/from-scratch.md) — Builds and trains transformer-based language models from scratch using only basic frameworks for education. ([source](https://cdn.jsdelivr.net/gh/datawhalechina/tiny-universe@main/README.md))
- [Transformer Encoder Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/speech-processing/sequence-to-sequence-tasks/sequence-encoders/transformer-encoder-implementations.md) — Builds a transformer encoder from scratch to produce contextualized sequence representations. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [Triangular Mask Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/masked-language-modeling/causal-masking/triangular-mask-implementations.md) — Implements triangular causal masks from scratch for decoder-only transformer training. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [Multi-Head Attention Mechanisms](https://awesome-repositories.com/f/artificial-intelligence-ml/multi-head-attention-mechanisms.md) — Implements multi-head attention with parallel heads and concatenated outputs from scratch. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [Grouped-Query Attention](https://awesome-repositories.com/f/artificial-intelligence-ml/multi-head-attention-mechanisms/grouped-query-attention.md) — Implements grouped-query attention from scratch to reduce KV-cache memory in transformer models. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/Qwen-blog))
- [Rotary Positional Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/positional-embedding-techniques/rotary-positional-embeddings.md) — Applies rotary position embeddings from scratch to encode token positions in transformer models. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/Qwen-blog))
- [Residual Connection Implementations](https://awesome-repositories.com/f/artificial-intelligence-ml/residual-networks/residual-connection-implementations.md) — Implements residual connections from scratch to enable training of deep transformer models. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [RMS Normalizations](https://awesome-repositories.com/f/artificial-intelligence-ml/rms-normalizations.md) — Applies RMSNorm from scratch to stabilize training in transformer models. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/Qwen-blog))
- [Sequence Decoders](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-decoding-models/sequence-decoders.md) — Implements a transformer decoder that generates output sequences using masked self-attention and cross-attention. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [Autoregressive Text Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-generation/autoregressive-text-generation.md) — Produces output sequences one token at a time by feeding previous outputs back into the decoder. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [Attention Scoring Functions](https://awesome-repositories.com/f/artificial-intelligence-ml/attention-scoring-functions.md) — Computes relevance between queries and keys using scaled dot-product attention. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [Custom Dataset Evaluators](https://awesome-repositories.com/f/artificial-intelligence-ml/custom-evaluation-judges/custom-dataset-evaluators.md) — Accepts user-provided SFT-format datasets to run inference and scoring on custom tasks. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyEval))
- [F1 Metric Scorers](https://awesome-repositories.com/f/artificial-intelligence-ml/evaluation-metrics/scoring-pipelines/f1-metric-scorers.md) — Computes F1 scores from tokenized text overlap between generated and reference answers. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyEval))
- [Grounded Answer Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/generative-ai/grounded-answer-generation.md) — Feeds retrieved document segments to an LLM to produce grounded answers with citations. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyRAG))
- [Diffusion Model Evaluators](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-content-apis/quality-evaluators/diffusion-model-evaluators.md) — Evaluates generated image quality using Inception Score and Fréchet Inception Distance. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyDiffusion))
- [Image-Text Consistency Checkers](https://awesome-repositories.com/f/artificial-intelligence-ml/image-retrieval-systems/text-to-image-retrieval/image-text-match-ranking/image-text-consistency-checkers.md) — Checks image-text consistency using a vision-language model to identify prompt mismatches. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyIMGRAG))
- [Reinforcement Learning Fine-Tuning](https://awesome-repositories.com/f/artificial-intelligence-ml/language-model-fine-tuning/reinforcement-learning-fine-tuning.md) — Implements reinforcement learning algorithms to fine-tune language models based on reward signals. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyRL_LLM))
- [Document Chunking & Embedding](https://awesome-repositories.com/f/artificial-intelligence-ml/language-model-orchestration/retrieval-augmented-generation/rag-pipelines/document-chunking-embedding.md) — Splits documents into fixed-length segments with configurable overlap for retrieval pipelines. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyRAG))
- [Large Language Model Integration](https://awesome-repositories.com/f/artificial-intelligence-ml/large-language-models/large-language-model-integration.md) — Provides a unified interface for calling LLMs for text cleaning, extraction, or generation. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/CDDRS))
- [Causal Masking](https://awesome-repositories.com/f/artificial-intelligence-ml/masked-language-modeling/causal-masking.md) — Applies lower-triangular masks so each token can only attend to itself and preceding tokens. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/Qwen-blog))
- [Dual-Engine Evaluation Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/model-evaluation-tools/evaluator-model-configurators/dual-engine-evaluation-pipelines.md) — Ships a two-stage inference-and-evaluation pipeline scoring outputs with F1, ROUGE, and accuracy. ([source](https://cdn.jsdelivr.net/gh/datawhalechina/tiny-universe@main/README.md))
- [Accuracy Calculators](https://awesome-repositories.com/f/artificial-intelligence-ml/prediction-visualization/accuracy-calculators.md) — Calculates accuracy as the proportion of exact matches between predicted and correct answers. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyEval))
- [Text Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/text-to-numeric-transformations/text-embeddings.md) — Converts document text into dense vector representations for similarity-based retrieval. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/CDDRS))
- [Vector Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings.md) — Converts text documents into vector representations using pluggable embedding models. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyRAG))
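
The fixed-length chunking with configurable overlap listed above can be sketched as follows. Measuring length in characters and the default sizes are assumptions for illustration; a real pipeline might count tokens instead.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-length character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

Each chunk repeats the last `overlap` characters of the previous one, so a sentence cut at a boundary still appears whole in at least one neighbouring chunk when it is shorter than the overlap.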

### Part of an Awesome List

- [Jupyter Notebook Collections](https://awesome-repositories.com/f/awesome-lists/learning/jupyter-notebook-collections.md) — Delivers multiple independent AI subsystem implementations as self-contained Jupyter notebooks.
- [Self-Attention Implementations](https://awesome-repositories.com/f/awesome-lists/ai/attention-mechanisms/self-attention-implementations.md) — Implements self-attention from scratch where queries, keys, and values derive from the same input. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyTransformer))
- [Autonomous Task Agents](https://awesome-repositories.com/f/awesome-lists/ai/autonomous-task-agents.md) — Creates autonomous agents that use reasoning and acting patterns to plan tasks and execute APIs.
- [ReAct Deployments](https://awesome-repositories.com/f/awesome-lists/ai/conversational-agents/react-deployments.md) — Constructs a ReAct-style agent that reasons, selects tools, and integrates results into responses. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyAgent))
- [Iterative Noise Removal](https://awesome-repositories.com/f/awesome-lists/ai/image-reconstruction/iterative-noise-removal.md) — Generates new images by iteratively denoising random Gaussian noise using a trained diffusion model. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyDiffusion))
- [Diffusion Models](https://awesome-repositories.com/f/awesome-lists/ai/diffusion-models.md) — Implements a denoising diffusion probabilistic model for image generation from pure noise.
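
For the diffusion entries above, the forward (noising) half of a DDPM can be written in a few lines. The sketch below assumes the linear beta schedule from the original DDPM paper (1e-4 to 0.02 over 1000 steps) and uses plain lists in place of image tensors; it does not show the learned reverse process, and the function names are illustrative.

```python
import math
import random
from typing import List


def linear_betas(num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02) -> List[float]:
    """Linearly spaced noise variances beta_1 .. beta_T."""
    if num_steps == 1:
        return [beta_start]
    span = beta_end - beta_start
    return [beta_start + span * t / (num_steps - 1) for t in range(num_steps)]


def cumulative_alphas(betas: List[float]) -> List[float]:
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def q_sample(x0: List[float], t: int, alpha_bars: List[float], rng: random.Random) -> List[float]:
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I) in one step."""
    signal = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [signal * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


alpha_bars = cumulative_alphas(linear_betas(1000))
rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]  # toy "image" with three pixels
x_late = q_sample(x0, t=999, alpha_bars=alpha_bars, rng=rng)
```

Because `alpha_bars` shrinks toward zero, samples at late timesteps are almost pure Gaussian noise, which is the starting point that iterative denoising later works back from.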

### Data & Databases

- [Knowledge Graph Construction Tools](https://awesome-repositories.com/f/data-databases/knowledge-graph-construction-tools.md) — Extracts entities and relationships from unstructured documents using LLMs and stores them as graph structures. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyGraphRAG))
- [Hierarchical Community Clustering](https://awesome-repositories.com/f/data-databases/anomaly-detection/graph-community-detection/hierarchical-community-clustering.md) — Partitions knowledge graphs into nested communities and generates LLM summaries for each level.
- [Community Summarizations](https://awesome-repositories.com/f/data-databases/anomaly-detection/graph-community-detection/hierarchical-community-clustering/community-summarizations.md) — Generates textual summaries for each community by feeding nodes and edges to an LLM. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyGraphRAG))
- [Entity Resolution](https://awesome-repositories.com/f/data-databases/entity-resolution.md) — Identifies and merges multiple references to the same real-world entity using LLM comparison. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyGraphRAG))
- [Hybrid Vector-Graph Databases](https://awesome-repositories.com/f/data-databases/hybrid-vector-graph-databases.md) — Combines a Neo4j knowledge graph with a vector database for associative and similarity-based retrieval.
- [Vector Storage](https://awesome-repositories.com/f/data-databases/local-first-storage/vector-storage.md) — Persists document vectors locally and retrieves relevant segments via cosine similarity. ([source](https://github.com/datawhalechina/tiny-universe/tree/main/content/TinyRAG))
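
Retrieval by cosine similarity, as described in the vector storage entry, reduces to ranking stored vectors against a query vector. The sketch below keeps vectors in an in-memory dictionary; the document ids and two-dimensional vectors are toy values, and a real store would hold embeddings produced by a model and persist them to disk.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: Sequence[float], store: Dict[str, Sequence[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the k stored documents most similar to the query, best first."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy store with hypothetical document ids.
store = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [1.0, 1.0]}
hits = top_k([1.0, 0.1], store, k=2)
```

The query points mostly along the first axis, so `doc_a` ranks first and `doc_c`, which shares that direction partially, ranks second.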

### Education & Learning Resources

- [GraphRAG Integrations](https://awesome-repositories.com/f/education-learning-resources/ai-development-curricula/graphrag-integrations.md) — Constructs a minimal GraphRAG pipeline integrating graph construction, retrieval, reasoning, and generation. ([source](https://cdn.jsdelivr.net/gh/datawhalechina/tiny-universe@main/README.md))
- [Educational Implementations](https://awesome-repositories.com/f/education-learning-resources/educational-resources/systems-applied-computing/machine-learning-education/llm-engineering-guides/transformer-model-tutorials/educational-implementations.md) — Offers from-scratch implementations of transformer architectures and language model internals for learning.

### Scientific & Mathematical Computing

- [Transformer Implementations](https://awesome-repositories.com/f/scientific-mathematical-computing/from-scratch-implementations/transformer-implementations.md) — Builds a complete Transformer model from the original paper specification using only PyTorch.

### Software Engineering & Architecture

- [Evaluation Pipelines](https://awesome-repositories.com/f/software-engineering-architecture/training-pipelines/two-stage/evaluation-pipelines.md) — Runs model inference on a dataset then scores outputs against reference answers using F1, ROUGE, and accuracy.
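
The two-stage shape of such a pipeline, inference first and scoring second, can be sketched as below with exact-match accuracy as the metric. The `input` and `output` field names and `toy_model` are hypothetical; they stand in for whatever dataset schema and model wrapper a real run would use.

```python
from typing import Callable, Dict, List


def run_inference(model: Callable[[str], str], dataset: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Stage 1: generate a prediction per example and keep it next to the reference."""
    return [{"prediction": model(ex["input"]), "reference": ex["output"]} for ex in dataset]


def score_accuracy(records: List[Dict[str, str]]) -> float:
    """Stage 2: share of predictions that exactly match the reference after trimming."""
    if not records:
        return 0.0
    hits = sum(r["prediction"].strip() == r["reference"].strip() for r in records)
    return hits / len(records)


def toy_model(prompt: str) -> str:
    """Stand-in for a real model (assumption): answers from a lookup table."""
    return {"2+2=": "4", "capital of France?": "Paris"}.get(prompt, "unknown")


dataset = [
    {"input": "2+2=", "output": "4"},
    {"input": "capital of France?", "output": "Paris"},
    {"input": "3+3=", "output": "6"},
]
records = run_inference(toy_model, dataset)
accuracy = score_accuracy(records)
```

Keeping the stages separate means the saved predictions can be rescored with a different metric without running the model again.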

### Testing & Quality Assurance

- [LLM Evaluation](https://awesome-repositories.com/f/testing-quality-assurance/model-testing/llm-evaluation.md) — Provides quantitative evaluation tools measuring generative quality with F1, ROUGE, and Inception Score.
