# datawhalechina/fun-rec

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/datawhalechina-fun-rec).**

7,177 stars · 1,012 forks · Python

## Links

- GitHub: https://github.com/datawhalechina/fun-rec
- awesome-repositories: https://awesome-repositories.com/repository/datawhalechina-fun-rec.md

## Topics

`algorithm-engineering` `deep-learning` `interview-questions` `machine-learning` `recommendation-algorithms` `recommender-system` `tensorflow` `tianchi-competition`

## Description

fun-rec is a learning guide and framework for building personalized recommendation systems, covering everything from deep learning ranking to generative recommendation paradigms. It provides instructional content on constructing industrial-grade architectures that span offline data processing and real-time online serving.

The project distinguishes itself by focusing on generative recommendation: it treats recommendation as a sequence-to-sequence task in which large language models and transformer models generate item identifiers directly, rather than scoring and ranking a candidate list. It also covers list diversification strategies and diffusion-based data augmentation, which is used to improve model robustness.

The system covers the full recommendation pipeline: candidate retrieval through vector embeddings and collaborative filtering, preference prediction using feature crossing and sequence modeling, and final list optimization via greedy re-ranking algorithms. It also addresses operational challenges such as the cold start problem and the deployment of hybrid offline-online pipelines.
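The retrieval and re-ranking stages described above can be illustrated with a minimal, standard-library sketch. It is not taken from the fun-rec codebase: the function names (`retrieve`, `mmr_rerank`), the toy vectors, and the choice of Maximal Marginal Relevance (MMR) as the greedy re-ranker are all assumptions made for illustration, since the description does not say which greedy algorithm the project uses.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(user_vec, item_vecs, k):
    """Candidate retrieval: the k item ids whose embeddings are closest to the user vector."""
    ranked = sorted(item_vecs, key=lambda i: cosine(user_vec, item_vecs[i]), reverse=True)
    return ranked[:k]


def mmr_rerank(user_vec, item_vecs, candidates, n, lam=0.7):
    """Greedy re-ranking with MMR (an assumed technique, not necessarily the project's).

    At each step, pick the candidate that maximises
    lam * relevance - (1 - lam) * (max similarity to anything already selected).
    """
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < n:

        def mmr_score(i):
            relevance = cosine(user_vec, item_vecs[i])
            redundancy = max(
                (cosine(item_vecs[i], item_vecs[j]) for j in selected), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical two-dimensional embeddings, for illustration only.
item_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
user_vec = [1.0, 0.2]

candidates = retrieve(user_vec, item_vecs, k=3)
final_list = mmr_rerank(user_vec, item_vecs, candidates, n=2, lam=0.5)
```

With `lam=1.0` the re-ranker reduces to plain relevance ordering; lowering `lam` trades relevance for variety, which is the balance the diversification tags below refer to.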

## Tags

### Artificial Intelligence & ML

- [Generative Recommendation Models](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-recommendation-models.md) — Implements a generative paradigm where large language models generate item identifiers directly as a sequence-to-sequence task.
- [Generative](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-models/generative.md) — Implements generative recommendation paradigms using LLMs and diffusion models to generate item suggestions directly.
- [Sequence Interaction Modeling](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-interaction-models/sequence-interaction-modeling.md) — Combines historical user behavior with cross-feature interactions to predict the probability of future engagement.
- [Candidate Generation](https://awesome-repositories.com/f/artificial-intelligence-ml/large-scale-training/distributed-dataset-generators/candidate-generation.md) — Optimizes the extraction of potential items from large datasets using vector embeddings and collaborative filtering.
- [Embedding-Based Retrieval](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/recommendation-engines/embedding-based-retrieval.md) — Employs vector representations and similarity searches to efficiently retrieve candidate items from large databases.
- [Recommendation Engine Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/recommendation-engines/recommendation-engine-pipelines.md) — Coordinates the full recommendation lifecycle from feature engineering and model training to real-time online serving. ([source](https://github.com/datawhalechina/fun-rec/tree/master/web_project))
- [Personalized Recommendation Retrieval](https://awesome-repositories.com/f/artificial-intelligence-ml/personalized-recommendation-retrieval.md) — Identifies potential item matches by filtering large datasets through collaborative filtering and embedding-based retrieval. ([source](https://github.com/datawhalechina/fun-rec/blob/master/README_en.md))
- [Preference Prediction](https://awesome-repositories.com/f/artificial-intelligence-ml/preference-prediction.md) — Predicts user preferences and the likelihood of engagement using feature crossing and sequence modeling. ([source](https://github.com/datawhalechina/fun-rec#readme))
- [Recommendation Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-architectures.md) — Provides a framework for constructing personalized content architectures using deep learning, cascade models, and generative paradigms. ([source](https://github.com/datawhalechina/fun-rec/blob/master/pyproject.toml))
- [Generative](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-architectures/generative.md) — Provides architectural patterns for replacing traditional ranking lists with end-to-end item generation using transformers.
- [Generative Recommendation Modeling](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-models/generative-recommendation-modeling.md) — Generates item suggestions instead of ranking lists using large language models and diffusion models. ([source](https://github.com/datawhalechina/fun-rec#readme))
- [Recommendation Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-pipelines.md) — Builds end-to-end infrastructure spanning offline data pipelines, online serving, and deployment operations. ([source](https://github.com/datawhalechina/fun-rec/blob/master/README.md))
- [Recommender Systems](https://awesome-repositories.com/f/artificial-intelligence-ml/recommender-systems.md) — Integrates offline pipelines and online serving processes into a complete industrial-grade recommendation architecture. ([source](https://github.com/datawhalechina/fun-rec#readme))
- [Generative Item Suggestion](https://awesome-repositories.com/f/artificial-intelligence-ml/recommender-systems/recommendation-list-generators/generative-item-suggestion.md) — Creates end-to-end item suggestions using large language models and specialized tokenizers. ([source](https://github.com/datawhalechina/fun-rec/blob/master/README_en.md))
- [Diffusion-Based Augmentations](https://awesome-repositories.com/f/artificial-intelligence-ml/denoising-pre-training-augmentations/diffusion-based-augmentations.md) — Uses diffusion-based noise and denoising processes to enhance training sequences and model robustness. ([source](https://github.com/datawhalechina/fun-rec/blob/master/README.md))
- [Greedy Re-Ranking Algorithms](https://awesome-repositories.com/f/artificial-intelligence-ml/greedy-re-ranking-algorithms.md) — Applies greedy search algorithms to the final item list to maximize diversity and prevent repetitive content.
- [Diversity Optimization Strategies](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-engines/recommendation-scoring-and-ranking/diversity-optimization-strategies.md) — Implements methods for optimizing list variety using greedy algorithms and re-ranking models.
- [List Diversification](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-engines/recommendation-scoring-and-ranking/list-diversification.md) — Applies re-ranking algorithms to balance relevance with variety and prevent information silos.
- [Result Diversification](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-models/slate-recommenders/result-diversification.md) — Prevents information silos by reordering items to break continuous sequences of the same category. ([source](https://github.com/datawhalechina/fun-rec/tree/master/web_project))
- [Cold Start Solvers](https://awesome-repositories.com/f/artificial-intelligence-ml/recommender-systems/cold-start-solvers.md) — Provides relevant suggestions for new users or items by balancing exploration and exploitation. ([source](https://github.com/datawhalechina/fun-rec/tree/master/web_project))
- [Diffusion-Based](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-generation/diffusion-based.md) — Utilizes denoising diffusion models to generate synthetic training sequences for improved model robustness.
- [Probability-Based Ranking](https://awesome-repositories.com/f/artificial-intelligence-ml/side-by-side-preference-ranking/probability-based-ranking.md) — Sorts candidates by predicting the likelihood of a user clicking an item using deep factorized models. ([source](https://github.com/datawhalechina/fun-rec/tree/master/web_project))
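
The Result Diversification entry above describes reordering items to break continuous sequences of the same category. One simple way to do that is sketched below; the function name `break_category_runs`, the item ids, and the category labels are hypothetical, and the project's actual reordering rule may differ.

```python
def break_category_runs(ranked, category_of):
    """Greedily reorder a ranked list so adjacent items differ in category where possible.

    At each position, take the highest-ranked remaining item whose category differs
    from the previously placed item; if none exists, fall back to the highest-ranked
    remaining item.
    """
    remaining = list(ranked)
    result = []
    while remaining:
        last_category = category_of[result[-1]] if result else None
        pick = next(
            (item for item in remaining if category_of[item] != last_category),
            remaining[0],
        )
        result.append(pick)
        remaining.remove(pick)
    return result


# Hypothetical ranked list (best first) and category lookup.
ranked = ["n1", "n2", "n3", "s1", "m1"]
category_of = {"n1": "news", "n2": "news", "n3": "news", "s1": "sports", "m1": "music"}

diversified = break_category_runs(ranked, category_of)
```

The rule keeps the original ranking order within each category and only moves an item up when that avoids repeating the previous category, so relevance is disturbed as little as possible.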

### Content Management & Publishing

- [Candidate Retrieval APIs](https://awesome-repositories.com/f/content-management-publishing/headless-api-driven-services/content-delivery-apis/candidate-retrieval-apis.md) — Implements candidate retrieval mechanisms using vector representations and similarity searches to narrow the search space. ([source](https://github.com/datawhalechina/fun-rec#readme))

### Education & Learning Resources

- [Deep Learning Tutorials](https://awesome-repositories.com/f/education-learning-resources/deep-learning-tutorials.md) — Offers instructional content on predicting preferences through collaborative filtering and deep factorized models.
- [Generative Recommendation Tutorials](https://awesome-repositories.com/f/education-learning-resources/llm-tutorials/generative-recommendation-tutorials.md) — Serves as a learning guide for building personalized content suggestion systems using LLMs.

### Software Engineering & Architecture

- [Cascade Recommendation Architectures](https://awesome-repositories.com/f/software-engineering-architecture/cascade-recommendation-architectures.md) — Implements a multi-stage cascade architecture to filter massive item pools through retrieval and ranking stages.
- [Hybrid Offline-Online Recommendation Pipelines](https://awesome-repositories.com/f/software-engineering-architecture/hybrid-offline-online-recommendation-pipelines.md) — Provides an industrial-grade architecture separating heavy offline model training from real-time online serving for low latency.

### User Interface & Experience

- [Item-to-Item Similarity](https://awesome-repositories.com/f/user-interface-experience/form-builders/builder-item-collapsers/builder-item-managers/list-item-markers/item-to-item-similarity.md) — Identifies potential items for users by calculating similarity between items for related content suggestions. ([source](https://github.com/datawhalechina/fun-rec/tree/master/web_project))
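
Item-to-item similarity of the kind described above is often computed from co-occurrence in user interaction logs. The sketch below uses cosine similarity over the sets of users who interacted with each item; the function names and the toy interaction log are assumptions for illustration and are not drawn from the fun-rec `web_project` code.

```python
import math
from collections import defaultdict


def item_similarities(interactions):
    """Build a symmetric item-item similarity table from (user, item) pairs.

    sim(i, j) = |users(i) & users(j)| / sqrt(|users(i)| * |users(j)|)
    """
    users_by_item = defaultdict(set)
    for user, item in interactions:
        users_by_item[item].add(user)

    sims = defaultdict(dict)
    items = list(users_by_item)
    for idx, i in enumerate(items):
        for j in items[idx + 1:]:
            overlap = len(users_by_item[i] & users_by_item[j])
            if overlap:
                score = overlap / math.sqrt(
                    len(users_by_item[i]) * len(users_by_item[j])
                )
                sims[i][j] = score
                sims[j][i] = score
    return sims


def similar_items(sims, item, k):
    """Return up to k item ids most similar to `item`, best first."""
    neighbours = sims.get(item, {})
    return sorted(neighbours, key=neighbours.get, reverse=True)[:k]


# Hypothetical interaction log: (user id, item id).
interactions = [
    ("u1", "A"), ("u1", "B"),
    ("u2", "A"), ("u2", "B"), ("u2", "C"),
    ("u3", "C"),
]
sims = item_similarities(interactions)
related_to_a = similar_items(sims, "A", k=2)
```

The pairwise loop is quadratic in the number of items, which is fine for a demonstration; at catalog scale this table is typically built offline, in line with the hybrid offline-online split mentioned earlier.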

### Part of an Awesome List

- [Recommendation Reasoning](https://awesome-repositories.com/f/awesome-lists/ai/reasoning-frameworks/recommendation-reasoning.md) — Aligns item indices using linguistic semantics and reasoning frameworks to improve the logic behind content suggestions. ([source](https://github.com/datawhalechina/fun-rec/blob/master/README_en.md))

### Data & Databases

- [Diversity-Aware](https://awesome-repositories.com/f/data-databases/similarity-search/diversity-aware.md) — Balances relevance with variety by re-ranking items using greedy algorithms or transformer models. ([source](https://github.com/datawhalechina/fun-rec/blob/master/README_en.md))
