# bytedance/monolith

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/bytedance-monolith).**

9,271 stars · 714 forks · Python · other · archived

## Links

- GitHub: https://github.com/bytedance/monolith
- awesome-repositories: https://awesome-repositories.com/repository/bytedance-monolith.md

## Description

Monolith is a distributed framework and asynchronous training engine for building and training large-scale deep learning recommendation models. It trains models on massive datasets across multiple compute nodes using asynchronous updates.

The system features a dedicated embedding table manager that creates unique, feature-isolated tables to prevent representation collisions. It also includes a real-time weight updater to capture immediate changes in user interest and data hotspots through continuous parameter synchronization.
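The idea of feature-isolated tables can be illustrated with a minimal sketch. This is not Monolith's API: the class `FeatureEmbeddingTables`, its methods, and the feature names `user_id` and `item_id` are hypothetical, and plain Python dicts stand in for real embedding storage. The point it shows is that giving each feature its own table keeps identical raw IDs from sharing one representation.

```python
import random


class FeatureEmbeddingTables:
    """Hypothetical sketch: one dict-backed embedding table per feature name."""

    def __init__(self, dim, seed=0):
        self.dim = dim
        self._rng = random.Random(seed)
        self._tables = {}  # feature name -> {raw id -> embedding vector}

    def lookup(self, feature, raw_id):
        # Each feature gets its own table, so the key space is never shared.
        table = self._tables.setdefault(feature, {})
        if raw_id not in table:
            # Lazily create a small random vector for an ID seen for the first time.
            table[raw_id] = [self._rng.uniform(-0.05, 0.05) for _ in range(self.dim)]
        return table[raw_id]

    def table_sizes(self):
        return {name: len(table) for name, table in self._tables.items()}


tables = FeatureEmbeddingTables(dim=4)
user_vec = tables.lookup("user_id", 42)  # ID 42 as a user
item_vec = tables.lookup("item_id", 42)  # the same raw ID as an item
```

Here `user_vec` and `item_vec` are separate vectors even though both come from the raw ID 42, whereas a single shared table keyed only by the raw ID would return the same entry for both.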

The framework also orchestrates distributed compute nodes, administers parameter servers, and builds deep learning model graphs for recommendation tasks. These capabilities support asynchronous gradient updates and the management of complex feature representations.
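Asynchronous gradient updates against a parameter server can be sketched in a few lines. The example below is a single-process simulation under stated assumptions, not Monolith code: `ParameterServer`, `grad_fn`, and `run_async` are hypothetical names, the loss is a toy quadratic, and a round-robin loop stands in for real worker scheduling. Each worker computes a gradient from its own possibly stale snapshot and pushes it without waiting for the others.

```python
class ParameterServer:
    """Hypothetical in-process stand-in for a parameter server holding one weight."""

    def __init__(self, weight=0.0, lr=0.05):
        self.weight = weight
        self.lr = lr
        self.version = 0  # number of updates applied so far

    def pull(self):
        return self.weight, self.version

    def push(self, grad):
        # Apply the gradient immediately; no barrier waits for other workers.
        self.weight -= self.lr * grad
        self.version += 1


def grad_fn(w, target=3.0):
    # Gradient of the toy loss (w - target) ** 2.
    return 2.0 * (w - target)


def run_async(num_workers=4, steps=400):
    server = ParameterServer()
    # Every worker starts from a snapshot of the initial weight.
    snapshots = [server.pull() for _ in range(num_workers)]
    max_staleness = 0
    for step in range(steps):
        worker = step % num_workers  # round-robin stands in for real scheduling
        w_seen, version_seen = snapshots[worker]
        max_staleness = max(max_staleness, server.version - version_seen)
        server.push(grad_fn(w_seen))       # gradient from a possibly stale weight
        snapshots[worker] = server.pull()  # refresh only this worker's view
    return server.weight, max_staleness


final_weight, max_staleness = run_async()
```

In this simulation every gradient is at most `num_workers - 1` updates stale, and the weight still settles at the target. Real deployments trade that staleness for throughput, since no worker blocks on a global synchronization step.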

## Tags

### Artificial Intelligence & ML

- [Recommendation Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-architectures.md) — Provides a deep learning framework specifically designed for building and training large-scale recommendation models. ([source](https://github.com/bytedance/monolith/tree/master/markdown/demo/))
- [Recommendation Models](https://awesome-repositories.com/f/artificial-intelligence-ml/recommendation-models.md) — Offers a framework for building and training large-scale recommendation models using complex feature representations.
- [Distributed Deep Learning](https://awesome-repositories.com/f/artificial-intelligence-ml/distributed-deep-learning.md) — Provides a system for scaling deep learning model training across multiple compute nodes and GPUs.
- [Distributed Training](https://awesome-repositories.com/f/artificial-intelligence-ml/distributed-training-frameworks/distributed-training.md) — Enables distributed training across multiple compute nodes to handle large datasets and complex recommendation tasks. ([source](https://cdn.jsdelivr.net/gh/bytedance/monolith@master/README.md))
- [Embedding Table Isolation](https://awesome-repositories.com/f/artificial-intelligence-ml/feature-scale-normalization/embedding-table-isolation.md) — Allocates feature-isolated embedding tables to prevent representation collisions in large-scale recommendation models.
- [Distributed Gradient Synchronization](https://awesome-repositories.com/f/artificial-intelligence-ml/gradient-computation/distributed-gradient-synchronization.md) — Provides mechanisms for coordinating asynchronous gradient updates across multiple compute nodes to reduce distributed training bottlenecks.
- [Large-Scale Training Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/large-scale-training-frameworks.md) — Provides a large-scale training framework for deep learning architectures used in personalized content delivery.
- [Asynchronous Training](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/distributed-and-scaling-strategies/asynchronous-training-utilities/asynchronous-training.md) — Implements an asynchronous training engine that decouples gradient updates to handle massive datasets across compute nodes.
- [Dynamic Weight Updates](https://awesome-repositories.com/f/artificial-intelligence-ml/model-weight-management/dynamic-weight-updates.md) — Updates model parameters in real time to capture immediate changes in user interest and data hotspots.
- [Parameter Servers](https://awesome-repositories.com/f/artificial-intelligence-ml/parameter-servers.md) — Utilizes a parameter server architecture to distribute and synchronize model weights across a cluster of servers.
- [Real-Time Parameter Synchronization](https://awesome-repositories.com/f/artificial-intelligence-ml/real-time-parameter-synchronization.md) — Ships a real-time weight updater that continuously synchronizes parameters to capture immediate changes in user interest.
- [Embedding Table Management](https://awesome-repositories.com/f/artificial-intelligence-ml/speaker-embeddings/embedding-management/embedding-table-management.md) — Manages unique embedding tables for different identity features to prevent representation collisions in large models.
- [Feature Hashing](https://awesome-repositories.com/f/artificial-intelligence-ml/vector-embeddings/feature-hashing.md) — Uses specialized embedding strategies and feature hashing to isolate identity features and prevent collisions.
- [Compute Graph Builders](https://awesome-repositories.com/f/artificial-intelligence-ml/compute-graph-builders.md) — Includes utilities for constructing structured computational graphs that map feature transformations and layer connections.
- [Deep Learning Architectures](https://awesome-repositories.com/f/artificial-intelligence-ml/deep-learning-architectures.md) — Supports the construction of complex deep learning architectures specifically tailored for recommendation tasks.
- [Feature Embedding Tables](https://awesome-repositories.com/f/artificial-intelligence-ml/speaker-embeddings/embedding-management/feature-embedding-tables.md) — Includes a dedicated manager that creates isolated embedding tables to prevent collisions between different identity features.

### Data & Databases

- [Parameter Synchronization](https://awesome-repositories.com/f/data-databases/real-time-data-synchronization/parameter-synchronization.md) — Continuously pushes parameter updates from training nodes to servers to capture immediate user trends and data hotspots.
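
One way to picture continuous parameter synchronization is as incremental delta pushes: the training side records which parameters changed and ships only those to the serving side. The sketch below assumes that design for illustration only; `TrainingStore`, `ServingReplica`, and the `item:` keys are hypothetical and do not come from the Monolith codebase.

```python
class TrainingStore:
    """Hypothetical training-side parameter store that tracks changed keys."""

    def __init__(self):
        self.params = {}
        self._dirty = set()

    def update(self, key, value):
        self.params[key] = value
        self._dirty.add(key)

    def drain_delta(self):
        # Return only the parameters changed since the last drain.
        delta = {key: self.params[key] for key in self._dirty}
        self._dirty.clear()
        return delta


class ServingReplica:
    """Hypothetical serving-side copy that applies incremental updates."""

    def __init__(self):
        self.params = {}

    def apply(self, delta):
        self.params.update(delta)


trainer = TrainingStore()
replica = ServingReplica()

trainer.update("item:1", 0.10)
trainer.update("item:2", 0.20)
first_delta = trainer.drain_delta()
replica.apply(first_delta)

trainer.update("item:2", 0.25)  # a hot item keeps changing
second_delta = trainer.drain_delta()
replica.apply(second_delta)
```

The second push carries only the entry for `item:2`, so frequently changing parameters reach the serving copy without resending the whole model.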

### DevOps & Infrastructure

- [Training Node Orchestration](https://awesome-repositories.com/f/devops-infrastructure/multi-node-orchestration/training-node-orchestration.md) — Orchestrates distributed compute nodes to spread deep learning workloads and process massive datasets through parallel execution.

### Software Engineering & Architecture

- [Embedding Collision Prevention](https://awesome-repositories.com/f/software-engineering-architecture/contract-upgradeability/storage-layout-preservation/storage-collision-prevention/embedding-collision-prevention.md) — Creates unique embedding tables for different identity features to ensure distinct representations and prevent collisions. ([source](https://cdn.jsdelivr.net/gh/bytedance/monolith@master/README.md))

### Development Tools & Productivity

- [Real-Time Runtime Updates](https://awesome-repositories.com/f/development-tools-productivity/configuration-updates/real-time-runtime-updates.md) — Performs real-time weight updates to reflect changing user interests without interrupting the training process. ([source](https://cdn.jsdelivr.net/gh/bytedance/monolith@master/README.md))