# magic-research/magic-animate

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/magic-research-magic-animate).**

10,908 stars · 1,088 forks · Python · BSD-3-Clause

## Links

- GitHub: https://github.com/magic-research/magic-animate
- Homepage: https://showlab.github.io/magicanimate/
- awesome-repositories: https://awesome-repositories.com/repository/magic-research-magic-animate.md

## Description

Magic Animate is a diffusion-based video generator for human image animation. It turns a static photo of a person into a temporally consistent video by transferring the movements from a reference motion clip, so a single image is enough to produce a realistic animation.

The system keeps the output visually stable and reduces flicker through temporal attention injection and motion-controlled noise scheduling. To speed up high-resolution video generation, it includes a distributed inference path that spreads the workload across multiple GPUs, with each device handling a subset of the video frames.

The project covers the full animation pipeline: appearance encoding, the denoising process, and a two-stage training regime. It provides both single-GPU and multi-GPU execution paths, plus a Gradio web interface for uploading assets and previewing results.
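
To make the multi-GPU path more concrete, here is a minimal sketch of how frames might be divided among devices. It is an illustration only, not the repository's code: the function name `partition_frames` and the choice of contiguous, near-equal chunks are assumptions made for this example.

```python
def partition_frames(num_frames, num_devices):
    """Split frame indices 0..num_frames-1 into one contiguous chunk per device.

    Hypothetical helper for illustration; the real project may assign
    frames to GPUs differently.
    """
    if num_frames < 0 or num_devices < 1:
        raise ValueError("need num_frames >= 0 and num_devices >= 1")

    # Every device gets `base` frames; the first `extra` devices get one more.
    base, extra = divmod(num_frames, num_devices)

    chunks = []
    start = 0
    for device in range(num_devices):
        size = base + (1 if device < extra else 0)
        chunks.append(list(range(start, start + size)))
        start += size
    return chunks


if __name__ == "__main__":
    for device, frames in enumerate(partition_frames(10, 4)):
        print(f"device {device}: frames {frames}")
```

Contiguous chunks keep neighbouring frames on the same device, which is a natural fit when temporal layers look at adjacent frames, but that is a design assumption of this sketch rather than a documented property of the project.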

## Tags

### Graphics & Multimedia

- [Image-to-Video Animators](https://awesome-repositories.com/f/graphics-multimedia/image-editing-processing/image-processing/image-sequence-processors/animation-frame-sequencers/generative-animation-sequences/image-to-video-animators.md) — Transforms static human photos into temporally consistent videos by mapping movements from a reference motion clip.

### Artificial Intelligence & ML

- [Temporal Attention](https://awesome-repositories.com/f/artificial-intelligence-ml/attention-mechanisms/spatio-temporal-attention/temporal-attention.md) — Injects cross-frame attention layers into the diffusion model to enforce temporal consistency across video frames.
- [Cross-Frame Attention Layers](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-pipelines/text-to-video-generators/cross-attention-conditioning/cross-frame-attention-layers.md) — Implements cross-frame attention injection to ensure visual stability and minimize flicker across the generated video sequence.
- [Video Diffusion Models](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-models/latent-diffusion-models/video-diffusion-models.md) — Utilizes a video diffusion model to iteratively denoise latent representations for temporally consistent animation.
- [Latent Conditioning Mechanisms](https://awesome-repositories.com/f/artificial-intelligence-ml/latent-conditioning-mechanisms.md) — Uses an appearance encoder to provide latent conditioning that steers the denoising process for animation.
- [Multimodal Image Encoders](https://awesome-repositories.com/f/artificial-intelligence-ml/multimodal-image-encoders.md) — Encodes a static human image into a numerical representation to condition the denoising UNet.
- [Sparse-Frame Appearance Encoders](https://awesome-repositories.com/f/artificial-intelligence-ml/sparse-frame-appearance-encoders.md) — Encodes the person's visual identity from a reference image to maintain consistency across the generated video.
- [Temporally Consistent Embeddings](https://awesome-repositories.com/f/artificial-intelligence-ml/temporally-consistent-embeddings.md) — Ensures visual representations remain stable and coherent across consecutive frames to minimize flicker.
- [Distributed Model Parallelism](https://awesome-repositories.com/f/artificial-intelligence-ml/distributed-model-parallelism.md) — Spreads diffusion inference across multiple GPUs by assigning a specific subset of temporal frames to each device.
- [Staged Training Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-models/diffusion-models/diffusion-model-training/staged-training-pipelines.md) — Implements a staged training strategy that optimizes appearance and temporal modules separately before performing global fine-tuning.
- [Multi-GPU Video Inference Accelerators](https://awesome-repositories.com/f/artificial-intelligence-ml/gpu-accelerated-inference/multi-gpu-video-inference-accelerators.md) — Distributes video generation workloads across multiple GPUs to reduce inference time for high-resolution output.
- [Model Parallelism](https://awesome-repositories.com/f/artificial-intelligence-ml/machine-learning/infrastructure/model-training-and-tuning/training-frameworks/model-training-pipelines/model-parallelism.md) — Distributes the diffusion workload across multiple GPUs, with each processor handling specific temporal frames.
- [Motion-Driven Schedulers](https://awesome-repositories.com/f/artificial-intelligence-ml/noise-level-sampling-strategies/per-frame-noise-level-schedulers/motion-driven-schedulers.md) — Coordinates the diffusion process by aligning noise patterns with a driving motion sequence to guide frame generation.
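
The temporal attention idea named in the tags above can be illustrated with a small, dependency-free sketch. It is a simplified stand-in, not the project's implementation: the function name `temporal_self_attention` is hypothetical, and the learned query, key, and value projections of a real attention layer are replaced by the identity so the example stays short. Each spatial position attends only to the same position in the other frames, which is what lets information flow along the time axis.

```python
import math


def temporal_self_attention(frames):
    """Apply self-attention along the frame axis.

    frames[t][p] is the feature vector (list of floats) of frame t at
    spatial position p. Illustrative assumption: queries, keys, and values
    are the raw features, with no learned projections.
    """
    num_frames = len(frames)
    num_positions = len(frames[0])
    dim = len(frames[0][0])
    scale = 1.0 / math.sqrt(dim)

    out = [[None] * num_positions for _ in range(num_frames)]
    for p in range(num_positions):
        # Gather the sequence of features over time for this position.
        seq = [frames[t][p] for t in range(num_frames)]
        for t, query in enumerate(seq):
            # Scaled dot-product score against every frame.
            scores = [scale * sum(q * k for q, k in zip(query, key)) for key in seq]
            # Numerically stable softmax over the frame axis.
            peak = max(scores)
            weights = [math.exp(s - peak) for s in scores]
            total = sum(weights)
            weights = [w / total for w in weights]
            # Weighted average of the per-frame features.
            out[t][p] = [
                sum(w * value[c] for w, value in zip(weights, seq))
                for c in range(dim)
            ]
    return out


if __name__ == "__main__":
    # Two frames, one spatial position, one channel.
    print(temporal_self_attention([[[0.0]], [[1.0]]]))
```

Because each output is a softmax-weighted average of the same position across frames, abrupt frame-to-frame differences are pulled toward each other, which is the intuition behind using such layers to reduce flicker.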

### Part of an Awesome List

- [Motion-Aligned](https://awesome-repositories.com/f/awesome-lists/ai/gaussian-noise-diffusion/denoising-schedulers/motion-aligned.md) — Drives animation by scheduling noise patterns that follow a driving motion sequence to align frames with reference motion.
