# yangchris11/samurai

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/yangchris11-samurai).**

7,083 stars · 496 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/yangchris11/samurai
- Homepage: https://yangchris11.github.io/samurai/
- awesome-repositories: https://awesome-repositories.com/repository/yangchris11-samurai.md

## Description

SAMURAI is a zero-shot visual tracking model that adapts the Segment Anything architecture for video object segmentation. It uses a first-frame prompt, such as a bounding box or mask, to initialize tracking, then employs a motion-aware memory mechanism that stores and updates temporal motion features across frames to guide mask refinement. An online memory update strategy continuously refreshes this memory with new frame predictions, while temporal motion encoding computes optical flow between consecutive frames to inform object boundary and occlusion handling.
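The online memory update described above can be illustrated with a small, self-contained sketch. This is not SAMURAI's code: `MemoryEntry`, `OnlineMemory`, the `capacity` of recent frames, and the `min_score` confidence gate are all hypothetical names and assumptions chosen for illustration. It only shows the general idea of keeping the first-frame prompt fixed while a bounded window of recent predictions is refreshed frame by frame.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MemoryEntry:
    """Hypothetical record of one frame's prediction."""
    frame_index: int
    features: list   # stand-in for whatever per-frame features a tracker stores
    score: float     # stand-in confidence for this frame's prediction


class OnlineMemory:
    """Illustrative bounded memory: fixed prompt entry plus recent predictions."""

    def __init__(self, capacity=7, min_score=0.5):
        self.min_score = min_score            # assumed confidence gate
        self.prompt = None                    # first-frame entry, never evicted
        self.recent = deque(maxlen=capacity)  # oldest entries fall out automatically

    def set_prompt(self, entry):
        """Store the entry built from the first-frame box or mask."""
        self.prompt = entry

    def update(self, entry):
        """Add a new frame prediction; skip it if its score is below the gate."""
        if entry.score < self.min_score:
            return False
        self.recent.append(entry)
        return True

    def frame_indices(self):
        """Frame indices currently available to condition the next prediction."""
        head = [self.prompt.frame_index] if self.prompt is not None else []
        return head + [e.frame_index for e in self.recent]


def demo_indices():
    """Run a tiny update sequence and return the surviving frame indices."""
    memory = OnlineMemory(capacity=3, min_score=0.5)
    memory.set_prompt(MemoryEntry(0, [0.0], 1.0))
    for idx, score in [(1, 0.9), (2, 0.2), (3, 0.8), (4, 0.7), (5, 0.95)]:
        memory.update(MemoryEntry(idx, [float(idx)], score))
    return memory.frame_indices()
```

In this sketch, frame 2 is rejected by the confidence gate and frame 1 is evicted once the window of three recent entries is full, so the prompt frame and the three latest accepted frames remain.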

The system targets real-time inference on standard hardware by optimizing model forward passes and memory operations. It requires no task-specific training or fine-tuning, enabling zero-shot tracking of unseen objects in video. SAMURAI also handles objects that enter the video after the initial frame, detecting and following them once they appear.

The project provides a custom video demo runner that accepts a video file or frame directory and a first-frame bounding box to produce tracking results. The project also supports semi-supervised video object segmentation, generating segmentation predictions from a pre-trained model and ground-truth masks from the first frame.
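As a rough illustration of preparing the first-frame bounding box for such a runner, the sketch below parses one line of text into corner coordinates. The comma-separated `x,y,w,h` layout and the helper name `parse_xywh` are assumptions made for this example; check the repository's README for the format the demo script actually expects.

```python
def parse_xywh(line):
    """Parse 'x,y,w,h' (assumed format) into an (x1, y1, x2, y2) tuple.

    x, y is taken to be the top-left corner and w, h the box size in pixels.
    """
    parts = [float(p) for p in line.strip().split(",")]
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated values, got {len(parts)}")
    x, y, w, h = parts
    if w <= 0 or h <= 0:
        raise ValueError("box width and height must be positive")
    # Convert top-left plus size into top-left and bottom-right corners.
    return (x, y, x + w, y + h)
```

For example, `parse_xywh("10,20,30,40")` yields a box whose corners are `(10, 20)` and `(40, 60)`. Validating the box up front gives a clear error before any model is loaded.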

## Tags

### Artificial Intelligence & ML

- [Zero-Shot Video Tracking Pipelines](https://awesome-repositories.com/f/artificial-intelligence-ml/zero-shot-inference/zero-shot-segmentations/zero-shot-video-tracking-pipelines.md) — Adapts a pre-trained segmentation model to track unseen objects without task-specific fine-tuning.
- [Motion-Guided Mask Refinements](https://awesome-repositories.com/f/artificial-intelligence-ml/agentic-systems-frameworks/agentic-workflows/iterative-refinement-workflows/mask-refinement-loops/motion-guided-mask-refinements.md) — Incorporates temporal motion cues to improve mask predictions for fast-moving or occluded objects. ([source](https://yangchris11.github.io/samurai/))
- [Segment Anything Model Adaptations](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/image-segmentation/segmentation-model-training/segmentation-model-testing/segment-anything-model-adaptations.md) — Repurposes the Segment Anything model for video tracking by injecting motion cues into its decoder.
- [Motion-Aware Video Segmentors](https://awesome-repositories.com/f/artificial-intelligence-ml/video-object-segmentations/motion-aware-video-segmentors.md) — Refines object segmentation masks by incorporating temporal motion cues for fast-moving or occluded objects. ([source](https://github.com/yangchris11/samurai/blob/master/sam2/tools/README.md))
- [Semi-Supervised Video Object Segmentors](https://awesome-repositories.com/f/artificial-intelligence-ml/video-object-segmentations/semi-supervised-video-object-segmentors.md) — Generates segmentation masks for tracked objects using a first-frame annotation and a pre-trained model.
- [Semi-Supervised Video Object Trackers](https://awesome-repositories.com/f/artificial-intelligence-ml/video-object-segmentations/semi-supervised-video-object-trackers.md) — Generates segmentation predictions using ground-truth masks from the first frame and a pre-trained model.
- [Semi-Supervised Video Segmentors](https://awesome-repositories.com/f/artificial-intelligence-ml/video-object-segmentations/semi-supervised-video-segmentors.md) — Generates segmentation predictions using a pre-trained model and ground-truth masks from the first frame. ([source](https://github.com/yangchris11/samurai/blob/master/sam2/tools/README.md))
- [Zero-Shot Object Trackers](https://awesome-repositories.com/f/artificial-intelligence-ml/video-object-tracking/zero-shot-object-trackers.md) — Leverages a motion-aware memory mechanism to track objects without requiring any training or fine-tuning. ([source](https://yangchris11.github.io/samurai/))
- [Zero-Shot Video Segmentors](https://awesome-repositories.com/f/artificial-intelligence-ml/zero-shot-inference/zero-shot-segmentations/zero-shot-video-segmentors.md) — Segments and tracks any object in video without prior training using motion-aware memory.
- [Zero-Shot Video Trackers](https://awesome-repositories.com/f/artificial-intelligence-ml/zero-shot-inference/zero-shot-segmentations/zero-shot-video-trackers.md) — Tracks any object in video without prior training or fine-tuning using a pre-trained model.
- [Late-Appearing Object Trackers](https://awesome-repositories.com/f/artificial-intelligence-ml/video-object-tracking/late-appearing-object-trackers.md) — Handles objects that enter the video after the initial frame by detecting and following them.

### Graphics & Multimedia

- [First-Frame Prompt Initializations](https://awesome-repositories.com/f/graphics-multimedia/frame-by-frame-stream-processing/first-frame-prompt-initializations.md) — Provides the initial bounding box or mask that seeds the entire tracking process.
- [Optical Flow Encodings](https://awesome-repositories.com/f/graphics-multimedia/motion-vector-calculation/motion-estimation/optical-flow-encodings.md) — Computes optical flow between consecutive frames to inform object boundary and occlusion handling.
- [Temporal Motion Feature Memories](https://awesome-repositories.com/f/graphics-multimedia/pixel-motion-analysis/motion-masks/temporal-motion-feature-memories.md) — Stores and updates temporal motion features to guide mask refinement across frames.
- [Real-Time Model Inference on Frames](https://awesome-repositories.com/f/graphics-multimedia/video-frame-processing/real-time-model-inference-on-frames.md) — Optimizes model forward passes and memory operations for real-time tracking on standard hardware.

### Software Engineering & Architecture

- [Online Memory Refreshes](https://awesome-repositories.com/f/software-engineering-architecture/file-based-project-storage/project-memory-banks/memory-knowledge-updates/memory-record-updaters/online-memory-refreshes.md) — Continuously refreshes motion-aware memory with new frame predictions for consistent tracking.
