# depthanything/depth-anything-v2

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/depthanything-depth-anything-v2).**

8,320 stars · 862 forks · Python · Apache-2.0

## Links

- GitHub: https://github.com/DepthAnything/Depth-Anything-V2
- Homepage: https://depth-anything-v2.github.io
- awesome-repositories: https://awesome-repositories.com/repository/depthanything-depth-anything-v2.md

## Topics

`monocular-depth-estimation`

## Description

Depth-Anything-V2 is a foundation model for monocular depth estimation, intended for general-purpose spatial understanding and depth perception. It predicts relative or absolute (metric) depth maps from single images or video sequences.

The project provides tools for both relative depth estimation and metric depth estimation; the metric variants estimate absolute physical distances in indoor and outdoor environments. It also includes a video depth estimation framework that keeps predictions temporally consistent across sequential frames.

The models come in several sizes that trade inference speed against accuracy, and each uses a transformer-based encoder to extract global context. Predicted depth can be exported as grayscale or colorized images.
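As a rough illustration of the grayscale and colorized export mentioned above, the sketch below min-max normalizes a depth array to 8 bits and maps it onto a two-color ramp. It assumes NumPy is available; the function names `depth_to_gray` and `gray_to_color` and the blue-to-red ramp are hypothetical choices for this example, not the project's own API or colormap.

```python
import numpy as np


def depth_to_gray(depth: np.ndarray) -> np.ndarray:
    """Min-max normalize a depth map to an 8-bit grayscale image."""
    depth = depth.astype(np.float32)
    span = float(depth.max() - depth.min())
    if span == 0.0:
        # Flat depth map: avoid dividing by zero and return all zeros.
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - depth.min()) / span * 255.0
    return scaled.astype(np.uint8)  # truncates toward zero


def gray_to_color(gray: np.ndarray) -> np.ndarray:
    """Map 8-bit values onto a simple blue-to-red ramp (hypothetical colormap)."""
    t = gray.astype(np.float32) / 255.0
    rgb = np.stack([t, np.zeros_like(t), 1.0 - t], axis=-1) * 255.0
    return rgb.astype(np.uint8)


# Toy 2x2 "depth map" standing in for a model prediction.
depth = np.array([[0.5, 1.0], [1.5, 2.5]], dtype=np.float32)
gray = depth_to_gray(depth)    # shape (2, 2), dtype uint8
color = gray_to_color(gray)    # shape (2, 2, 3), RGB order
```

Either array can then be written to disk with any image library; the normalization is per image, so gray levels are not comparable between different inputs.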

## Tags

### Artificial Intelligence & ML

- [Monocular Depth Estimators](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-pose-estimations/monocular-depth-estimators.md) — Provides a foundation model that infers three-dimensional spatial depth from single two-dimensional image inputs.
- [Computer Vision Models](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-models.md) — Provides a large-scale pre-trained neural network designed for general purpose spatial understanding and depth perception.
- [Depth Estimation](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-pose-estimations/monocular-depth-estimators/multi-view-depth-estimators/depth-estimation.md) — Calculates absolute distance measurements in indoor and outdoor scenes using scale-aware models.
- [Metric](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-pose-estimations/monocular-depth-estimators/multi-view-depth-estimators/depth-estimation/metric.md) — Calculates absolute distance measurements for indoor and outdoor scenes using specialized scale-aware models. ([source](https://depth-anything-v2.github.io/))
- [Relative](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-pose-estimations/monocular-depth-estimators/multi-view-depth-estimators/depth-estimation/relative.md) — Produces a relative depth map from a single input image using pre-trained foundation models. ([source](https://cdn.jsdelivr.net/gh/depthanything/depth-anything-v2@main/README.md))
- [Temporal Video](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-pose-estimations/monocular-depth-estimators/multi-view-depth-estimators/depth-estimation/temporal-video.md) — Generates depth maps for video sequences while maintaining temporal consistency across frames. ([source](https://cdn.jsdelivr.net/gh/depthanything/depth-anything-v2@main/README.md))
- [Video Depth Frameworks](https://awesome-repositories.com/f/artificial-intelligence-ml/computer-vision-systems/computer-vision/object-pose-estimations/monocular-depth-estimators/multi-view-depth-estimators/depth-estimation/video-depth-frameworks.md) — Provides a framework for generating temporally consistent depth maps across sequential video frames.
- [Relative-to-Metric Depth Scaling](https://awesome-repositories.com/f/artificial-intelligence-ml/relative-to-metric-depth-scaling.md) — Translates dimensionless relative depth maps into absolute distance measurements using scale-aware model variants.
- [Video Depth Analysis](https://awesome-repositories.com/f/artificial-intelligence-ml/generative-ai-resources/diffusion-visual-models/generative-ai-tasks/video-depth-analysis.md) — Generates consistent depth maps across video frames to understand the three dimensional structure of moving scenes.
- [Temporal Prediction Smoothing](https://awesome-repositories.com/f/artificial-intelligence-ml/sequence-modeling/temporal-sequence-processors/temporal-smoothing-filters/temporal-prediction-smoothing.md) — Ensures depth predictions remain stable and smooth across consecutive video frames to reduce jitter.
- [Transformer Encoders](https://awesome-repositories.com/f/artificial-intelligence-ml/transformer-encoders.md) — Uses a vision transformer architecture to extract global context and high-resolution spatial features.
- [Unsupervised Pre-training](https://awesome-repositories.com/f/artificial-intelligence-ml/unsupervised-pre-training.md) — Learns general depth representations from massive unlabeled image datasets, which a teacher model annotates with pseudo depth labels.
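To make the idea of temporal prediction smoothing in the list above concrete, here is a minimal sketch that applies an exponential moving average to a sequence of per-frame depth maps. This is a generic jitter-reduction technique chosen for illustration, not the project's video method; `smooth_depth_sequence` and its `alpha` parameter are hypothetical names, and the sketch assumes all frames share the same shape and depth scale.

```python
import numpy as np


def smooth_depth_sequence(frames, alpha=0.8):
    """Blend each depth frame with the running average of earlier frames.

    alpha is the weight given to the previous smoothed frame
    (hypothetical default); higher values smooth more but lag more.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    smoothed = []
    state = None
    for frame in frames:
        frame = np.asarray(frame, dtype=np.float64)
        # The first frame initializes the running average.
        state = frame if state is None else alpha * state + (1.0 - alpha) * frame
        smoothed.append(state)
    return smoothed


# Toy sequence whose depth flickers between 1.0 and 3.0.
frames = [np.full((2, 2), v) for v in (1.0, 3.0, 1.0, 3.0)]
result = smooth_depth_sequence(frames, alpha=0.5)
```

A plain moving average like this ignores camera and object motion, so it blurs genuine depth changes along with jitter; it is only meant to show what "stable across consecutive frames" means numerically.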

### Part of an Awesome List

- [Spatial Understanding](https://awesome-repositories.com/f/awesome-lists/ai/spatial-understanding.md) — Extracts fine-grained geometric information from images to perceive the layout of a physical space.

### Graphics & Multimedia

- [Metric Depth Mapping](https://awesome-repositories.com/f/graphics-multimedia/depth-accuracy-metrics/metric-depth-mapping.md) — Determines absolute physical distance between the camera and objects in indoor or outdoor environments.
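The entries above describe obtaining absolute distances through scale-aware model variants. For intuition about how a dimensionless relative map relates to metric distance, the sketch below shows a different, generic technique: a least-squares scale-and-shift fit against a few known reference distances. It assumes the relative values behave like inverse depth (larger means closer); the helper names `fit_scale_shift` and `to_metric` are hypothetical and not part of the project.

```python
import numpy as np


def fit_scale_shift(relative, reference_m):
    """Least-squares fit of s, t so that s * relative + t ~= 1 / reference_m."""
    relative = np.asarray(relative, dtype=np.float64).ravel()
    target = 1.0 / np.asarray(reference_m, dtype=np.float64).ravel()
    # Design matrix with one column for the scale and one for the shift.
    design = np.stack([relative, np.ones_like(relative)], axis=1)
    (scale, shift), *_ = np.linalg.lstsq(design, target, rcond=None)
    return float(scale), float(shift)


def to_metric(relative, scale, shift, eps=1e-6):
    """Convert relative (inverse-depth-like) values to distances in meters."""
    inverse_depth = scale * np.asarray(relative, dtype=np.float64) + shift
    # Clamp to avoid dividing by zero or by negative values.
    return 1.0 / np.maximum(inverse_depth, eps)


# Synthetic check: build distances from a known scale (0.5) and shift (0.1).
relative = np.array([0.2, 0.5, 0.9, 1.4])
true_distance_m = 1.0 / (0.5 * relative + 0.1)
scale, shift = fit_scale_shift(relative, true_distance_m)
metric = to_metric(relative, scale, shift)
```

On this noise-free synthetic data the fit recovers the scale and shift that generated it; with real predictions the fit needs reference distances from some other sensor or measurement, which is the dependency a dedicated metric model avoids.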

### DevOps & Infrastructure

- [Model Size Variants](https://awesome-repositories.com/f/devops-infrastructure/worker-scaling/independent-model-component-scaling/model-size-variants.md) — Offers several model sizes with different parameter counts to balance inference speed and accuracy.
