Map Anything

Map-anything is a feed-forward 3D scene reconstruction framework and neural geometry estimator that transforms two-dimensional images into metric three-dimensional spatial representations. It also provides a toolkit for predicting camera intrinsics and ray directions from a single image without requiring external geometric metadata.
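
To make the link between camera intrinsics and ray directions concrete, here is a minimal sketch assuming an ideal pinhole camera with pixel centers at integer coordinates. The function names pixel_to_ray and ray_map are hypothetical and are not part of Map-anything's API; the project predicts these quantities with a neural network rather than computing them from known intrinsics.

```python
import math

def pixel_to_ray(u, v, fx, fy, cx, cy):
    """Unit ray direction in the camera frame for pixel (u, v), pinhole model."""
    # Back-project the pixel onto the z = 1 plane, then normalize.
    x = (u - cx) / fx
    y = (v - cy) / fy
    norm = math.sqrt(x * x + y * y + 1.0)
    return (x / norm, y / norm, 1.0 / norm)

def ray_map(width, height, fx, fy, cx, cy):
    """Per-pixel ray directions as a row-major nested list: rays[v][u]."""
    return [
        [pixel_to_ray(u, v, fx, fy, cx, cy) for u in range(width)]
        for v in range(height)
    ]

# Tiny illustrative camera: 4x3 pixels, focal length 2 px (assumed values).
rays = ray_map(4, 3, fx=2.0, fy=2.0, cx=1.5, cy=1.0)
```

A pixel at the principal point maps to the optical axis (0, 0, 1), and every returned direction has unit length.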

The project includes a 3D model benchmarking suite that utilizes a unified model wrapper to standardize outputs from diverse reconstruction models. This allows for consistent evaluation and accuracy measurement across various spatial datasets. To facilitate downstream use, it includes a COLMAP data exporter that converts neural reconstruction predictions into formats compatible with photogrammetry and splatting pipelines.
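
As an illustration of what a COLMAP-compatible export involves, the sketch below formats entries for COLMAP's text files cameras.txt and images.txt. It assumes a PINHOLE camera model and poses already given as world-to-camera with a scalar-first quaternion, which is what COLMAP's text format expects. The helper names are hypothetical and do not describe the project's actual exporter.

```python
def colmap_camera_line(camera_id, width, height, fx, fy, cx, cy):
    """One cameras.txt entry: CAMERA_ID MODEL WIDTH HEIGHT PARAMS[]."""
    return f"{camera_id} PINHOLE {width} {height} {fx} {fy} {cx} {cy}"

def colmap_image_lines(image_id, quat_wxyz, trans, camera_id, name):
    """Two images.txt lines for one image.

    Line 1: IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME (world-to-camera pose).
    Line 2: 2D keypoint observations, left empty in this sketch.
    """
    qw, qx, qy, qz = quat_wxyz
    tx, ty, tz = trans
    header = f"{image_id} {qw} {qx} {qy} {qz} {tx} {ty} {tz} {camera_id} {name}"
    return [header, ""]

# Example values (assumed): one 640x480 camera and one identity-pose image.
camera_line = colmap_camera_line(1, 640, 480, 500.0, 500.0, 320.0, 240.0)
image_lines = colmap_image_lines(
    1, (1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0), 1, "frame_000.png"
)
```

If a model predicts camera-to-world poses instead, they would need to be inverted before being written in this format.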

The framework also supports distributed geometry model training, multi-node cluster orchestration, and inference memory optimization. It provides tools for metric depth visualization, spatial data standardization, and geometry artifact filtering using normal-based masking.
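
One common way to implement normal-based masking is sketched below, as an assumption about the general technique rather than the project's own code. It estimates a surface normal at each pixel from neighboring 3D points and discards pixels whose normal is nearly perpendicular to the viewing ray, which is typical of stretched surfaces at depth discontinuities. The function name normal_mask and the 80 degree threshold are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def normal_mask(points, max_angle_deg=80.0):
    """Keep pixels whose normal is within max_angle_deg of the viewing ray.

    points[v][u] is a 3D point in the camera frame. The last row and column
    have no forward neighbor and are left masked out (False).
    """
    h, w = len(points), len(points[0])
    min_cos = math.cos(math.radians(max_angle_deg))
    mask = [[False] * w for _ in range(h)]
    for v in range(h - 1):
        for u in range(w - 1):
            p = points[v][u]
            # Normal from forward differences along the image axes.
            n = _cross(_sub(points[v][u + 1], p), _sub(points[v + 1][u], p))
            denom = math.sqrt(_dot(n, n)) * math.sqrt(_dot(p, p))
            if denom == 0.0:
                continue
            mask[v][u] = abs(_dot(n, p)) / denom >= min_cos
    return mask

# 3x3 toy scene: depth jumps from 1 m to 10 m at the last column.
depths = [[1.0, 1.0, 10.0] for _ in range(3)]
points = [
    [((u - 1) * 0.1 * depths[v][u], (v - 1) * 0.1 * depths[v][u], depths[v][u])
     for u in range(3)]
    for v in range(3)
]
mask = normal_mask(points)
```

In this toy scene the pixel next to the depth jump is rejected, while its neighbor on the flat surface is kept.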

Features

3D Reconstruction - Transforms two-dimensional images into metric three-dimensional spatial representations using feed-forward neural networks.

Metric 3D Scene Reconstruction - Provides a comprehensive framework for transforming 2D images into metric 3D spatial representations.

3D Spatial AI - Transforms two-dimensional images into three-dimensional spatial representations using a feed-forward metric network.

Metric Coordinate Mapping - Transforms two-dimensional images into three-dimensional spatial representations using a neural network that predicts metric coordinates.

Model Benchmarking Suites - Implements a standardized interface for evaluating the accuracy of multiple 3D reconstruction models.

Geometry Model Training - Supports training and fine-tuning of reconstruction models using specialized pose loss functions.

Multi-View Reconstruction Evaluation - Evaluates reconstruction accuracy using standardized datasets across different numbers of input views.

3D Reconstruction Benchmarks - Provides a benchmarking suite to standardize and measure the accuracy of diverse 3D reconstruction models.

Camera Intrinsic Predictions - Recovers camera ray directions and intrinsic parameters from single images without requiring external geometric metadata.

Ray-Direction Estimations - Recovers camera intrinsic parameters and ray directions from single images without requiring external geometric metadata.

Camera Geometry Estimation - Estimates camera ray directions and intrinsic parameters from single images without external metadata.

Vision Dataset Standardizers - Converts diverse spatial datasets into a uniform format to streamline training and cross-model evaluation.

Image Data Preprocessing - Performs undistortion and depth consistency calculations on image datasets to prepare them for 3D mapping.

Inference Memory Optimizations - Reduces GPU memory consumption during inference to allow processing of more views on limited hardware.

Distributed Training - Distributes large-scale geometry model training and fine-tuning workloads across multiple compute nodes.

Reconstruction Output Standardization - Wraps diverse reconstruction models in a uniform format to align 3D points, camera poses, and confidence scores.

Third-Party Model Integration - Runs multiple third-party reconstruction models through a single interface to ensure consistent output formats for evaluation.

Vision Model Fine-Tuning - Optimizes spatial reconstruction models using a modular training pipeline and comprehensive datasets.

Photogrammetry Pipeline Integrations - Converts neural reconstruction predictions into formats compatible with photogrammetry tools like COLMAP.

Data Visualization - Renders standardized 3D spatial data to verify the quality of neural reconstruction representations.

Data Export - Exports neural reconstruction predictions into structured formats compatible with photogrammetry and splatting pipelines.

Metric Depth Mapping - Processes depth maps and camera poses to maintain geometric consistency in 3D reconstructions.

Geometric Constraint Integrations - Integrates camera intrinsics, ray directions, and depth maps to improve the accuracy of 3D reconstructions.

Geometry Artifact Filtering - Denoises and removes edge artifacts from geometry outputs using normal-based masking and depth consistency checks.

Metric 3D Representations - Renders 3D reconstructed scenes using physically accurate spatial measurements to verify reconstruction quality.

Photogrammetry Format Exports - Converts reconstruction predictions into files compatible with external photogrammetry and splatting pipelines.

Geometry Masking - Removes edge artifacts and low-confidence regions by applying depth consistency checks and normal-based filters to outputs.

Unified Model Interfaces - Wraps reconstruction models in a unified interface to ensure consistent output formats for benchmarking.

Unified Model Wrappers - Standardizes diverse third-party reconstruction models into a single interface to ensure consistent output formats for benchmarking.

Reconstruction Benchmarking - Executes standardized evaluation scripts using specific checkpoints and machine configurations to measure reconstruction performance.
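
The unified wrapper idea described in the features above can be sketched as an adapter pattern. Everything in this example is hypothetical: the class names, the field names, and the raw dictionary keys ("xyz", "pose", "conf") are assumptions for illustration and do not reflect the project's actual interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]

@dataclass
class ViewPrediction:
    """Common output format for one input view (hypothetical)."""
    points: List[Point]              # 3D points in a shared world frame
    cam_to_world: List[List[float]]  # 4x4 camera pose, row-major
    confidence: List[float]          # one score per point

class ReconstructionModel(ABC):
    """Interface a benchmark can call without knowing the model."""
    @abstractmethod
    def predict(self, images: Sequence[object]) -> List[ViewPrediction]:
        ...

class DictModelAdapter(ReconstructionModel):
    """Adapts a model that returns one dict per view with its own key names."""
    def __init__(self, raw_model: Callable[[Sequence[object]], List[Dict]]):
        self.raw_model = raw_model

    def predict(self, images: Sequence[object]) -> List[ViewPrediction]:
        return [
            ViewPrediction(points=raw["xyz"],
                           cam_to_world=raw["pose"],
                           confidence=raw["conf"])
            for raw in self.raw_model(images)
        ]

# Stand-in for a third-party model, for demonstration only.
IDENTITY = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def fake_model(images):
    return [{"xyz": [(0.0, 0.0, 1.0)], "pose": IDENTITY, "conf": [0.9]}
            for _ in images]

predictions = DictModelAdapter(fake_model).predict(["img_a", "img_b"])
```

With this design, evaluation code depends only on ViewPrediction, so adding a new model means writing one adapter rather than changing the benchmark.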

facebookresearch/map-anything
