30 open-source projects similar to nerfies/nerfies.github.io, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best nerfies.github.io alternative.
This project is a PyTorch implementation of a Neural Radiance Field framework. It serves as a 3D scene synthesizer and differentiable volumetric renderer used to train volumetric representations of scenes by predicting color and density for 3D spatial coordinates. The system enables novel view synthesis, allowing for the generation of new images of complex 3D scenes from previously unseen perspectives. It supports 3D scene reconstruction by processing 2D images and camera poses to build a digital volumetric representation of a physical space. The framework includes capabilities for 3D model
This project is a framework for neural radiance fields used to synthesize three-dimensional environments from sets of two-dimensional images and camera poses. It functions as a volumetric rendering engine and scene synthesizer that optimizes neural representations of spatial volumes to generate novel views of complex 3D scenes. The system implements a coordinate encoding system that transforms spatial coordinates into high-dimensional space to capture high-frequency geometric details. It also includes a neural mesh extractor that converts trained radiance fields into triangle meshes via march
PyTorch3D is a 3D geometric deep learning library and mesh processing toolkit designed for learning from point clouds and complex 3D surface geometries. It provides a collection of reusable components and data structures for deep learning with 3D data, including a framework for training and evaluating neural radiance fields to enable photorealistic view synthesis. The project features a differentiable 3D renderer that converts meshes and point clouds into 2D images while allowing gradients to flow back into the geometry and textures. This enables 3D shape optimization, where mesh geometry, te
Nerfstudio is a modular development framework for training, visualizing, and exporting three-dimensional scene representations derived from two-dimensional image datasets. It provides a neural scene reconstruction pipeline that converts raw images and camera data into high-fidelity 3D assets and cinematic video using a differentiable volumetric renderer. The system features an interactive web-based visualizer that allows users to monitor training progress and inspect neural scene geometry in real time. It decouples neural network architectures from the training loop through a standardized mod
ml-sharp is a neural radiance field framework designed for single-image 3D reconstruction. It uses a neural network to predict 3D geometry and appearance from a single photograph in a single feedforward pass. The system generates metric 3D scene representations and includes a real-time view synthesizer for producing high-resolution images of new viewpoints. It also features a camera trajectory renderer that creates video sequences by moving a virtual camera through the predicted 3D space. The project covers coordinate-based neural rendering, 3D Gaussian representation regression, and real-ti
Gaussian Splatting is a computational framework designed to transform sparse sets of two-dimensional photographs into photorealistic, interactive three-dimensional scene representations. The system functions as a reconstruction tool and rendering engine, enabling the conversion of image data into volumetric models that support novel view synthesis. The project represents scenes as a collection of anisotropic three-dimensional Gaussians, which store position, opacity, color, and covariance data. It distinguishes itself through a differentiable tile-based rasterization process that projects the
Depth-Anything-3 is a collection of core model implementations for depth prediction, multi-view geometry estimation, and RGB-D spatial pipelines. It includes a monocular depth estimation model for predicting depth maps from single images or video, and a 3D Gaussian splatting generator that predicts parameters to synthesize high-fidelity novel views of a scene. The project provides a multi-view geometry estimator for calculating spatially consistent depth and camera poses across synchronized visual inputs. It also functions as a visual SLAM enhancement tool designed to reduce drift and improve
Neuralangelo is a neural surface reconstruction framework that transforms two-dimensional image sequences and multi-view photography into high-fidelity 3D meshes. It implements a pipeline for training neural radiance fields to represent complex scenes as digital geometry. The project utilizes a signed distance function for surface representation and multi-resolution hash encoding to capture both coarse and fine geometric details. It employs differentiable volume rendering and gradient-based eikonal regularization, which pushes the gradient of the learned distance function toward unit norm so that it remains a valid signed distance field. The syste
This project is a diffusion-based 3D generator and image-to-3D reconstruction system. It translates natural language descriptions or two-dimensional images into three-dimensional assets using neural radiance fields and diffusion models. The system utilizes score-distillation sampling and diffusion-based guidance to refine 3D shapes without requiring 3D training data. It includes specialized tools for transforming neural representations into exportable meshes with texture and material data, as well as a pipeline for iterative optimization of geometry and textures. The project covers a broad r
gsplat is a high-performance differentiable rasterization engine for 3D Gaussian splatting, designed for real-time novel view synthesis from 2D images. It provides a complete pipeline for reconstructing 3D scenes by optimizing differentiable Gaussian representations, training models from COLMAP-processed captures or proprietary device files, and generating new viewpoints through a CUDA-accelerated rendering backend. The framework distinguishes itself through memory-optimized CUDA kernels that reduce training memory usage by up to 4x compared to standard implementations while matching publishe
This project is an RGB-D image inpainting tool and framework for 3D photo reconstruction. It transforms single 2D images into 3D content by estimating monocular depth and synthesizing missing color and depth data to fill occluded regions. The system uses a layered depth image representation to manage scene boundaries and pixel connectivity. This allows for novel view synthesis, enabling the generation of videos that simulate motion parallax effects from different camera perspectives. The project covers a range of spatial modeling capabilities, including depth map estimation, disparity-based
Threestudio is a 3D generative AI framework designed to create three-dimensional assets from text prompts and images. It provides specialized pipelines for text-to-3D generation and image-to-3D reconstruction, utilizing a neural radiance field trainer to produce geometry and textures. The framework is distinguished by its support for hybrid geometry backends, including signed distance functions, tetrahedra grids, and volume grids. It employs score distillation sampling to guide the generation process and features a modular plugin system for loading custom modules and nodes. The system covers
f3d is a fast 3D model viewer and rendering engine designed for visualizing 3D meshes, CAD files, and point clouds. It operates across multiple deployment profiles, functioning as a lightweight desktop application, a scientific data visualizer for volumetric and scalar datasets, a headless rendering engine for automated image generation, and a WebAssembly-based renderer for web applications. The project distinguishes itself through specialized support for Gaussian Splatting scene reconstructions and the ability to visualize complex scientific formats such as VTK, NetCDF, and HDF. It features
pifuhd is a 3D human reconstruction framework that generates high-resolution 3D meshes of people from a single 2D image. It utilizes pixel-aligned implicit functions to map image pixels to 3D space, predicting surface occupancy and distance to create detailed geometry. The system includes a pipeline for creating digital human assets, moving from 2D image feature projection to the extraction of discrete triangular meshes. It features specialized tools for refining these models, including a post-processor that removes geometric artifacts by isolating the largest connected component of the mesh.
Instant-ngp is a high-performance neural graphics engine and toolkit designed for 3D reconstruction and the rendering of neural radiance fields. It provides an integrated framework for generating photorealistic volumetric representations from sets of two-dimensional images by optimizing continuous neural scene models. The project distinguishes itself through a focus on rapid training and real-time inference, achieved by mapping spatial coordinates into compact feature grids. By utilizing multiresolution hash encoding and fused processing kernels, the system minimizes computational overhead an
VGGT is a computer vision framework designed for neural scene reconstruction and 3D environmental modeling. It utilizes a feed-forward neural architecture to process input images, simultaneously inferring camera parameters, depth maps, and point trajectories to generate dense 3D point clouds. The system distinguishes itself by integrating multi-view geometry with temporal tracking, allowing it to maintain spatial consistency across sequential frames. By leveraging pretrained neural backbones, the framework extracts robust visual features that support complex geometric tasks, including the ana
openMVS is a multi-view stereo library and photogrammetry pipeline used for 3D scene reconstruction. It transforms Structure from Motion data—specifically camera poses and sparse point clouds—into detailed 3D models consisting of dense point clouds and textured meshes. The project provides a sequence of processing stages to densify point clouds, generate 3D surface meshes, and apply photorealistic textures. It uses multi-view texture blending to map accurate colors onto reconstructed geometry and employs iterative refinement to optimize mesh details. The system includes capabilities for impo
Meshroom is a node-based photogrammetry software designed to transform collections of two-dimensional images into three-dimensional models and scene geometry. It provides a visual interface for constructing and managing modular data pipelines, allowing users to automate complex computer vision tasks such as feature extraction, depth map estimation, and mesh generation. The software distinguishes itself through a distributed computational framework that dispatches resource-intensive tasks across local hardware or remote render farms. By utilizing a directed acyclic graph execution model, it en
This repository serves as a comprehensive research platform and toolkit for advancing machine learning, quantum computing, and large-scale scientific data analysis. It provides foundational frameworks for developing complex algorithmic systems, offering the necessary infrastructure for distributed training, computational graph execution, and high-performance model development. The project distinguishes itself by integrating specialized research domains with robust, privacy-preserving methodologies. It supports diverse scientific discovery through tools for quantum simulation, physics-informed
This repository contains the code release for Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields. This implementation is written in JAX, and is a fork of Google's JaxNeRF implementation. Contact Jon Barron if you encounter any issues.
This is a framework for training and sampling diffusion models to generate high-fidelity images, video, and 4D assets. It provides a modular environment for managing generative AI training pipelines, including the handling of datasets, noise sampling, and loss weighting to stabilize the creation of synthetic content. The project features a modular model configuration system that uses YAML-based assembly to define network submodules and conditioners. It also includes a dedicated toolset for AI image watermarking, allowing for the embedding and detection of invisible markers to verify the origi
OpenDroneMap (ODM) is an open-source aerial drone photogrammetry pipeline that converts 2D images into georeferenced 3D models, orthophotos, point clouds, and digital elevation models. At its core, the OpenDroneMap Processing Engine orchestrates a complete Structure-from-Motion workflow, from feature extraction through dense reconstruction and tiled output generation, purpose-built for transforming drone-captured imagery into geospatial data products. The toolkit distinguishes itself through GPU-accelerated SIFT feature extraction using CUDA-capable NVIDIA graphics cards, roughly doubling proce
GET3D is a generative 3D mesh model and rendering framework designed to synthesize high-quality textured shapes and tetrahedral meshes. It functions as an image-to-3D reconstructor and text-to-3D generator, utilizing a differentiable 3D renderer to produce realistic visual perspectives and material effects. The system enables the creation of 3D assets from single 2D images, point clouds, or descriptive text prompts. It features a latent space interpolator for creating smooth transitions between different 3D objects and supports the independent control of geometry and texture. The project cov
COLMAP is a 3D scene reconstruction suite and C++ geometry library that implements a full structure-from-motion pipeline. It functions as a GPU-accelerated photogrammetry tool and multi-view stereo framework designed to produce dense 3D geometry and watertight meshes from collections of 2D images. The project distinguishes itself through hardware-accelerated feature extraction and a modular camera modeling system that supports perspective, fisheye, and equirectangular lens types. It employs vocabulary tree image retrieval to efficiently identify similar images in large datasets and provides P
This project is a static educational website and comprehensive curriculum focused on computer vision and deep learning. It serves as a public repository of instructional materials, lecture notes, and technical guides specifically detailing convolutional neural networks and visual recognition. The site is developed using static-site generation to host course documentation and student project directories. It provides structured academic resources that guide learners through image classification, generative modeling, and the implementation of various neural network architectures. The curriculum
jetson-inference is a set of libraries and tools for executing optimized deep learning models on embedded GPU hardware. Its primary purpose is to enable real-time computer vision and AI inference at the edge with low latency and high throughput. The project distinguishes itself through high-performance streaming analytics and the ability to execute concurrent AI pipelines on auto-grade silicon. It provides specialized support for multi-sensor stream processing, utilizing zero-copy data transport to load camera frames directly into GPU memory. The codebase covers a broad surface of capabiliti
This repository is a comprehensive collection of functional 2D and 3D demo projects and implementation samples for the Godot Game Engine. It serves as an interactive tutorial and reference library, providing a working codebase to demonstrate how to apply engine features in real-world scenarios. The collection focuses on practical implementation guides, covering a wide array of technical capabilities from basic engine fundamentals to advanced rendering and scripting techniques. It allows users to study the application of node-based composition, asset pipelines, and game logic through direct ex
ORB_SLAM2 is a visual simultaneous localization and mapping system that tracks camera movement and builds 3D environments from image data. It functions as a real-time visual odometry tool and sparse 3D reconstructor, computing the position and orientation of a camera while generating a point cloud map of a physical space. The system utilizes a camera relocalization engine to identify a camera's position within a known map after tracking failure or system restarts. It incorporates a spatial tracker to enable the precise insertion and composition of virtual 3D objects into real-world planar reg