30 open-source projects similar to facebookresearch/pifuhd, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best PIFuHD alternative.
InstantMesh is a neural 3D reconstruction tool and single-image 3D mesh generator. It utilizes a sparse-view large reconstruction model to convert a single two-dimensional image into a three-dimensional object mesh. The system functions as a textured 3D mesh exporter, saving generated objects with either vertex colors or full texture maps for use in external rendering software. The framework covers a range of capabilities including feed-forward geometry inference, single-image depth estimation, and neural radiance fields. It also supports differentiable mesh rendering and workflows for spars
ComfyUI-3D-Pack is a suite of custom nodes for ComfyUI that enables 3D asset generation and rendering within a node-based workflow. It provides a set of tools for reconstructing textured three-dimensional meshes and volumetric scenes from single images, multi-view images, or text prompts. The system includes a Gaussian splatting generator for creating high-fidelity volumetric 3D scene representations and a multi-view image generator to produce consistent image sets for reconstruction. It also features a single image 3D mesh tool to build geometry from a single 2D source. The toolset covers 3
This project is a diffusion-based 3D generator and image-to-3D reconstruction system. It translates natural language descriptions or two-dimensional images into three-dimensional assets using neural radiance fields and diffusion models. The system utilizes score-distillation sampling and diffusion-based guidance to refine 3D shapes without requiring 3D training data. It includes specialized tools for transforming neural representations into exportable meshes with texture and material data, as well as a pipeline for iterative optimization of geometry and textures. The project covers a broad r
vrn is a 3D face reconstruction tool that generates three-dimensional volumetric representations of human faces from single two-dimensional images. It utilizes a volumetric convolutional neural network regression model to predict 3D volume data directly from image pixels. The system converts these volumetric predictions into 3D meshes through isosurface extraction and vertex coloring. It further applies realistic surface details by mapping two-dimensional image pixels onto the resulting 3D mesh using nearest-neighbor texture projection. The project provides capabilities for single-image dept
ml-sharp is a neural radiance field framework designed for single-image 3D reconstruction. It uses a neural network to predict 3D geometry and appearance from a single photograph in a single feedforward pass. The system generates metric 3D scene representations and includes a real-time view synthesizer for producing high-resolution images of new viewpoints. It also features a camera trajectory renderer that creates video sequences by moving a virtual camera through the predicted 3D space. The project covers coordinate-based neural rendering, 3D Gaussian representation regression, and real-ti
Wonder3D is a diffusion-based system for single image 3D reconstruction. It generates high-detail 3D meshes from a single input image by producing consistent multi-view normal maps and color images. The pipeline functions as a multi-view normal map generator and a textured mesh extractor. It uses cross-domain multi-view synthesis to create view-dependent maps, which are then converted into 3D geometry through radiance fusion and memory-efficient surface reconstruction. The project covers 3D mesh generation, multi-view generation, and textured 3D modeling. It also includes capabilities for tr
Hunyuan3D-2.1 is a generative 3D framework and image-to-3D pipeline that transforms single 2D images into textured 3D geometries. It functions as an asset generator that produces high-quality 3D meshes and textures using a flow-matching system. The project includes a specialized synthesizer for creating photorealistic textures with physically based rendering properties. These tools allow for the simulation of metallic reflections and light interactions on generated models. The system covers 3D asset pipeline automation through a sequence of shape generation and mesh refinement. It also provi
Threestudio is a 3D generative AI framework designed to create three-dimensional assets from text prompts and images. It provides specialized pipelines for text-to-3D generation and image-to-3D reconstruction, utilizing a neural radiance field trainer to produce geometry and textures. The framework is distinguished by its support for hybrid geometry backends, including signed distance functions, tetrahedra grids, and volume grids. It employs score distillation sampling to guide the generation process and features a modular plugin system for loading custom modules and nodes. The system covers
This project is a framework for neural radiance fields used to synthesize three-dimensional environments from sets of two-dimensional images and camera poses. It functions as a volumetric rendering engine and scene synthesizer that optimizes neural representations of spatial volumes to generate novel views of complex 3D scenes. The system implements a coordinate encoding system that transforms spatial coordinates into high-dimensional space to capture high-frequency geometric details. It also includes a neural mesh extractor that converts trained radiance fields into triangle meshes via march
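The coordinate encoding mentioned in this entry can be illustrated with a small sketch. It assumes the sinusoidal form commonly used with neural radiance fields, where each coordinate is expanded into sine and cosine terms at doubling frequencies. The function name, the number of frequencies, and the choice to keep the raw coordinate are illustrative assumptions, not details taken from this project's code.

```python
import math


def positional_encoding(coords, num_frequencies=4):
    """Expand each coordinate p into [p, sin(2^k*pi*p), cos(2^k*pi*p)] for k < num_frequencies.

    Hypothetical helper for illustration; real implementations differ in
    frequency scaling and in whether the raw coordinate is kept.
    """
    encoded = []
    for p in coords:
        encoded.append(p)  # keep the raw coordinate (an assumption)
        for k in range(num_frequencies):
            freq = (2.0 ** k) * math.pi
            encoded.append(math.sin(freq * p))
            encoded.append(math.cos(freq * p))
    return encoded


# A 3D point becomes a 3 * (1 + 2 * 4) = 27-dimensional feature vector.
features = positional_encoding([0.5, -0.25, 1.0], num_frequencies=4)
print(len(features))
```

Higher frequencies let a small network distinguish nearby points, which is why this kind of encoding helps capture fine geometric detail.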
MeshLab is an open-source 3D mesh processing system designed for editing and analyzing unstructured triangular meshes. It functions as a triangular mesh editor, a model visualization suite, and a conversion tool for transforming 3D mesh data between various file formats. The software provides tools for cleaning, healing, and optimizing large 3D models generated from raw digitization and scanning data. It enables the preparation of models for physical 3D printing and the application of surface textures to evaluate the visual appearance of digital models. The system covers a broad range of cap
This project is a PyTorch implementation of a Neural Radiance Field framework. It serves as a 3D scene synthesizer and differentiable volumetric renderer used to train volumetric representations of scenes by predicting color and density for 3D spatial coordinates. The system enables novel view synthesis, allowing for the generation of new images of complex 3D scenes from previously unseen perspectives. It supports 3D scene reconstruction by processing 2D images and camera poses to build a digital volumetric representation of a physical space. The framework includes capabilities for 3D model
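Predicting color and density per 3D coordinate becomes an image through compositing along each camera ray. The sketch below assumes the standard discrete volume rendering rule (opacity from density and step length, weighted by accumulated transmittance). It is written in plain Python for readability and is not this project's PyTorch code; `composite_ray` and its arguments are hypothetical names.

```python
import math


def composite_ray(densities, colors, deltas):
    """Alpha-composite samples along one ray, front to back.

    densities: non-negative density at each sample
    colors:    (r, g, b) tuple at each sample
    deltas:    distance between consecutive samples
    Returns the accumulated color and the remaining transmittance.
    """
    transmittance = 1.0  # fraction of light not yet absorbed
    rgb = [0.0, 0.0, 0.0]
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        for c in range(3):
            rgb[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return rgb, transmittance


# Empty space followed by a dense red sample: the ray comes out almost pure red.
rgb, remaining = composite_ray(
    densities=[0.0, 50.0],
    colors=[(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)],
    deltas=[0.1, 0.1],
)
print(rgb, remaining)
```

Because every step is differentiable with respect to density and color, the same rule can be used to train the network from 2D images.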
CadQuery is a programmatic 3D modeling library and parametric CAD framework that allows for the generation of complex geometric solids and assemblies using a fluent Python API. It functions as a B-Rep geometry engine, enabling the construction of models through code rather than a graphical user interface. The project is built on the Open CASCADE Technology kernel and utilizes a method-chaining API to link geometric commands in sequence. It distinguishes itself through a workplane-based coordinate system and a powerful selection system that uses topological and spatial filtering to target spec
TripoSR is a single-image 3D reconstruction system that generates a high-quality textured mesh from one photograph in under half a second. It uses a feedforward neural network to process a single image through a transformer architecture, compressing the input into a compact latent vector that conditions the entire reconstruction pipeline. The system outputs a separate UV texture map with configurable resolution, replacing vertex colors for higher-quality surface detail. The project is built around an end-to-end differentiable pipeline that trains the entire reconstruction system from image in
This project is an RGB-D image inpainting tool and framework for 3D photo reconstruction. It transforms single 2D images into 3D content by estimating monocular depth and synthesizing missing color and depth data to fill occluded regions. The system uses a layered depth image representation to manage scene boundaries and pixel connectivity. This allows for novel view synthesis, enabling the generation of videos that simulate motion parallax effects from different camera perspectives. The project covers a range of spatial modeling capabilities, including depth map estimation, disparity-based
PRNet is a Python library for 3D facial reconstruction. It uses a deep learning regression model to predict 3D facial geometry and vertex colors from a single 2D input image to generate a textured mesh. The project provides tools for digital face swapping, allowing the replacement of a target face with a new image and blending textures to match the original pose. It also includes a framework for face texture swapping and blending to fit specific 3D poses. Additional capabilities cover facial analysis, including the detection and alignment of facial landmarks and the estimation of head pose a
GET3D is a generative 3D mesh model and rendering framework designed to synthesize high-quality textured shapes and tetrahedral meshes. It functions as an image-to-3D reconstructor and text-to-3D generator, utilizing a differentiable 3D renderer to produce realistic visual perspectives and material effects. The system enables the creation of 3D assets from single 2D images, point clouds, or descriptive text prompts. It features a latent space interpolator for creating smooth transitions between different 3D objects and supports the independent control of geometry and texture. The project cov
DreamGaussian is a generative system and converter designed to create textured three-dimensional models from text or images using Gaussian Splatting. It functions as a pipeline for transforming two-dimensional inputs into high-fidelity 3D assets. The project provides specific workflows for converting 3D Gaussian point clouds into standard textured mesh formats compatible with external 3D software. It supports the generation of textured meshes from single images via volumetric refinement and UV texture optimization, as well as the creation of 3D models from text prompts through intermediate im
Neuralangelo is a neural surface reconstruction framework that transforms two-dimensional image sequences and multi-view photography into high-fidelity 3D meshes. It implements a pipeline for training neural radiance fields to represent complex scenes as digital geometry. The project utilizes a signed distance function for surface representation and multi-resolution hash encoding to capture both coarse and fine geometric details. It employs differentiable volume rendering and gradient-based eikonal regularization to ensure the learned distance functions remain physically plausible. The syste
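The eikonal regularization named in this entry rests on one property: a valid signed distance function has a gradient of unit length everywhere. The sketch below penalizes deviation from that property on an analytic sphere, using central finite differences for the gradient. It is an illustration of the idea only, not Neuralangelo's code, and all function names are hypothetical.

```python
import math


def sphere_sdf(p, radius=1.0):
    """Signed distance from point p to a sphere centered at the origin."""
    return math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius


def numerical_gradient(f, p, eps=1e-5):
    """Central finite-difference gradient of a scalar field f at point p."""
    grad = []
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad


def eikonal_loss(f, points):
    """Mean squared deviation of the gradient norm from 1."""
    total = 0.0
    for p in points:
        g = numerical_gradient(f, p)
        norm = math.sqrt(sum(x * x for x in g))
        total += (norm - 1.0) ** 2
    return total / len(points)


points = [(0.5, 0.2, -0.3), (1.5, -1.0, 0.7), (-2.0, 0.1, 0.4)]
print(eikonal_loss(sphere_sdf, points))                   # close to zero: a true SDF
print(eikonal_loss(lambda p: 2.0 * sphere_sdf(p), points))  # close to one: gradient norm is 2
```

During training, a term like this is added to the rendering loss so the learned field stays close to a true distance function.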
libigl is a C++ geometry processing library used for analyzing and manipulating 3D triangle and tetrahedral meshes. It functions as a numerical linear algebra suite and a mesh manipulation framework, integrating a geometric deformation engine to implement rigid and polyharmonic transformations. The project is distinguished by its header-only library design and its implementation of specialized deformation techniques, including as-rigid-as-possible and polyharmonic shape deformation. It also provides a visualization tool for rendering surfaces and scalar fields with interactive scene controls and
Point-e is a system for 3D model synthesis that generates three-dimensional point clouds from natural language descriptions and two-dimensional images. It utilizes diffusion models to synthesize these spatial representations based on text prompts or source images. The project includes specialized tools for refining these outputs, such as a point cloud upsampler to increase the density and resolution of low-resolution models. It also provides a mesh converter that uses distance function regression to transform raw point cloud data into structured 3D meshes. The broader capability surface cove
SAM 3D Objects is a promptable foundation model that recovers 3D objects and human meshes from single images. It converts masked objects in a single photograph into full 3D models with pose, shape, texture, and layout, while also producing complete 3D human body meshes from the same input. The system integrates promptable segmentation to isolate objects and humans before reconstruction, then aligns the independently reconstructed 3D elements into a shared coordinate space. This enables scene-level understanding where multiple 3D reconstructions from the same image coexist in a common coordina
PyTorch3D is a 3D geometric deep learning library and mesh processing toolkit designed for learning from point clouds and complex 3D surface geometries. It provides a collection of reusable components and data structures for deep learning with 3D data, including a framework for training and evaluating neural radiance fields to enable photorealistic view synthesis. The project features a differentiable 3D renderer that converts meshes and point clouds into 2D images while allowing gradients to flow back into the geometry and textures. This enables 3D shape optimization, where mesh geometry, te
This project is a computer vision pipeline and volumetric rendering system used to transform photos and videos into high-fidelity 3D models. It implements a deformable neural radiance field framework that optimizes deformation fields to represent non-rigid moving subjects in three dimensions. The system utilizes volumetric deformation fields to map 3D coordinates from a static canonical space to a deformed state. This allows for the reconstruction of photorealistic scenes and the synthesis of high-fidelity images from camera perspectives not present in the original input data. The framework
Grounded-Segment-Anything is a suite of specialized tools for multimodal visual analysis, text-based segmentation, and generative image editing. It integrates text-to-bounding-box detection and high-precision image segmentation masks to function as a text-based image segmenter and an automated visual labeling tool. The project enables text-driven image editing by identifying objects through natural language to perform inpainting and element replacement. It further extends visual analysis into three dimensions, allowing for 3D human reconstruction and the generation of 3D bounding boxes from t
COLMAP is a 3D scene reconstruction suite and C++ geometry library that implements a full structure-from-motion pipeline. It functions as a GPU-accelerated photogrammetry tool and multi-view stereo framework designed to produce dense 3D geometry and watertight meshes from collections of 2D images. The project distinguishes itself through hardware-accelerated feature extraction and a modular camera modeling system that supports perspective, fisheye, and equirectangular lens types. It employs vocabulary tree image retrieval to efficiently identify similar images in large datasets and provides P
This is the PyTorch implementation of our BMVC 2021 paper "AniFormer: Data-driven 3D Animation with Transformer" by Haoyu Chen, Hao Tang, Nicu Sebe, and Guoying Zhao.
Official code of "HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation", CVPR 2021
| ROMP | BEV | TRACE |
| :---: | :---: | :---: |
| Monocular, One-stage, Regression of Multiple 3D People (ICCV21) | Putting People in their Place: Monocular Regression of 3D People in Depth (CVPR2022) | TRACE: 5D Temporal Regression of Avatars with Dynamic Cameras in 3D Environments (CVPR2023) |

…
The code for data preprocessing and model evaluation is borrowed from SemGCN (garyzhao/SemGCN).