TRELLIS is a 3D generative model and latent diffusion framework that turns natural language descriptions or reference images into textured 3D assets. It uses structured latent representations to produce high-quality 3D meshes, 3D Gaussians, and radiance fields. The system functions as a multi-format 3D decoder, converting internal representations into standard exchange formats such as GLB and PLY. It also serves as a 3D asset editing tool, enabling the modification of specific regions of generated objects through targeted text or im
This project is a diffusion-based 3D generator and image-to-3D reconstruction system. It translates natural language descriptions or two-dimensional images into three-dimensional assets using neural radiance fields and diffusion models. The system uses score distillation sampling and diffusion-based guidance to refine 3D shapes without requiring 3D training data. It includes specialized tools for converting neural representations into exportable meshes with texture and material data, as well as a pipeline for iterative optimization of geometry and textures. The project covers a broad r
Point-E is a system for 3D model synthesis that uses diffusion models to generate three-dimensional point clouds from text prompts or source images. The project includes specialized tools for refining these outputs, such as a point cloud upsampler that increases the density and resolution of low-resolution point clouds. It also provides a mesh converter that uses distance function regression to transform raw point cloud data into structured 3D meshes. The broader capability surface cove
Shap-E is a generative 3D modeling system that creates three-dimensional assets, represented as implicit functions, from natural language descriptions or two-dimensional images. The project includes a 3D latent encoder that converts trimeshes and 3D models into latent representations using point clouds and multiview renders. It uses an image-to-3D generator to produce assets from synthetic view images and a text-to-3D generator to build shapes from text prompts. The system implements a pipeline involving latent