Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
- URL: http://arxiv.org/abs/2303.14184v2
- Date: Mon, 3 Apr 2023 07:18:27 GMT
- Title: Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior
- Authors: Junshu Tang, Tengfei Wang, Bo Zhang, Ting Zhang, Ran Yi, Lizhuang Ma, Dong Chen
- Abstract summary: In this work, we investigate the problem of creating high-fidelity 3D content from only a single image.
We leverage prior knowledge from a well-trained 2D diffusion model to act as 3D-aware supervision for 3D creation.
Our method presents the first attempt to achieve high-quality 3D creation from a single image for general objects and enables various applications such as text-to-3D creation and texture editing.
- Score: 36.40582157854088
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we investigate the problem of creating high-fidelity 3D content
from only a single image. This is inherently challenging: it essentially
involves estimating the underlying 3D geometry while simultaneously
hallucinating unseen textures. To address this challenge, we leverage prior
knowledge from a well-trained 2D diffusion model to act as 3D-aware supervision
for 3D creation. Our approach, Make-It-3D, employs a two-stage optimization
pipeline: the first stage optimizes a neural radiance field by incorporating
constraints from the reference image at the frontal view and diffusion prior at
novel views; the second stage transforms the coarse model into textured point
clouds and further elevates the realism with diffusion prior while leveraging
the high-quality textures from the reference image. Extensive experiments
demonstrate that our method outperforms prior works by a large margin,
resulting in faithful reconstructions and impressive visual quality. Our method
presents the first attempt to achieve high-quality 3D creation from a single
image for general objects and enables various applications such as text-to-3D
creation and texture editing.
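The abstract describes the two-stage pipeline only at a high level. As a rough illustration of the first stage, the following PyTorch-style sketch shows how a frozen 2D diffusion prior can supervise a neural radiance field at novel views via score distillation while the reference image constrains the frontal view. This is a minimal sketch, not the authors' code: `nerf.render`, `eps_model`, and the loss weighting are hypothetical stand-ins.

import torch
import torch.nn.functional as F

def sds_loss(rendered, eps_model, text_emb, alphas_bar):
    # Surrogate loss whose gradient w.r.t. `rendered` equals the score
    # distillation gradient w(t) * (eps_pred - eps); the diffusion model
    # itself stays frozen throughout.
    b = rendered.shape[0]
    t = torch.randint(1, alphas_bar.numel(), (b,), device=rendered.device)
    a = alphas_bar[t].view(b, 1, 1, 1)
    noise = torch.randn_like(rendered)
    noisy = a.sqrt() * rendered + (1.0 - a).sqrt() * noise
    with torch.no_grad():
        eps_pred = eps_model(noisy, t, text_emb)  # frozen 2D diffusion prior
    grad = (1.0 - a) * (eps_pred - noise)         # a common weighting choice
    return (grad.detach() * rendered).sum()

def training_step(nerf, eps_model, text_emb, ref_image, ref_pose,
                  novel_pose, alphas_bar, opt, sds_weight=1e-4):
    opt.zero_grad()
    # Reference view: pixel-space constraint against the input image.
    loss = F.mse_loss(nerf.render(ref_pose), ref_image)
    # Novel views: the diffusion prior acts as 3D-aware supervision.
    loss = loss + sds_weight * sds_loss(nerf.render(novel_pose),
                                        eps_model, text_emb, alphas_bar)
    loss.backward()
    opt.step()

The `(grad.detach() * rendered).sum()` construction lets autograd propagate the precomputed distillation gradient back through the renderer without backpropagating through the diffusion U-Net, which keeps the prior cheap to evaluate.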
Related papers
- Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild.
Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture.
We propose bridging the gap between 2D and 3D diffusion models to address the limitations of these techniques.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
- Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models [112.2625368640425]
High-resolution Image-to-3D model (Hi3D) is a new video-diffusion-based paradigm that recasts single-image-to-multi-view generation as 3D-aware sequential image generation.
Hi3D first equips the pre-trained video diffusion model with a 3D-aware prior, yielding multi-view images with low-resolution texture details.
arXiv Detail & Related papers (2024-09-11T17:58:57Z)
- Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior [33.45375100074168]
We present a novel two-stage approach that fully utilizes the information provided by the reference image to establish a customized knowledge prior for image-to-3D generation.
Experiments showcase the superiority of our method, Customize-It-3D, outperforming previous works by a substantial margin.
arXiv Detail & Related papers (2023-12-15T19:07:51Z)
- EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior [59.25950280610409]
We propose a robust high-quality 3D content generation pipeline by exploiting orthogonal-view image guidance.
In this paper, we introduce a novel 2D diffusion model that generates an image consisting of four sub-images based on the given text prompt.
We also present a 3D synthesis network that can further improve the details of the generated 3D contents.
arXiv Detail & Related papers (2023-08-25T07:39:26Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors [104.79392615848109]
We present Magic123, a two-stage coarse-to-fine approach for generating high-quality, textured 3D meshes from a single unposed image.
In the first stage, we optimize a neural radiance field to produce a coarse geometry.
In the second stage, we adopt a memory-efficient differentiable mesh representation to yield a high-resolution mesh with a visually appealing texture.
arXiv Detail & Related papers (2023-06-30T17:59:08Z)
- ARTIC3D: Learning Robust Articulated 3D Shapes from Noisy Web Image Collections [71.46546520120162]
Estimating 3D articulated shapes like animal bodies from monocular images is inherently challenging.
We propose ARTIC3D, a self-supervised framework to reconstruct per-instance 3D shapes from a sparse image collection in-the-wild.
We produce realistic animations by fine-tuning the rendered shape and texture under rigid part transformations.
arXiv Detail & Related papers (2023-06-07T17:47:50Z)
- TextMesh: Generation of Realistic 3D Meshes From Text Prompts [56.2832907275291]
We propose a novel method for generating highly realistic-looking 3D meshes.
To this end, we extend NeRF to employ an SDF backbone, leading to improved 3D mesh extraction (see the sketch after this list).
arXiv Detail & Related papers (2023-04-24T20:29:41Z)
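The SDF backbone mentioned in the TextMesh summary admits a short illustration. One common way to let a NeRF-style volume renderer drive an SDF (as in VolSDF) is to map signed distance to density through a Laplace CDF, so density concentrates at the zero level set and a clean mesh can later be extracted with marching cubes. This is a hedged sketch under that assumption; TextMesh's exact formulation may differ, and `alpha`/`beta` are illustrative values.

import torch

def sdf_to_density(sdf, alpha=100.0, beta=0.01):
    # Laplace-CDF mapping: density peaks near the zero level set of the
    # SDF (negative inside the surface), enabling volume rendering while
    # keeping a well-defined surface for mesh extraction.
    s = -sdf / beta
    psi = torch.where(s <= 0, 0.5 * torch.exp(s), 1.0 - 0.5 * torch.exp(-s))
    return alpha * psi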
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.