Customize-It-3D: High-Quality 3D Creation from A Single Image Using
Subject-Specific Knowledge Prior
- URL: http://arxiv.org/abs/2312.11535v2
- Date: Tue, 9 Jan 2024 10:47:40 GMT
- Title: Customize-It-3D: High-Quality 3D Creation from A Single Image Using
Subject-Specific Knowledge Prior
- Authors: Nan Huang, Ting Zhang, Yuhui Yuan, Dong Chen, Shanghang Zhang
- Abstract summary: We present a novel two-stage approach that fully utilizes the information provided by the reference image to establish a customized knowledge prior for image-to-3D generation.
Experiments showcase the superiority of our method, Customize-It-3D, outperforming previous works by a substantial margin.
- Score: 33.45375100074168
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present a novel two-stage approach that fully utilizes the
information provided by the reference image to establish a customized knowledge
prior for image-to-3D generation. While previous approaches primarily rely on a
general diffusion prior, which struggles to yield consistent results with the
reference image, we propose a subject-specific and multi-modal diffusion model.
This model not only aids NeRF optimization by considering the shading mode for
improved geometry but also enhances texture from the coarse results to achieve
superior refinement. Both aspects contribute to faithfully aligning the 3D
content with the subject. Extensive experiments showcase the superiority of our
method, Customize-It-3D, outperforming previous works by a substantial margin.
It produces faithful 360-degree reconstructions with impressive visual quality,
making it well-suited for various applications, including text-to-3D creation.
Related papers
- Towards High-Fidelity 3D Portrait Generation with Rich Details by Cross-View Prior-Aware Diffusion [63.81544586407943]
Single-image 3D portrait generation methods typically employ 2D diffusion models to provide multi-view knowledge, which is then distilled into 3D representations.
We propose a Hybrid Priors Diffsion model, which explicitly and implicitly incorporates multi-view priors as conditions to enhance the status consistency of the generated multi-view portraits.
Experiments demonstrate that our method can produce 3D portraits with accurate geometry and rich details from a single image.
arXiv Detail & Related papers (2024-11-15T17:19:18Z) - DreamPolish: Domain Score Distillation With Progressive Geometry Generation [66.94803919328815]
We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures.
In the geometry construction phase, our approach leverages multiple neural representations to enhance the stability of the synthesis process.
In the texture generation phase, we introduce a novel score distillation objective, namely domain score distillation (DSD), to guide neural representations toward such a domain.
arXiv Detail & Related papers (2024-11-03T15:15:01Z) - Grounded Compositional and Diverse Text-to-3D with Pretrained Multi-View Diffusion Model [65.58911408026748]
We propose Grounded-Dreamer to generate 3D assets that can accurately follow complex, compositional text prompts.
We first advocate leveraging text-guided 4-view images as the bottleneck in the text-to-3D pipeline.
We then introduce an attention refocusing mechanism to encourage text-aligned 4-view image generation.
arXiv Detail & Related papers (2024-04-28T04:05:10Z) - Diffusion Models are Geometry Critics: Single Image 3D Editing Using Pre-Trained Diffusion Priors [24.478875248825563]
We propose a novel image editing technique that enables 3D manipulations on single images.
Our method directly leverages powerful image diffusion models trained on a broad spectrum of text-image pairs.
Our method can generate high-quality 3D-aware image edits with large viewpoint transformations and high appearance and shape consistency with the input image.
arXiv Detail & Related papers (2024-03-18T06:18:59Z) - 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features [70.50665869806188]
3DiffTection is a state-of-the-art method for 3D object detection from single images.
We fine-tune a diffusion model to perform novel view synthesis conditioned on a single image.
We further train the model on target data with detection supervision.
arXiv Detail & Related papers (2023-11-07T23:46:41Z) - Wonder3D: Single Image to 3D using Cross-Domain Diffusion [105.16622018766236]
Wonder3D is a novel method for efficiently generating high-fidelity textured meshes from single-view images.
To holistically improve the quality, consistency, and efficiency of image-to-3D tasks, we propose a cross-domain diffusion model.
arXiv Detail & Related papers (2023-10-23T15:02:23Z) - Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z) - HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise
Estimation [43.83459204345063]
We propose a novel approach that combines multiple noise estimation processes with a pretrained 2D diffusion prior.
Results show that the proposed approach can generate high-quality details compared to the baselines.
arXiv Detail & Related papers (2023-07-30T09:46:22Z) - Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion
Prior [36.40582157854088]
In this work, we investigate the problem of creating high-fidelity 3D content from only a single image.
We leverage prior knowledge from a well-trained 2D diffusion model to act as 3D-aware supervision for 3D creation.
Our method presents the first attempt to achieve high-quality 3D creation from a single image for general objects and enables various applications such as text-to-3D creation and texture editing.
arXiv Detail & Related papers (2023-03-24T17:54:22Z) - Geometric Processing for Image-based 3D Object Modeling [2.6397379133308214]
This article focuses on introducing the state-of-the-art methods of three major components of geometric processing: 1) geo-referencing; 2) Image dense matching 3) texture mapping.
The largely automated geometric processing of images in a 3D object reconstruction workflow, is becoming a critical part of the reality-based 3D modeling.
arXiv Detail & Related papers (2021-06-27T18:33:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.