EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior
- URL: http://arxiv.org/abs/2308.13223v2
- Date: Thu, 21 Mar 2024 06:45:32 GMT
- Title: EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior
- Authors: Zhipeng Hu, Minda Zhao, Chaoyi Zhao, Xinyue Liang, Lincheng Li, Zeng Zhao, Changjie Fan, Xiaowei Zhou, Xin Yu
- Abstract summary: We propose a robust high-quality 3D content generation pipeline by exploiting orthogonal-view image guidance.
In this paper, we introduce a novel 2D diffusion model that generates an image consisting of four sub-images based on the given text prompt.
We also present a 3D synthesis fusion network that can further improve the details of the generated 3D content.
- Score: 59.25950280610409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While image diffusion models have made significant progress in text-driven 3D content creation, they often fail to accurately capture the intended meaning of text prompts, especially for view information. This limitation leads to the Janus problem, where multi-faced 3D models are generated under the guidance of such diffusion models. In this paper, we propose a robust high-quality 3D content generation pipeline by exploiting orthogonal-view image guidance. First, we introduce a novel 2D diffusion model that generates an image consisting of four orthogonal-view sub-images based on the given text prompt. Then, the 3D content is created using this diffusion model. Notably, the generated orthogonal-view image provides strong geometric structure priors and thus improves 3D consistency. As a result, it effectively resolves the Janus problem and significantly enhances the quality of 3D content creation. Additionally, we present a 3D synthesis fusion network that can further improve the details of the generated 3D content. Both quantitative and qualitative evaluations demonstrate that our method surpasses previous text-to-3D techniques. Project page: https://efficientdreamer.github.io.
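As a rough illustration of the four-sub-image layout described in the abstract (this is not the authors' code; the function name and view ordering are assumptions), four single-object renders from orthogonal cameras can be tiled into one composite image that a diffusion model then scores jointly, which is what supplies the cross-view geometric prior:

```python
import numpy as np

def compose_orthogonal_views(front, back, left, right):
    """Tile four H x W x 3 view renders into one 2H x 2W x 3 composite.

    The 2x2 arrangement (front|right over back|left) is a hypothetical
    choice; the paper only specifies that the generated image consists
    of four orthogonal-view sub-images.
    """
    top = np.concatenate([front, right], axis=1)    # front | right
    bottom = np.concatenate([back, left], axis=1)   # back  | left
    return np.concatenate([top, bottom], axis=0)

# Four dummy 64x64 RGB renders standing in for orthogonal views.
views = [np.full((64, 64, 3), v, dtype=np.uint8) for v in (10, 20, 30, 40)]
grid = compose_orthogonal_views(*views)
print(grid.shape)  # (128, 128, 3)
```

Because all four views share one canvas, a denoising step sees them together, which is how a multi-view layout can discourage the Janus problem relative to scoring each view independently.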
Related papers
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion [18.82883336156591]
We present PI3D, a framework that fully leverages the pre-trained text-to-image diffusion models' ability to generate high-quality 3D shapes from text prompts in minutes.
PI3D generates a single 3D shape from text in only 3 minutes, and its quality is validated to outperform existing 3D generative models by a large margin.
arXiv Detail & Related papers (2023-12-14T16:04:34Z)
- Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior [52.44678180286886]
Distillation from 2D diffusion models achieves excellent generalization and rich details without any 3D data.
We propose Sherpa3D, a new text-to-3D framework that achieves high-fidelity, generalizability, and geometric consistency simultaneously.
arXiv Detail & Related papers (2023-12-11T18:59:18Z)
- X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation [61.48050470095969]
X-Dreamer is a novel approach for high-quality text-to-3D content creation.
It bridges the gap between text-to-2D and text-to-3D synthesis.
arXiv Detail & Related papers (2023-11-30T07:23:00Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- TextMesh: Generation of Realistic 3D Meshes From Text Prompts [56.2832907275291]
We propose a novel method for generating highly realistic 3D meshes.
To this end, we extend NeRF to employ an SDF backbone, leading to improved 3D mesh extraction.
arXiv Detail & Related papers (2023-04-24T20:29:41Z)
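The TextMesh entry above mentions swapping NeRF's density field for an SDF backbone to ease mesh extraction. A minimal sketch of why that helps (a toy analytic SDF, not TextMesh's learned network): an SDF is negative inside the surface and positive outside, so the mesh lies exactly on its zero level set, which a method like marching cubes can locate by finding sign changes between neighbouring grid samples:

```python
import numpy as np

# Toy signed distance field: a unit sphere sampled on a 32^3 grid.
# A learned SDF backbone provides such values everywhere in space;
# here we only locate zero crossings along one axis as a crude
# stand-in for full marching-cubes extraction.
n = 32
xs = np.linspace(-1.5, 1.5, n)
X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
sdf = np.sqrt(X**2 + Y**2 + Z**2) - 1.0  # negative inside the sphere

# Pairs of neighbours whose SDF values differ in sign straddle the surface.
crossings = np.sign(sdf[:-1]) != np.sign(sdf[1:])
print(crossings.any())  # True: the zero level set is captured by the grid
```

A raw NeRF density field has no such well-defined level set, which is why mesh extraction from it tends to be noisier.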
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.