DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model
- URL: http://arxiv.org/abs/2304.02827v1
- Date: Thu, 6 Apr 2023 02:27:22 GMT
- Title: DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model
- Authors: Hoigi Seo, Hayeon Kim, Gwanghyun Kim, Se Young Chun
- Abstract summary: We propose a novel pipeline to generate a high-quality 3D NeRF model from a text prompt or a single image.
DITTO-NeRF consists of constructing a high-quality partial 3D object for limited in-boundary (IB) angles using the given or text-generated 2D image from the frontal view.
We propose progressive 3D object reconstruction schemes in terms of scales (low to high resolution), angles (IB angles initially to outer-boundary (OB) later), and masks (object to background boundary) in our DITTO-NeRF.
- Score: 15.091263190886337
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The increasing demand for high-quality 3D content creation has motivated the
development of automated methods for creating 3D object models from a single
image and/or from a text prompt. However, the reconstructed 3D objects using
state-of-the-art image-to-3D methods still exhibit low correspondence to the
given image and low multi-view consistency. Recent state-of-the-art text-to-3D
methods are also limited, yielding 3D samples with low diversity per prompt and
long synthesis times. To address these challenges, we propose DITTO-NeRF, a
novel pipeline to generate a high-quality 3D NeRF model from a text prompt or a
single image. Our DITTO-NeRF consists of constructing a high-quality partial
3D object for limited in-boundary (IB) angles using the given or text-generated
2D image from the frontal view and then iteratively reconstructing the
remaining 3D NeRF using an inpainting latent diffusion model. We propose
progressive 3D
object reconstruction schemes in terms of scales (low to high resolution),
angles (IB angles initially to outer-boundary (OB) later), and masks (object to
background boundary) in our DITTO-NeRF so that high-quality information on IB
can be propagated into OB. Our DITTO-NeRF outperforms state-of-the-art methods
in terms of fidelity and diversity qualitatively and quantitatively with much
faster training times than prior image/text-to-3D methods such as DreamFusion
and NeuralLift-360.
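To make the progressive scheme concrete, here is a minimal, runnable sketch of what the three schedules (scales, angles, masks) could look like over the iterations. The linear ramps and the specific numbers (64 to 512 pixels, a 30-degree IB half-angle) are illustrative assumptions, not values from the paper.

```python
# Sketch of DITTO-NeRF's three progressive schedules, as described in the
# abstract. All concrete numbers and the linear ramps are assumptions made
# for illustration only.

def lerp(a, b, u):
    return a + (b - a) * u

def progressive_schedule(steps=8, ib_half_angle=30.0):
    for t in range(steps):
        u = t / (steps - 1)
        resolution = int(lerp(64, 512, u))          # scales: low -> high resolution
        half_angle = lerp(ib_half_angle, 180.0, u)  # angles: IB first, then toward OB
        mask_ratio = u                              # masks: object -> background boundary
        yield t, resolution, half_angle, mask_ratio

# At each iteration, the rendered view at the scheduled angle/resolution would
# be inpainted by a latent diffusion model and fit back into the NeRF, so
# high-quality IB information propagates into OB.
for t, res, ang, m in progressive_schedule():
    print(f"iter {t}: render {res}x{res}, cover +/-{ang:.0f} deg, mask ratio {m:.2f}")
```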
Related papers
- LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation [73.36690511083894]
This paper introduces LN3Diff, a novel framework that addresses the need for a unified 3D diffusion pipeline.
Our approach harnesses a 3D-aware architecture and a variational autoencoder to encode the input image into a structured, compact 3D latent space.
It achieves state-of-the-art performance on ShapeNet for 3D generation and demonstrates superior performance in monocular 3D reconstruction and conditional 3D generation.
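As a rough illustration of encoding an image into a compact, structured 3D latent with a VAE, the toy PyTorch module below maps a 256x256 image to a triplane-shaped posterior sample. The triplane layout and all dimensions are assumptions for illustration, not LN3Diff's actual architecture.

```python
# Toy sketch (not LN3Diff): a VAE-style encoder producing a structured,
# compact 3D latent, here assumed to be a triplane of three 32x32 planes.
import torch
import torch.nn as nn

class Image3DLatentEncoder(nn.Module):
    def __init__(self, latent_ch=4, latent_res=32):
        super().__init__()
        self.backbone = nn.Sequential(                 # 256 -> 128 -> 64 -> 32
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.SiLU(),
            nn.Conv2d(128, 3 * latent_ch * 2, 4, stride=2, padding=1),
        )
        self.latent_ch, self.latent_res = latent_ch, latent_res

    def forward(self, img):
        h = self.backbone(img)                         # (B, 3*C*2, 32, 32)
        mu, logvar = h.chunk(2, dim=1)                 # VAE posterior parameters
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        # Reshape into three axis-aligned planes: (B, 3, C, 32, 32).
        return z.view(-1, 3, self.latent_ch, self.latent_res, self.latent_res)

planes = Image3DLatentEncoder()(torch.randn(1, 3, 256, 256))
print(planes.shape)  # torch.Size([1, 3, 4, 32, 32])
```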
arXiv Detail & Related papers (2024-03-18T17:54:34Z) - 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network to synthesize new content with higher quality by exploiting the natural image prior of the 2D diffusion model and the global 3D information of the current scene.
Our approach supports a wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.
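One common way to apply a 2D diffusion model's natural-image prior to a coarse rendering is image-to-image refinement. The sketch below uses the public diffusers pipeline as a stand-in; it is not 3D-SceneDreamer's refinement network, which additionally conditions on global 3D scene information, and the input file name is hypothetical.

```python
# Illustrative img2img refinement of a coarse rendering with a 2D diffusion
# prior (requires a CUDA GPU and the diffusers package).
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

coarse = Image.open("coarse_render.png").convert("RGB").resize((512, 512))
refined = pipe(
    prompt="a cozy living room, photorealistic",
    image=coarse,
    strength=0.4,  # low strength keeps the scene's 3D layout, adds detail
).images[0]
refined.save("refined_render.png")
```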
arXiv Detail & Related papers (2024-03-14T14:31:22Z) - ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models [65.22994156658918]
We present a method that learns to generate multi-view images in a single denoising process from real-world data.
We design an autoregressive generation scheme that renders more 3D-consistent images at any viewpoint.
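The structural idea of autoregressive multi-view generation can be sketched as a loop in which each new viewpoint is generated conditioned on all previously generated views; `generate_view` below is a hypothetical placeholder for the pose-conditioned denoising step, not ViewDiff's API.

```python
# Structural sketch of autoregressive multi-view generation: conditioning each
# new view on the history of generated views is what encourages 3D consistency.

def generate_view(prompt, camera_pose, previous_views):
    # Placeholder: a real implementation would run a pose-conditioned
    # denoising process cross-attending to `previous_views`.
    return {"pose": camera_pose, "conditioned_on": len(previous_views)}

def autoregressive_views(prompt, camera_poses):
    views = []
    for pose in camera_poses:
        views.append(generate_view(prompt, pose, views))  # condition on history
    return views

poses = [{"azimuth": a, "elevation": 15.0} for a in range(0, 360, 45)]
for v in autoregressive_views("a wooden chair", poses):
    print(v)
```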
arXiv Detail & Related papers (2024-03-04T07:57:05Z) - GO-NeRF: Generating Virtual Objects in Neural Radiance Fields [75.13534508391852]
GO-NeRF is capable of utilizing scene context for high-quality and harmonious 3D object generation within an existing NeRF.
Our method employs a compositional rendering formulation that allows the generated 3D objects to be seamlessly composited into the scene.
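A compositional rendering formulation is commonly realized by blending the object's and the scene's density and color along shared ray samples. The sketch below shows one standard density-addition variant, which may differ from GO-NeRF's exact formulation.

```python
# Sketch of compositing an object field into a scene field along one ray via
# density addition and standard volume rendering (one common formulation).
import numpy as np

def composite_render(sigma_scene, rgb_scene, sigma_obj, rgb_obj, deltas):
    sigma = sigma_scene + sigma_obj                        # combined density
    denom = np.maximum(sigma, 1e-8)[:, None]
    rgb = (sigma_scene[:, None] * rgb_scene +
           sigma_obj[:, None] * rgb_obj) / denom           # density-weighted color
    alpha = 1.0 - np.exp(-sigma * deltas)                  # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0)            # final ray color

n = 64
color = composite_render(np.random.rand(n), np.random.rand(n, 3),
                         np.random.rand(n), np.random.rand(n, 3),
                         np.full(n, 0.05))
print(color)
```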
arXiv Detail & Related papers (2024-01-11T08:58:13Z) - PI3D: Efficient Text-to-3D Generation with Pseudo-Image Diffusion [18.82883336156591]
We present PI3D, a framework that fully leverages the pre-trained text-to-image diffusion models' ability to generate high-quality 3D shapes from text prompts in minutes.
PI3D generates a single 3D shape from text in only 3 minutes, and its quality is validated to outperform that of existing 3D generative models by a large margin.
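The pseudo-image idea can be pictured as laying out a triplane 3D representation as an ordinary 2D image so that a pre-trained text-to-image diffusion model can learn to generate it; the 1x3 tiling below is an assumed layout for illustration.

```python
# Sketch of the "pseudo-image" idea: tile the three planes of a triplane side
# by side so a 2D diffusion model can treat the 3D representation as an image.
import numpy as np

triplane = np.random.rand(3, 32, 32, 4)                # XY, XZ, YZ planes, 4 channels
pseudo_image = np.concatenate(list(triplane), axis=1)  # tile planes side by side
print(pseudo_image.shape)                              # (32, 96, 4)
```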
arXiv Detail & Related papers (2023-12-14T16:04:34Z) - EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior [59.25950280610409]
We propose a robust high-quality 3D content generation pipeline by exploiting orthogonal-view image guidance.
In this paper, we introduce a novel 2D diffusion model that generates an image consisting of four sub-images based on the given text prompt.
We also present a 3D synthesis network that can further improve the details of the generated 3D contents.
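Downstream code would typically split such a four-sub-image output back into individual views. The sketch below assumes a 2x2 grid layout and a hypothetical file name, since the paper defines its own arrangement.

```python
# Sketch of splitting a generated four-sub-image into four orthogonal views,
# assuming a 2x2 grid layout (an assumption; see the paper for its layout).
from PIL import Image

grid = Image.open("four_view_grid.png")        # hypothetical generated output
w, h = grid.width // 2, grid.height // 2
views = [grid.crop((c * w, r * h, (c + 1) * w, (r + 1) * h))
         for r in range(2) for c in range(2)]
for i, v in enumerate(views):
    v.save(f"view_{i}.png")
```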
arXiv Detail & Related papers (2023-08-25T07:39:26Z) - TextMesh: Generation of Realistic 3D Meshes From Text Prompts [56.2832907275291]
We propose a novel method for the generation of highly realistic-looking 3D meshes.
To this end, we extend NeRF to employ an SDF backbone, leading to improved 3D mesh extraction.
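An SDF backbone eases mesh extraction because the surface is the zero level set of the signed distance field, which marching cubes can mesh directly; the sketch below uses a toy analytic sphere SDF in place of the learned network.

```python
# Sketch of meshing an SDF's zero level set with marching cubes (requires
# scikit-image). A toy sphere SDF stands in for a learned SDF network.
import numpy as np
from skimage import measure

res = 64
xs = np.linspace(-1, 1, res)
x, y, z = np.meshgrid(xs, xs, xs, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.5        # sphere of radius 0.5

verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
print(verts.shape, faces.shape)                # mesh at the SDF zero level set
```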
arXiv Detail & Related papers (2023-04-24T20:29:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.