3DDesigner: Towards Photorealistic 3D Object Generation and Editing with
Text-guided Diffusion Models
- URL: http://arxiv.org/abs/2211.14108v3
- Date: Thu, 12 Oct 2023 08:05:18 GMT
- Title: 3DDesigner: Towards Photorealistic 3D Object Generation and Editing with
Text-guided Diffusion Models
- Authors: Gang Li, Heliang Zheng, Chaoyue Wang, Chang Li, Changwen Zheng,
Dacheng Tao
- Abstract summary: We equip text-guided diffusion models to achieve 3D-consistent generation.
We study 3D local editing and propose a two-step solution.
We extend our model to perform one-shot novel view synthesis.
- Score: 71.25937799010407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-guided diffusion models have shown superior performance in image/video
generation and editing. However, few explorations have been performed in 3D
scenarios. In this paper, we discuss three fundamental and interesting problems
on this topic. First, we equip text-guided diffusion models to achieve
3D-consistent generation. Specifically, we integrate a NeRF-like neural field
to generate low-resolution coarse results for a given camera view. Such results
can provide 3D priors as condition information for the following diffusion
process. During denoising diffusion, we further enhance the 3D consistency by
modeling cross-view correspondences with a novel two-stream (corresponding to
two different views) asynchronous diffusion process. Second, we study 3D local
editing and propose a two-step solution that can generate 360-degree
manipulated results by editing an object from a single view. In Step 1, we
perform 2D local editing by blending the predicted noises. In Step 2, we
conduct a noise-to-text inversion process that maps 2D blended noises into the
view-independent text embedding space. Once the corresponding text embedding is
obtained, 360-degree images can be generated. Last but not least, we extend our
model to perform one-shot novel view synthesis by fine-tuning on a single
image, showing for the first time the potential of leveraging text guidance for novel view
synthesis. Extensive experiments and various applications show the prowess of
our 3DDesigner. The project page is available at
https://3ddesigner-diffusion.github.io/.
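To make the first contribution more concrete, the sketch below illustrates the general idea of conditioning a noise predictor on a low-resolution coarse (NeRF-like) rendering and denoising two views asynchronously so that each stream conditions on the other. This is a minimal, hypothetical sketch, not the paper's implementation: the ToyDenoiser, the plain DDPM schedule, the channel-concatenation conditioning, and the fixed timestep offset are all stand-in assumptions.

```python
# Hypothetical sketch only; the real 3DDesigner architecture, NeRF module, and
# asynchronous schedule are not reproduced here.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyDenoiser(nn.Module):
    """Toy noise predictor: noisy view + upsampled coarse render + other stream's latent."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(9, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, x_noisy, coarse, other_view):
        return self.net(torch.cat([x_noisy, coarse, other_view], dim=1))

T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def ddpm_step(model, x_t, t, coarse, other_view):
    """One ancestral DDPM reverse step for a single stream."""
    eps = model(x_t, coarse, other_view)
    mean = (x_t - (1 - alphas[t]) / torch.sqrt(1 - alpha_bar[t]) * eps) / torch.sqrt(alphas[t])
    if t > 0:
        mean = mean + torch.sqrt(betas[t]) * torch.randn_like(x_t)
    return mean

@torch.no_grad()
def two_stream_async_sampling(model, coarse_a, coarse_b, offset=50, size=64):
    """Denoise two views jointly; stream B starts `offset` steps after stream A,
    so the views sit at different noise levels while conditioning on each other."""
    x_a = torch.randn(1, 3, size, size)
    x_b = torch.randn(1, 3, size, size)
    for s in range(T + offset):
        if s < T:                         # stream A: timesteps T-1 .. 0
            x_a = ddpm_step(model, x_a, T - 1 - s, coarse_a, x_b)
        if s >= offset:                   # stream B lags by `offset` steps
            x_b = ddpm_step(model, x_b, T - 1 - (s - offset), coarse_b, x_a)
    return x_a, x_b

model = ToyDenoiser()
coarse_a = F.interpolate(torch.rand(1, 3, 16, 16), size=64)  # stand-ins for low-res NeRF renders
coarse_b = F.interpolate(torch.rand(1, 3, 16, 16), size=64)
view_a, view_b = two_stream_async_sampling(model, coarse_a, coarse_b)
```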
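The two-step editing solution can likewise be summarized schematically. Below, Step 1 blends the noise predicted under the original and edited conditions inside a user mask, and Step 2 optimizes a text embedding whose predicted noise reproduces that blend, so the recovered embedding can drive renders from any view. The ToyCondDenoiser, the random embeddings, the box mask, and the optimizer settings are illustrative assumptions, not the paper's components.

```python
# Hypothetical sketch of the two-step 3D local editing idea.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyCondDenoiser(nn.Module):
    """Toy conditional noise predictor: image plus a spatially broadcast text embedding."""
    def __init__(self, embed_dim=8, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3 + embed_dim, ch, 3, padding=1), nn.SiLU(),
            nn.Conv2d(ch, 3, 3, padding=1),
        )

    def forward(self, x_t, cond):
        b, _, h, w = x_t.shape
        cond_map = cond.view(b, -1, 1, 1).expand(b, cond.shape[-1], h, w)
        return self.net(torch.cat([x_t, cond_map], dim=1))

def blended_noise(model, x_t, cond_orig, cond_edit, mask):
    """Step 1: blend noise predictions so only the masked region follows the edit text."""
    eps_orig = model(x_t, cond_orig)
    eps_edit = model(x_t, cond_edit)
    return mask * eps_edit + (1.0 - mask) * eps_orig

def noise_to_text_inversion(model, x_t, eps_blend, init_embed, steps=200, lr=1e-2):
    """Step 2: optimize a text embedding whose predicted noise matches the blended
    noise; the result lives in the view-independent text space."""
    embed = init_embed.clone().requires_grad_(True)
    opt = torch.optim.Adam([embed], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(model(x_t, embed), eps_blend)
        loss.backward()
        opt.step()
    return embed.detach()

# Toy usage with random stand-ins for the noisy view, prompts, and edit mask.
model = ToyCondDenoiser()
x_t = torch.randn(1, 3, 64, 64)
cond_orig, cond_edit = torch.randn(1, 8), torch.randn(1, 8)
mask = torch.zeros(1, 1, 64, 64)
mask[..., 16:48, 16:48] = 1.0
eps_blend = blended_noise(model, x_t, cond_orig, cond_edit, mask).detach()
edited_embed = noise_to_text_inversion(model, x_t, eps_blend, cond_edit)
```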
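Finally, the one-shot novel view synthesis extension amounts to briefly fine-tuning the model on a single reference image before sampling new views. The loop below uses a generic denoising objective on that one image; the schedule, learning rate, and the assumption that model(x_t, cond) predicts noise (as in the toy denoiser above) are placeholders rather than the paper's recipe.

```python
# Hypothetical one-shot fine-tuning loop; not the paper's actual objective.
import torch
import torch.nn.functional as F

betas = torch.linspace(1e-4, 0.02, 1000)
alpha_bar = torch.cumprod(1.0 - betas, dim=0)

def one_shot_finetune(model, ref_image, cond, steps=500, lr=1e-5):
    """Fine-tune a conditional noise predictor on one image so that subsequent
    sampling from other camera views stays faithful to that image."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(steps):
        t = torch.randint(0, alpha_bar.shape[0], (1,)).item()
        noise = torch.randn_like(ref_image)
        x_t = alpha_bar[t].sqrt() * ref_image + (1 - alpha_bar[t]).sqrt() * noise
        loss = F.mse_loss(model(x_t, cond), noise)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# Illustrative usage with the toy conditional denoiser from the previous sketch:
# model = one_shot_finetune(ToyCondDenoiser(), ref_image=torch.rand(1, 3, 64, 64), cond=torch.randn(1, 8))
```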
Related papers
- 3DEgo: 3D Editing on the Go! [6.072473323242202]
We introduce 3DEgo to address a novel problem of directly synthesizing 3D scenes from monocular videos guided by textual prompts.
Our framework streamlines the conventional multi-stage 3D editing process into a single-stage workflow.
3DEgo demonstrates remarkable editing precision, speed, and adaptability across a variety of video sources.
arXiv Detail & Related papers (2024-07-14T07:03:50Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- Denoising Diffusion via Image-Based Rendering [54.20828696348574]
We introduce the first diffusion model able to perform fast, detailed reconstruction and generation of real-world 3D scenes.
First, we introduce a new neural scene representation, IB-planes, that can efficiently and accurately represent large 3D scenes.
Second, we propose a denoising-diffusion framework to learn a prior over this novel 3D scene representation, using only 2D images.
arXiv Detail & Related papers (2024-02-05T19:00:45Z)
- EfficientDreamer: High-Fidelity and Robust 3D Creation via Orthogonal-view Diffusion Prior [59.25950280610409]
We propose a robust high-quality 3D content generation pipeline by exploiting orthogonal-view image guidance.
In this paper, we introduce a novel 2D diffusion model that generates an image consisting of four sub-images based on the given text prompt.
We also present a 3D synthesis network that can further improve the details of the generated 3D contents.
arXiv Detail & Related papers (2023-08-25T07:39:26Z)
- Blocks2World: Controlling Realistic Scenes with Editable Primitives [5.541644538483947]
We present Blocks2World, a novel method for 3D scene rendering and editing.
Our technique begins by extracting 3D parallelepipeds from various objects in a given scene using convex decomposition.
The next stage involves training a conditioned model that learns to generate images from the 2D-rendered convex primitives.
arXiv Detail & Related papers (2023-07-07T21:38:50Z)
- DreamSparse: Escaping from Plato's Cave with 2D Frozen Diffusion Model Given Sparse Views [20.685453627120832]
Existing methods often struggle with producing high-quality results or necessitate per-object optimization in such few-view settings.
DreamSparse is capable of synthesizing high-quality novel views for both object and scene-level images.
arXiv Detail & Related papers (2023-06-06T05:26:26Z)
- Generative Novel View Synthesis with 3D-Aware Diffusion Models [96.78397108732233]
We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image.
Our method makes use of existing 2D diffusion backbones but, crucially, incorporates geometry priors in the form of a 3D feature volume.
In addition to generating novel views, our method has the ability to autoregressively synthesize 3D-consistent sequences.
arXiv Detail & Related papers (2023-04-05T17:15:47Z)
- RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation [68.06991943974195]
We present RenderDiffusion, the first diffusion model for 3D generation and inference, trained using only monocular 2D supervision.
We evaluate RenderDiffusion on FFHQ, AFHQ, ShapeNet and CLEVR datasets, showing competitive performance for generation of 3D scenes and inference of 3D scenes from 2D images.
arXiv Detail & Related papers (2022-11-17T20:17:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.