SceneWiz3D: Towards Text-guided 3D Scene Composition
- URL: http://arxiv.org/abs/2312.08885v1
- Date: Wed, 13 Dec 2023 18:59:30 GMT
- Title: SceneWiz3D: Towards Text-guided 3D Scene Composition
- Authors: Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin, Peiye Zhuang,
Yinghao Xu, Ceyuan Yang, Dahua Lin, Bolei Zhou, Sergey Tulyakov, Hsin-Ying
Lee
- Abstract summary: Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We are witnessing significant breakthroughs in the technology for generating
3D objects from text. Existing approaches either leverage large text-to-image
models to optimize a 3D representation or train 3D generators on object-centric
datasets. Generating entire scenes, however, remains very challenging, as a
scene contains multiple 3D objects that are diverse and scattered in space. In
this work, we
introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes
from text. We marry the locality of objects with the globality of scenes by
introducing a hybrid 3D representation: explicit for objects and implicit for
scenes. Remarkably, an object, being represented explicitly, can be either
generated from text using conventional text-to-3D approaches, or provided by
users. To configure the layout of the scene and automatically place objects, we
apply the Particle Swarm Optimization technique during the optimization
process. Furthermore, certain parts of the scene (e.g., corners and occluded
regions) receive little multi-view supervision, leading to inferior geometry.
We incorporate an RGBD panorama diffusion model to mitigate this issue,
resulting in high-quality geometry. Extensive evaluation shows that our
approach achieves superior quality to previous approaches, enabling the
generation of detailed and view-consistent 3D scenes.
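The hybrid representation pairs explicit per-object geometry with an implicit field for the surrounding scene. The abstract does not name the concrete representations, so the sketch below is one plausible instantiation rather than the paper's actual data structure: triangle meshes with rigid poses for objects, and an arbitrary density/color field for the environment (all identifiers here are hypothetical).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple
import numpy as np

@dataclass
class ExplicitObject:
    """One scene object, represented explicitly.

    A triangle mesh plus a rigid pose is an assumption for illustration;
    the paper only states that objects are explicit, and that they can be
    generated by any text-to-3D method or supplied by the user.
    """
    vertices: np.ndarray  # (V, 3) vertices in object space
    faces: np.ndarray     # (F, 3) triangle vertex indices
    pose: np.ndarray      # (4, 4) object-to-world transform

@dataclass
class HybridScene:
    """Explicit objects embedded in an implicit environment."""
    objects: List[ExplicitObject] = field(default_factory=list)
    # Implicit scene field: maps world-space points (N, 3) to
    # (density (N,), rgb (N, 3)), e.g. an MLP. Placeholder interface.
    environment: Optional[
        Callable[[np.ndarray], Tuple[np.ndarray, np.ndarray]]
    ] = None
```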
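Object placement is configured with Particle Swarm Optimization during scene optimization. The abstract does not give the objective being optimized, so the sketch below is vanilla PSO over per-object placements (x, z, yaw) with a user-supplied stand-in fitness; `pso_layout` and its parameters are illustrative names, not the paper's API.

```python
import numpy as np

def pso_layout(fitness, n_objects, n_particles=32, iters=100,
               w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0)):
    """Vanilla particle swarm search over per-object poses (x, z, yaw).

    `fitness` maps an (n_objects, 3) pose array to a scalar cost; in
    SceneWiz3D this would be a rendering-based objective, which the
    abstract does not specify, so it is a user-supplied stand-in here.
    """
    dim = n_objects * 3
    lo, hi = bounds
    rng = np.random.default_rng(0)
    pos = rng.uniform(lo, hi, (n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest = pos.copy()
    pbest_f = np.array([fitness(p.reshape(n_objects, 3)) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()

    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Standard PSO update: inertia plus pulls toward the personal
        # and global best positions seen so far.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        f = np.array([fitness(p.reshape(n_objects, 3)) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()

    return gbest.reshape(n_objects, 3)
```

A toy usage example with a purely geometric objective (again, only a stand-in for whatever SceneWiz3D actually optimizes):

```python
# Toy objective: keep three objects at least one unit apart in the x-z plane.
def toy_fitness(poses):
    d = np.linalg.norm(poses[:, None, :2] - poses[None, :, :2], axis=-1)
    i, j = np.triu_indices(len(poses), k=1)
    return float(np.maximum(0.0, 1.0 - d[i, j]).sum())

layout = pso_layout(toy_fitness, n_objects=3)  # (3, 3): x, z, yaw per object
```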
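Finally, the RGBD panorama prior supervises regions that ordinary multi-view renderings miss. The exact loss is not stated in the abstract; score distillation sampling (SDS) is the standard way to supervise a 3D representation with a diffusion prior, so this sketch assumes an SDS-style gradient, a toy cosine noise schedule, and a hypothetical noise-prediction interface `diffusion_eps(x_t, t, text_emb)`.

```python
import torch

def rgbd_panorama_sds(panorama_rgbd, diffusion_eps, text_emb,
                      num_timesteps=1000):
    """SDS-style loss from an (assumed) RGBD panorama diffusion prior.

    `panorama_rgbd`: differentiably rendered panorama, shape (1, 4, H, W)
    with RGB + depth channels. `diffusion_eps(x_t, t, text_emb)` is a
    hypothetical noise-prediction model; the paper's interface is not given.
    """
    t = torch.randint(1, num_timesteps, (1,), device=panorama_rgbd.device)
    # Toy cosine schedule standing in for the model's real one.
    alpha_bar = torch.cos(t.float() / num_timesteps * torch.pi / 2) ** 2
    noise = torch.randn_like(panorama_rgbd)
    # Noise the rendering; detach so gradients flow only via the SDS term.
    x_t = (alpha_bar.sqrt() * panorama_rgbd.detach()
           + (1.0 - alpha_bar).sqrt() * noise)
    eps_pred = diffusion_eps(x_t, t, text_emb)
    # SDS: treat (eps_pred - noise) as the gradient w.r.t. the rendering,
    # skipping the U-Net Jacobian.
    grad = (eps_pred - noise).detach()
    return (grad * panorama_rgbd).sum()
```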
Related papers
- Zero-Shot Multi-Object Scene Completion [59.325611678171974] (2024-03-21)
We present a 3D scene completion method that recovers the complete geometry of multiple unseen objects in complex scenes from a single RGB-D image.
Our method outperforms the current state-of-the-art on both synthetic and real-world datasets.
- 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106] (2024-03-14)
We propose a generative refinement network that synthesizes new content of higher quality by exploiting the natural-image prior of a 2D diffusion model together with the global 3D information of the current scene.
Our approach supports a wide variety of scene generation settings and arbitrary camera trajectories with improved visual quality and 3D consistency.
- TeMO: Towards Text-Driven 3D Stylization for Multi-Object Meshes [67.5351491691866] (2023-12-07)
We present a novel framework, dubbed TeMO, to parse multi-object 3D scenes and edit their styles.
Our method synthesizes high-quality stylized content and outperforms existing methods over a wide range of multi-object 3D meshes.
- Generating Visual Spatial Description via Holistic 3D Scene Understanding [88.99773815159345] (2023-05-19)
Visual spatial description (VSD) aims to generate text that describes the spatial relations of given objects within images.
With an external 3D scene extractor, we obtain 3D objects and scene features for input images.
We construct a target-object-centered 3D spatial scene graph (Go3D-S2G) to model the spatial semantics of target objects within the holistic 3D scene.
- CC3D: Layout-Conditioned Generation of Compositional 3D Scenes [49.281006972028194] (2023-03-21)
We introduce CC3D, a conditional generative model that synthesizes complex 3D scenes conditioned on 2D semantic scene layouts.
Our evaluations on the synthetic 3D-FRONT and real-world KITTI-360 datasets demonstrate that our model generates scenes of improved visual and geometric quality.