DisCoScene: Spatially Disentangled Generative Radiance Fields for
Controllable 3D-aware Scene Synthesis
- URL: http://arxiv.org/abs/2212.11984v1
- Date: Thu, 22 Dec 2022 18:59:59 GMT
- Title: DisCoScene: Spatially Disentangled Generative Radiance Fields for
Controllable 3D-aware Scene Synthesis
- Authors: Yinghao Xu, Menglei Chai, Zifan Shi, Sida Peng, Ivan Skorokhodov,
Aliaksandr Siarohin, Ceyuan Yang, Yujun Shen, Hsin-Ying Lee, Bolei Zhou,
Sergey Tulyakov
- Abstract summary: DisCoScene is a 3D-aware generative model for high-quality and controllable scene synthesis.
It disentangles the whole scene into object-centric generative radiance fields by learning on only 2D images with global-local discrimination.
We demonstrate state-of-the-art performance on many scene datasets, including the challenging Waymo outdoor dataset.
- Score: 90.32352050266104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing 3D-aware image synthesis approaches mainly focus on generating a
single canonical object and show limited capacity in composing a complex scene
containing a variety of objects. This work presents DisCoScene: a 3D-aware
generative model for high-quality and controllable scene synthesis. The key
ingredient of our method is a very abstract object-level representation (i.e.,
3D bounding boxes without semantic annotation) as the scene layout prior, which
is simple to obtain, general to describe various scene contents, and yet
informative to disentangle objects and background. Moreover, it serves as an
intuitive user control for scene editing. Based on such a prior, the proposed
model spatially disentangles the whole scene into object-centric generative
radiance fields by learning on only 2D images with the global-local
discrimination. Our model obtains the generation fidelity and editing
flexibility of individual objects while being able to efficiently compose
objects and the background into a complete scene. We demonstrate
state-of-the-art performance on many scene datasets, including the challenging
Waymo outdoor dataset. Project page:
https://snap-research.github.io/discoscene/
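
The abstract describes the layout-conditioned composition only at a high level. The sketch below illustrates one way per-object radiance fields could be composited under a 3D-bounding-box layout prior: world-space samples are mapped into each box's canonical frame, queried against an object field, and blended with a background field by density weighting. All names here (toy_object_field, toy_background_field, world_to_box, composite) and the density-weighted blending rule are illustrative assumptions, not the paper's actual implementation.

import numpy as np

# Minimal sketch: compositing object-centric radiance fields under a
# bounding-box layout prior. The real model uses learned generators; the
# toy_* functions below are hypothetical stand-ins for illustration only.

def toy_object_field(x_local, latent):
    """Stand-in for a per-object generative radiance field.
    Returns (density, rgb) for points in the object's canonical frame."""
    density = np.exp(-np.sum(x_local**2, axis=-1)) * (1.0 + latent)
    rgb = np.tile(np.clip(np.array([latent, 0.5, 1.0 - latent]), 0, 1),
                  (x_local.shape[0], 1))
    return density, rgb

def toy_background_field(x_world):
    """Stand-in for the background radiance field."""
    density = 0.05 * np.ones(x_world.shape[0])
    rgb = np.full((x_world.shape[0], 3), 0.2)
    return density, rgb

def world_to_box(x_world, center, size, yaw):
    """Map world-space points into a box's canonical [-1, 1]^3 frame."""
    c, s = np.cos(-yaw), np.sin(-yaw)
    R = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return ((x_world - center) @ R.T) / (0.5 * np.asarray(size))

def composite(x_world, boxes, latents):
    """Density-weighted composition of object and background fields
    (an assumed rule; the abstract only states that objects and the
    background are composed into a complete scene)."""
    d_bg, c_bg = toy_background_field(x_world)
    densities, colors = [d_bg], [c_bg]
    for (center, size, yaw), z in zip(boxes, latents):
        x_local = world_to_box(x_world, np.asarray(center), size, yaw)
        inside = np.all(np.abs(x_local) <= 1.0, axis=-1)
        d, c = toy_object_field(x_local, z)
        d = np.where(inside, d, 0.0)   # each object contributes only inside its box
        densities.append(d)
        colors.append(c)
    D = np.stack(densities)            # (num_fields, num_points)
    C = np.stack(colors)               # (num_fields, num_points, 3)
    total = D.sum(axis=0)
    w = D / np.maximum(total, 1e-8)    # density-weighted blending of colors
    rgb = np.einsum('fp,fpc->pc', w, C)
    return total, rgb

# Example: two boxes with different latent codes, queried at a few points.
boxes = [((0.0, 0.0, 0.0), (2.0, 2.0, 2.0), 0.0),
         ((3.0, 0.0, 0.0), (1.0, 1.0, 1.0), np.pi / 4)]
latents = [0.2, 0.9]
points = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
density, rgb = composite(points, boxes, latents)
print(density.shape, rgb.shape)  # (3,) (3, 3)

In the actual model, the stand-in fields would presumably be replaced by learned generators conditioned on per-object latent codes, and the composed density and color would be alpha-composited along camera rays to render images for the global (scene) and local (object) discriminators.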
Related papers
- 3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting [100.94916668527544]
Existing methods solely focus on either 2D individual object or 3D global scene editing.
We propose 3DitScene, a novel and unified scene editing framework.
It enables seamless editing from 2D to 3D, allowing precise control over scene composition and individual objects.
arXiv Detail & Related papers (2024-05-28T17:59:01Z)
- Disentangled 3D Scene Generation with Layout Learning [109.03233745767062]
We introduce a method to generate 3D scenes that are disentangled into their component objects.
Our key insight is that objects can be discovered by finding parts of a 3D scene that, when rearranged spatially, still produce valid configurations of the same scene.
We show that despite its simplicity, our approach successfully generates 3D scenes decomposed into individual objects.
arXiv Detail & Related papers (2024-02-26T18:54:15Z)
- Style-Consistent 3D Indoor Scene Synthesis with Decoupled Objects [84.45345829270626]
Controllable 3D indoor scene synthesis stands at the forefront of technological progress.
Current methods for scene stylization are limited to applying styles to the entire scene.
We introduce a unique pipeline designed for synthesizing 3D indoor scenes.
arXiv Detail & Related papers (2024-01-24T03:10:36Z)
- UrbanGIRAFFE: Representing Urban Scenes as Compositional Generative Neural Feature Fields [22.180286908121946]
We propose UrbanGIRAFFE, which uses a coarse 3D panoptic prior to guide a 3D-aware generative model.
Our model is compositional and controllable as it breaks down the scene into stuff, objects, and sky.
With proper loss functions, our approach facilitates photorealistic 3D-aware image synthesis with diverse controllability.
arXiv Detail & Related papers (2023-03-24T17:28:07Z)
- Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes [68.14127205949073]
We propose a novel Global-Local training framework for synthesizing a 3D scene using object proxies.
We show that using proxies allows a wide variety of editing options, such as adjusting the placement of each independent object.
Our results show that Set-the-Scene offers a powerful solution for scene synthesis and manipulation.
arXiv Detail & Related papers (2023-03-23T17:17:29Z)
- gCoRF: Generative Compositional Radiance Fields [80.45269080324677]
3D generative models of objects enable photorealistic image synthesis with 3D control.
Existing methods model the scene as a global scene representation, ignoring the compositional aspect of the scene.
We present a compositional generative model, where each semantic part of the object is represented as an independent 3D representation.
arXiv Detail & Related papers (2022-10-31T14:10:44Z)
- Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering [42.37007176376849]
We present a novel neural scene rendering system, which learns an object-compositional neural radiance field and produces realistic rendering for a cluttered, real-world scene.
To survive the training in heavily cluttered scenes, we propose a scene-guided training strategy to solve the 3D space ambiguity in the occluded regions and learn sharp boundaries for each object.
arXiv Detail & Related papers (2021-09-04T11:37:18Z)
- GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields [45.21191307444531]
Deep generative models allow for photorealistic image synthesis at high resolutions.
But for many applications, this is not enough: content creation also needs to be controllable.
Our key hypothesis is that incorporating a compositional 3D scene representation into the generative model leads to more controllable image synthesis.
arXiv Detail & Related papers (2020-11-24T14:14:15Z)