CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion
- URL: http://arxiv.org/abs/2305.16283v5
- Date: Sat, 30 Dec 2023 21:49:22 GMT
- Title: CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion
- Authors: Guangyao Zhai, Evin Pınar Örnek, Shun-Cheng Wu, Yan Di, Federico Tombari, Nassir Navab, Benjamin Busam
- Abstract summary: We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes.
Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes.
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
- Score: 83.30168660888913
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Controllable scene synthesis aims to create interactive environments for
various industrial use cases. Scene graphs provide a highly suitable interface
to facilitate these applications by abstracting the scene context in a compact
manner. Existing methods, reliant on retrieval from extensive databases or
pre-trained shape embeddings, often overlook scene-object and object-object
relationships, leading to inconsistent results due to their limited generation
capacity. To address this issue, we present CommonScenes, a fully generative
model that converts scene graphs into corresponding controllable 3D scenes,
which are semantically realistic and conform to commonsense. Our pipeline
consists of two branches, one predicting the overall scene layout via a
variational auto-encoder and the other generating compatible shapes via latent
diffusion, capturing global scene-object and local inter-object relationships
in the scene graph while preserving shape diversity. The generated scenes can
be manipulated by editing the input scene graph and sampling the noise in the
diffusion model. Because no existing scene graph dataset offers high-quality
object-level meshes with relations, we also construct SG-FRONT, enriching the
off-the-shelf indoor dataset 3D-FRONT with additional scene graph labels.
Extensive experiments are conducted on SG-FRONT where CommonScenes shows clear
advantages over other methods regarding generation consistency, quality, and
diversity. Code and the dataset will be released upon acceptance.
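The two-branch pipeline described above can be summarized in code. The following is a minimal sketch under stated assumptions, not the authors' implementation: the module names, tensor shapes, and the simplified denoising loop are all illustrative, with a small MLP standing in for the paper's scene-graph encoder.

```python
# Illustrative sketch only: names and shapes are assumptions, not CommonScenes code.
import torch
import torch.nn as nn

class GraphEncoder(nn.Module):
    """Stand-in for the shared scene-graph encoder producing per-object features."""
    def __init__(self, num_classes, dim=128):
        super().__init__()
        self.embed = nn.Embedding(num_classes, dim)
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, node_labels):               # (N,) int64 object categories
        return self.mlp(self.embed(node_labels))  # (N, dim) per-object features

class LayoutVAE(nn.Module):
    """Branch 1: a variational auto-encoder head predicting a 7-DoF box per object."""
    def __init__(self, dim=128, z_dim=32):
        super().__init__()
        self.to_stats = nn.Linear(dim, 2 * z_dim)
        self.decode = nn.Linear(dim + z_dim, 7)   # size (3), location (3), yaw (1)

    def forward(self, h):
        mu, logvar = self.to_stats(h).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.decode(torch.cat([h, z], dim=-1))         # (N, 7) boxes

class ShapeDiffusion(nn.Module):
    """Branch 2: denoises a per-object shape latent, conditioned on graph features."""
    def __init__(self, dim=128, latent=64):
        super().__init__()
        self.latent = latent
        self.eps = nn.Sequential(nn.Linear(latent + dim + 1, 256), nn.ReLU(),
                                 nn.Linear(256, latent))

    def sample(self, h, steps=50):
        x = torch.randn(h.size(0), self.latent)
        for t in reversed(range(steps)):          # crude denoising loop, intuition only
            t_emb = torch.full((h.size(0), 1), t / steps)
            x = x - self.eps(torch.cat([x, h, t_emb], dim=-1)) / steps
        return x                                  # (N, latent) codes for a mesh decoder

labels = torch.tensor([3, 7, 7])                  # e.g. bed, nightstand, nightstand
h = GraphEncoder(num_classes=32)(labels)
boxes, shapes = LayoutVAE()(h), ShapeDiffusion().sample(h)
```

The point the sketch makes is that both branches consume the same per-object graph features, which is how global scene-object and local inter-object relationships can shape layout and geometry jointly; editing the graph changes `h`, while re-sampling the diffusion noise varies the shapes.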
Related papers
- Mixed Diffusion for 3D Indoor Scene Synthesis [55.94569112629208]
We present MiDiffusion, a novel mixed discrete-continuous diffusion model architecture.
We represent a scene layout by a 2D floor plan and a set of objects, each defined by its category, location, size, and orientation (a hypothetical encoding of this representation is sketched after this list).
Our experimental results demonstrate that MiDiffusion substantially outperforms state-of-the-art autoregressive and diffusion models in floor-conditioned 3D scene synthesis.
arXiv Detail & Related papers (2024-05-31T17:54:52Z)
- EchoScene: Indoor Scene Generation via Information Echo over Scene Graph Diffusion [77.0556470600979]
We present EchoScene, an interactive and controllable generative model that generates 3D indoor scenes on scene graphs.
Existing methods struggle to handle scene graphs due to varying numbers of nodes, multiple edge combinations, and manipulator-induced node-edge operations.
arXiv Detail & Related papers (2024-05-02T00:04:02Z)
- 3D scene generation from scene graphs and self-attention [51.49886604454926]
We present a variant of the conditional variational autoencoder (cVAE) model to synthesize 3D scenes from scene graphs and floor plans.
We exploit the properties of self-attention layers to capture high-level relationships between objects in a scene.
arXiv Detail & Related papers (2024-04-02T12:26:17Z)
- Planner3D: LLM-enhanced graph prior meets 3D indoor scene explicit regularization [31.52569918586902]
3D scene synthesis has diverse applications across a spectrum of industries such as robotics, films, and video games.
In this paper, we aim to generate realistic and reasonable 3D indoor scenes from scene graphs.
Our method achieves better 3D scene synthesis, especially in terms of scene-level fidelity.
arXiv Detail & Related papers (2024-03-19T15:54:48Z)
- 3D Scene Diffusion Guidance using Scene Graphs [3.207455883863626]
We propose a novel approach for 3D scene diffusion guidance using scene graphs.
To leverage the relative spatial information that scene graphs provide, we make use of relational graph convolutional blocks within our denoising network (a minimal sketch of such a block appears after this list).
arXiv Detail & Related papers (2023-08-08T06:16:37Z)
- SGAligner: 3D Scene Alignment with Scene Graphs [84.01002998166145]
Building 3D scene graphs has emerged as a topic in scene representation for several embodied AI applications.
We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial.
We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios.
arXiv Detail & Related papers (2023-04-28T14:39:22Z)
- Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs [85.54212143154986]
Controllable scene synthesis consists of generating 3D information that satisfies underlying specifications.
Scene graphs are representations of a scene composed of objects (nodes) and inter-object relationships (edges).
We propose the first work that directly generates shapes from a scene graph in an end-to-end manner.
arXiv Detail & Related papers (2021-08-19T17:59:07Z)
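The MiDiffusion entry above represents a scene layout as a 2D floor plan plus a set of objects, each with a category, location, size, and orientation. The sketch below is a hypothetical encoding of that representation; the field names and types are assumptions, not the paper's code. It makes explicit why a mixed discrete-continuous diffusion model fits: the category is discrete while the remaining attributes are continuous.

```python
# Hypothetical data structure for the layout described in the MiDiffusion entry.
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    category: int                          # discrete attribute
    location: tuple[float, float, float]   # continuous attributes below
    size: tuple[float, float, float]
    orientation: float                     # yaw angle in radians

@dataclass
class SceneLayout:
    floor_plan: list[tuple[float, float]]  # 2D polygon the synthesis is conditioned on
    objects: list[SceneObject] = field(default_factory=list)

layout = SceneLayout(
    floor_plan=[(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)],
    objects=[SceneObject(category=5, location=(1.0, 0.0, 2.0),
                         size=(2.0, 0.5, 1.6), orientation=0.0)],
)
```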
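The scene-graph guidance entry above mentions relational graph convolutional blocks inside the denoising network. Below is a minimal, self-contained sketch of one such block, assuming one weight matrix per edge type (the standard R-GCN formulation); it illustrates the mechanism only and is not that paper's network.

```python
# Minimal relational graph convolution: messages are transformed by a weight
# matrix chosen per edge type, then summed at each destination node.
import torch
import torch.nn as nn

class RelationalGraphConv(nn.Module):
    def __init__(self, in_dim, out_dim, num_relations):
        super().__init__()
        self.rel_weights = nn.Parameter(torch.randn(num_relations, in_dim, out_dim) * 0.02)
        self.self_loop = nn.Linear(in_dim, out_dim)

    def forward(self, x, edge_index, edge_type):
        # x: (N, in_dim) node features; edge_index: (2, E); edge_type: (E,)
        src, dst = edge_index
        msgs = torch.einsum('ei,eio->eo', x[src], self.rel_weights[edge_type])
        out = self.self_loop(x).index_add(0, dst, msgs)  # aggregate per destination
        return torch.relu(out)

# Toy scene graph: 3 objects, edge types 0 ("left of") and 1 ("close by").
x = torch.randn(3, 16)
edge_index = torch.tensor([[0, 1], [1, 2]])  # edges 0->1 and 1->2
edge_type = torch.tensor([0, 1])
h = RelationalGraphConv(16, 32, num_relations=2)(x, edge_index, edge_type)
```

Conditioning a denoiser this way lets relative spatial relations ("left of", "close by") steer where the diffusion process places objects.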
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.