SceneLinker: Compositional 3D Scene Generation via Semantic Scene Graph from RGB Sequences
- URL: http://arxiv.org/abs/2602.02974v1
- Date: Tue, 03 Feb 2026 01:22:07 GMT
- Title: SceneLinker: Compositional 3D Scene Generation via Semantic Scene Graph from RGB Sequences
- Authors: Seok-Young Kim, Dooyoung Kim, Woojin Cho, Hail Song, Suji Kang, Woontack Woo
- Abstract summary: We introduce SceneLinker, a framework that generates compositional 3D scenes via semantic scene graph from RGB sequences. Our work enables users to generate consistent 3D spaces from their physical environments via scene graphs, allowing them to create spatial Mixed Reality (MR) content.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce SceneLinker, a novel framework that generates compositional 3D scenes via semantic scene graph from RGB sequences. To adaptively experience Mixed Reality (MR) content based on each user's space, it is essential to generate a 3D scene that reflects the real-world layout by compactly capturing the semantic cues of the surroundings. Prior works struggled to fully capture the contextual relationship between objects or mainly focused on synthesizing diverse shapes, making it challenging to generate 3D scenes aligned with object arrangements. We address these challenges by designing a graph network with cross-check feature attention for scene graph prediction and constructing a graph-variational autoencoder (graph-VAE), which consists of a joint shape and layout block for 3D scene generation. Experiments on the 3RScan/3DSSG and SG-FRONT datasets demonstrate that our approach outperforms state-of-the-art methods in both quantitative and qualitative evaluations, even in complex indoor environments and under challenging scene graph constraints. Our work enables users to generate consistent 3D spaces from their physical environments via scene graphs, allowing them to create spatial MR content. Project page is https://scenelinker2026.github.io.
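The abstract describes a graph-VAE whose decoder contains a joint shape and layout block, mapping each scene-graph node to both an object placement and a shape code. The paper's actual architecture is not reproduced here; the following is only a minimal NumPy sketch of that idea, with all dimensions, weight shapes, and names invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(node_feats, latent_dim=8):
    # Hypothetical per-node encoder: one linear map producing (mu, logvar).
    d = node_feats.shape[1]
    W = rng.standard_normal((d, 2 * latent_dim)) * 0.1
    h = node_feats @ W
    return h[:, :latent_dim], h[:, latent_dim:]

def reparameterize(mu, logvar):
    # Standard VAE reparameterization trick: z = mu + sigma * eps.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode_joint(z, shape_dim=16):
    # Joint shape-and-layout head: each node latent decodes to a 7-D
    # layout box (center xyz, size xyz, yaw) plus a shape code.
    latent_dim = z.shape[1]
    W_layout = rng.standard_normal((latent_dim, 7)) * 0.1
    W_shape = rng.standard_normal((latent_dim, shape_dim)) * 0.1
    return z @ W_layout, z @ W_shape

# Toy scene graph: 3 object nodes with 32-D semantic features.
node_feats = rng.standard_normal((3, 32))
mu, logvar = encode(node_feats)
z = reparameterize(mu, logvar)
layout, shape_codes = decode_joint(z)
print(layout.shape, shape_codes.shape)  # (3, 7) (3, 16)
```

The point of the joint block is that layout and shape are decoded from the same per-node latent, so object size and object geometry stay consistent with each other.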
Related papers
- Toward Scene Graph and Layout Guided Complex 3D Scene Generation [31.396230860775415]
We present GraLa3D, a novel framework for Scene Graph and Layout Guided 3D Scene Generation. Given a text prompt describing a complex 3D scene, GraLa3D utilizes an LLM to model the scene using a scene graph representation with layout bounding box information. GraLa3D uniquely constructs the scene graph with single-object nodes and composite super-nodes.
arXiv Detail & Related papers (2024-12-29T14:21:03Z)
- Planner3D: LLM-enhanced graph prior meets 3D indoor scene explicit regularization [31.52569918586902]
3D scene synthesis has diverse applications across a spectrum of industries such as robotics, films, and video games.
In this paper, we aim at generating realistic and reasonable 3D indoor scenes from scene graph.
Our method achieves better 3D scene synthesis, especially in terms of scene-level fidelity.
arXiv Detail & Related papers (2024-03-19T15:54:48Z)
- SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
- 3D Scene Diffusion Guidance using Scene Graphs [3.207455883863626]
We propose a novel approach for 3D scene diffusion guidance using scene graphs.
To leverage the relative spatial information the scene graphs provide, we make use of relational graph convolutional blocks within our denoising network.
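The summary above mentions relational graph convolutional blocks, which give each relation type (e.g. "left of", "on") its own transformation so that messages between objects depend on how they are related. As a rough Schlichtkrull-style illustration only (sizes and relation labels are invented here):

```python
import numpy as np

rng = np.random.default_rng(1)

def rgcn_layer(h, edges, num_relations, out_dim):
    # One relational graph convolution step: a shared self-loop weight
    # plus one weight matrix W_r per relation type; each node sums
    # messages from its in-neighbors, grouped by relation.
    n, d = h.shape
    W_self = rng.standard_normal((d, out_dim)) * 0.1
    W_rel = rng.standard_normal((num_relations, d, out_dim)) * 0.1
    out = h @ W_self  # self-loop term
    for src, rel, dst in edges:
        out[dst] += h[src] @ W_rel[rel]
    return np.maximum(out, 0.0)  # ReLU

# Toy scene graph: chair (0) --"left of" (rel 0)--> table (1) --"on" (rel 1)--> floor (2)
h = rng.standard_normal((3, 16))
edges = [(0, 0, 1), (1, 1, 2)]
h_next = rgcn_layer(h, edges, num_relations=2, out_dim=8)
print(h_next.shape)  # (3, 8)
```

Because W_r differs per relation, the same neighbor feature contributes differently depending on the spatial predicate connecting the two objects.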
arXiv Detail & Related papers (2023-08-08T06:16:37Z)
- CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph Diffusion [83.30168660888913]
We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes.
Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes.
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
arXiv Detail & Related papers (2023-05-25T17:39:13Z)
- SGAligner: 3D Scene Alignment with Scene Graphs [84.01002998166145]
Building 3D scene graphs has emerged as a topic in scene representation for several embodied AI applications.
We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial.
We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios.
arXiv Detail & Related papers (2023-04-28T14:39:22Z)
- Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs [85.54212143154986]
Controllable scene synthesis consists of generating 3D information that satisfies underlying specifications.
Scene graphs are representations of a scene composed of objects (nodes) and inter-object relationships (edges).
We propose the first work that directly generates shapes from a scene graph in an end-to-end manner.
arXiv Detail & Related papers (2021-08-19T17:59:07Z)
- SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences [76.28527350263012]
We propose a method to incrementally build up semantic scene graphs from a 3D environment given a sequence of RGB-D frames.
We aggregate PointNet features from primitive scene components by means of a graph neural network.
Our approach outperforms 3D scene graph prediction methods by a large margin and its accuracy is on par with other 3D semantic and panoptic segmentation methods while running at 35 Hz.
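The two steps the SceneGraphFusion summary names, per-segment PointNet features followed by graph neural network aggregation, can be sketched as follows. This is not the paper's implementation; the per-point map, pooling, and one mean message-passing step below are simplified stand-ins:

```python
import numpy as np

rng = np.random.default_rng(2)

def pointnet_feature(points, out_dim=16):
    # Simplified PointNet: a shared per-point linear map + ReLU,
    # followed by a symmetric max-pool over the segment's points.
    W = rng.standard_normal((points.shape[1], out_dim)) * 0.1
    return np.maximum(points @ W, 0.0).max(axis=0)

def gnn_aggregate(seg_feats, edges):
    # One mean message-passing step over neighboring segments.
    out = seg_feats.copy()
    for i in range(len(seg_feats)):
        nbrs = [b if a == i else a for a, b in edges if i in (a, b)]
        if nbrs:
            out[i] = 0.5 * seg_feats[i] + 0.5 * np.mean([seg_feats[j] for j in nbrs], axis=0)
    return out

# Two toy geometric segments of 100 xyz points each, joined by one edge.
segments = [rng.standard_normal((100, 3)) for _ in range(2)]
feats = np.stack([pointnet_feature(p) for p in segments])
fused = gnn_aggregate(feats, edges=[(0, 1)])
print(fused.shape)  # (2, 16)
```

The max-pool makes each segment feature invariant to point ordering, and the aggregation step lets each segment's descriptor incorporate context from adjacent segments before node and edge labels are predicted.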
arXiv Detail & Related papers (2021-03-27T13:00:36Z)
- Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions [94.17683799712397]
We focus on scene graphs, a data structure that organizes the entities of a scene in a graph.
We propose a learned method that regresses a scene graph from the point cloud of a scene.
We show the application of our method in a domain-agnostic retrieval task, where graphs serve as an intermediate representation for 3D-3D and 2D-3D matching.
arXiv Detail & Related papers (2020-04-08T12:25:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.