Functional 3D Scene Synthesis through Human-Scene Optimization
- URL: http://arxiv.org/abs/2502.06819v1
- Date: Wed, 05 Feb 2025 04:00:24 GMT
- Title: Functional 3D Scene Synthesis through Human-Scene Optimization
- Authors: Yao Wei, Matteo Toso, Pietro Morerio, Michael Ying Yang, Alessio Del Bue
- Abstract summary: Our approach is based on a simple but effective principle: we condition scene synthesis to generate rooms that are usable by humans.
If this human-centric scene generation is viable, the room layout is functional and leads to a more coherent 3D structure.
- Score: 30.910671968876024
- Abstract: This paper presents a novel generative approach that outputs 3D indoor environments solely from a textual description of the scene. Current methods often treat scene synthesis as a mere layout prediction task, leading to rooms with overlapping objects or overly structured scenes, with limited consideration of the practical usability of the generated environment. Instead, our approach is based on a simple but effective principle: we condition scene synthesis to generate rooms that are usable by humans. This principle is implemented by synthesizing 3D humans that interact with the objects composing the scene. If this human-centric scene generation is viable, the room layout is functional and leads to a more coherent 3D structure. To this end, we propose a novel method for functional 3D scene synthesis that consists of reasoning, 3D assembly, and optimization. We regard text-guided 3D synthesis as a reasoning process, generating a scene graph via a graph diffusion network. Considering the functional co-occurrence of objects, we design a new strategy to better accommodate human-object interaction and avoidance, achieving human-aware 3D scene optimization. We conduct both qualitative and quantitative experiments to validate the effectiveness of our method in generating coherent 3D scenes.
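To make the optimization idea concrete, here is a minimal sketch of human-aware layout refinement in the paper's spirit: objects and a human proxy are modeled as 2D discs, and any penetration between a pair is penalized, so layouts that block the space a person would need to use an object are pushed apart. Everything here (the disc abstraction, the fixed `human_offset` in front of each object, finite-difference gradients) is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def overlap(p, q, r_p, r_q):
    """Penetration depth between two discs (0 if they do not touch)."""
    return max(0.0, r_p + r_q - np.linalg.norm(p - q))

def loss(pos, radii, human_offset=0.6, human_r=0.3):
    """Penalize object-object overlap and blocked human 'use' positions."""
    # Hypothetical human proxy: a disc a person would occupy while using
    # each object, placed human_offset meters in front of it (+x here).
    humans = pos + np.array([human_offset, 0.0])
    total = 0.0
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i != j:
                total += overlap(pos[i], pos[j], radii[i], radii[j]) ** 2
                total += overlap(humans[i], pos[j], human_r, radii[j]) ** 2
    return total

def optimize(pos, radii, steps=300, lr=0.02, eps=1e-3):
    """Finite-difference gradient descent keeps the sketch dependency-free."""
    pos = pos.astype(float).copy()
    for _ in range(steps):
        grad = np.zeros_like(pos)
        for k in np.ndindex(*pos.shape):
            d = np.zeros_like(pos)
            d[k] = eps
            grad[k] = (loss(pos + d, radii) - loss(pos - d, radii)) / (2 * eps)
        pos -= lr * grad
    return pos

layout = np.array([[0.0, 0.0], [0.4, 0.1], [0.2, 0.5]])  # initial positions
radii = np.array([0.5, 0.4, 0.3])
print(optimize(layout, radii))  # objects spread out, keeping "use" space free
```

A real system would instead pose full 3D human bodies against object meshes and optimize object placements jointly with the synthesized humans.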
Related papers
- InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior [27.773451301040424]
InstructScene is a novel generative framework that integrates a semantic graph prior and a layout decoder.
We show that the proposed method surpasses existing state-of-the-art approaches by a large margin.
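As a rough illustration of decoding a layout from a semantic graph prior, the toy decoder below nudges object positions toward offsets implied by graph relations. The relation vocabulary, offsets, and update rule are invented for this sketch and merely stand in for InstructScene's learned layout decoder.

```python
import numpy as np

# Invented relation vocabulary: displacement of the subject relative to the
# object it is related to ("nightstand right_of bed" puts the nightstand at
# the bed's position plus the "right_of" offset).
RELATION_OFFSETS = {
    "left_of": np.array([-1.0, 0.0]),
    "right_of": np.array([1.0, 0.0]),
    "in_front_of": np.array([0.0, -1.0]),
    "behind": np.array([0.0, 1.0]),
}

def decode_layout(nodes, edges, iters=50, step=0.5):
    """Nudge positions toward the offsets implied by the graph's relations."""
    rng = np.random.default_rng(0)
    pos = {n: rng.normal(scale=0.1, size=2) for n in nodes}
    for _ in range(iters):
        for subj, rel, obj in edges:
            target = pos[obj] + RELATION_OFFSETS[rel]
            pos[subj] += step * (target - pos[subj])
    return pos

nodes = ["bed", "nightstand", "lamp"]
edges = [("nightstand", "right_of", "bed"), ("lamp", "behind", "nightstand")]
for name, p in decode_layout(nodes, edges).items():
    print(f"{name}: {p.round(2)}")
```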
arXiv Detail & Related papers (2024-02-07T10:09:00Z)
- SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
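The "optimize a 3D representation with a large text-to-image model" recipe typically means score distillation. Below is a toy, self-contained rendition of that loop in which a learnable image stands in for a rendered 3D scene and a frozen random convolution stands in for a pretrained text-conditioned denoiser; both stand-ins, and the simplified noising, are assumptions for illustration only.

```python
import torch

# Toy stand-ins: the "scene" is a learnable 8x8 image and the frozen
# "denoiser" is a random convolution. Real pipelines use a differentiable
# 3D renderer and a pretrained text-conditioned diffusion model.
scene = torch.randn(1, 3, 8, 8, requires_grad=True)
denoiser = torch.nn.Conv2d(3, 3, 3, padding=1).requires_grad_(False)
opt = torch.optim.Adam([scene], lr=1e-2)

for _ in range(100):
    image = scene                          # stand-in for rendering the scene
    noise = torch.randn_like(image)
    t = torch.rand(())                     # timestep in [0, 1], simplified
    noisy = (1 - t) * image + t * noise    # simplified forward noising
    with torch.no_grad():
        pred_noise = denoiser(noisy)       # frozen 2D prior "scores" the image
    opt.zero_grad()
    # SDS-style update: inject (pred_noise - noise) as the gradient of the
    # rendered image, skipping backprop through the denoiser.
    image.backward(gradient=pred_noise - noise)
    opt.step()
print(scene.abs().mean())
```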
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
- GenZI: Zero-Shot 3D Human-Scene Interaction Generation [39.9039943099911]
We propose GenZI, the first zero-shot approach to generating 3D human-scene interactions.
Key to GenZI is our distillation of interaction priors from large vision-language models (VLMs), which have learned a rich semantic space of 2D human-scene compositions.
In contrast to existing learning-based approaches, GenZI circumvents the conventional need for captured 3D interaction data.
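A core step in lifting 2D VLM-synthesized human compositions into 3D is aggregating per-view 2D hypotheses into consistent 3D geometry. The sketch below shows the classic DLT triangulation such a lift can build on; the cameras and the point (a stand-in for a VLM-inpainted keypoint) are toy assumptions, not GenZI's actual formulation.

```python
import numpy as np

def triangulate(P_list, x_list):
    """DLT: recover one 3D point from projection matrices and 2D observations."""
    A = []
    for P, (u, v) in zip(P_list, x_list):
        A.append(u * P[2] - P[0])
        A.append(v * P[2] - P[1])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    X = Vt[-1]                 # null-space direction = homogeneous 3D point
    return X[:3] / X[3]

def project(P, X):
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two toy cameras 5 units from the origin, viewing along different axes.
P1 = np.hstack([np.eye(3), [[0], [0], [5]]])
R2 = np.array([[0, 0, 1], [0, 1, 0], [-1, 0, 0]], dtype=float)
P2 = np.hstack([R2, [[0], [0], [5]]])

X_true = np.array([0.3, -0.2, 0.5])        # stand-in for a "VLM keypoint"
x1, x2 = project(P1, X_true), project(P2, X_true)
print(triangulate([P1, P2], [x1, x2]))     # ~ [0.3, -0.2, 0.5]
```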
arXiv Detail & Related papers (2023-11-29T15:40:11Z)
- Physically Plausible 3D Human-Scene Reconstruction from Monocular RGB Image using an Adversarial Learning Approach [26.827712050966]
A key challenge in holistic 3D human-scene reconstruction is to generate a physically plausible 3D scene from a single monocular RGB image.
This paper proposes using an implicit feature representation of the scene elements to distinguish a physically plausible alignment of humans and objects.
Unlike existing inference-time optimization-based approaches, we use this adversarially trained model to produce a per-frame 3D reconstruction of the scene.
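In this spirit, the following toy shows how a small discriminator can be trained to separate plausible from implausible human-object placements and then used as a plausibility score at inference time. The 2D features, data generation, and architecture are invented for illustration; the paper operates on implicit features of full 3D scene elements.

```python
import torch
import torch.nn as nn

# A small MLP scores how plausible a human placement is relative to an
# object; features, data, and architecture are invented for this sketch.
disc = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for _ in range(500):
    # "Plausible": the human stands ~0.5 m in front of the object (feature =
    # human position in the object's frame); "implausible": anywhere at
    # random, often intersecting the object at the origin.
    plausible = torch.tensor([0.5, 0.0]) + 0.05 * torch.randn(64, 2)
    implausible = 2.0 * torch.rand(64, 2) - 1.0
    x = torch.cat([plausible, implausible])
    y = torch.cat([torch.ones(64, 1), torch.zeros(64, 1)])
    opt.zero_grad()
    bce(disc(x), y).backward()
    opt.step()

# Score a plausible vs. an object-intersecting placement.
print(torch.sigmoid(disc(torch.tensor([[0.5, 0.0], [0.0, 0.0]]))))
```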
arXiv Detail & Related papers (2023-07-27T01:07:15Z)
- RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture [80.0643976406225]
We propose "RoomDreamer", which leverages powerful natural language to synthesize a new room with a different style.
Our work addresses the challenge of synthesizing both geometry and texture aligned to the input scene structure and prompt simultaneously.
To validate the proposed method, real indoor scenes scanned with smartphones are used for extensive experiments.
arXiv Detail & Related papers (2023-05-18T22:57:57Z)
- Human-Aware Object Placement for Visual Environment Reconstruction [63.14733166375534]
We show that human-scene interactions can be leveraged to improve the 3D reconstruction of a scene from a monocular RGB video.
Our key idea is that, as a person moves through a scene and interacts with it, we accumulate human-scene interactions (HSIs) across multiple input images.
We show that our scene reconstruction can be used to refine the initial 3D human pose and shape estimation.
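A minimal numeric sketch of the idea: if hip-contact points accumulated over sitting frames must lie on a chair seat, the chair pose (and a floor estimate) can be fit to those contacts. All values and the single-parameter chair model below are toy assumptions, not the paper's formulation.

```python
import numpy as np

# Accumulated 3D hip-contact points from video frames where the person sits
# (fabricated numbers; axis convention: y is up).
contacts = np.array([
    [1.02, 0.44, 2.01],
    [0.98, 0.46, 1.97],
    [1.01, 0.45, 2.03],
])

SEAT_HEIGHT = 0.45   # assumed fixed seat height of the chair model

seat_center_xz = contacts[:, [0, 2]].mean(axis=0)    # least-squares centroid
floor_y = contacts[:, 1].mean() - SEAT_HEIGHT        # implied floor height

print("place chair seat at x/z:", seat_center_xz.round(2))
print("implied floor height:", floor_y.round(3))
```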
arXiv Detail & Related papers (2022-03-07T18:59:02Z)
- Realistic Image Synthesis with Configurable 3D Scene Layouts [59.872657806747576]
We propose a novel approach to realistic-looking image synthesis based on a 3D scene layout.
Our approach takes a 3D scene with semantic class labels as input and trains a 3D scene painting network.
With the trained painting network, realistic-looking images for the input 3D scene can be rendered and manipulated.
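Conceptually, the painting network maps rendered semantic labels to RGB, so repainting stays consistent across viewpoints because the labels come from one underlying 3D scene. The tiny convolutional stand-in below illustrates only this interface; the architecture, label set, and random inputs are assumptions.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 5   # assumed size of the semantic label set

# Tiny "painting" network: semantic one-hot maps in, RGB out.
paint = nn.Sequential(
    nn.Conv2d(NUM_CLASSES, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),
)

# A rendered label image from the 3D scene (random here); any viewpoint of
# the same scene yields consistent labels, hence consistent repainting.
labels = torch.randint(0, NUM_CLASSES, (1, 32, 32))
onehot = nn.functional.one_hot(labels, NUM_CLASSES).permute(0, 3, 1, 2).float()
rgb = paint(onehot)
print(rgb.shape)   # torch.Size([1, 3, 32, 32])
```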
arXiv Detail & Related papers (2021-08-23T09:44:56Z)
- PLACE: Proximity Learning of Articulation and Contact in 3D Environments [70.50782687884839]
We propose a novel interaction generation method, named PLACE, which explicitly models the proximity between the human body and the 3D scene around it.
Our perceptual study shows that PLACE significantly improves over the state-of-the-art method, approaching the realism of real human-scene interactions.
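The proximity representation at the heart of this line of work can be illustrated by encoding a body as per-point distances to the nearest scene point. PLACE learns a generative model on top of richer features of this kind, so the snippet below covers only a simplified feature computation, on toy point clouds.

```python
import numpy as np

def proximity_features(body_pts, scene_pts):
    """For each body point, the distance to its nearest scene point."""
    d = np.linalg.norm(body_pts[:, None, :] - scene_pts[None, :, :], axis=-1)
    return d.min(axis=1)

rng = np.random.default_rng(0)
scene = rng.uniform(0, 4, size=(200, 3))   # toy scene point cloud
body = rng.uniform(0, 4, size=(50, 3))     # toy body point cloud

feats = proximity_features(body, scene)
print(feats.shape, feats.min().round(3))   # near-zero minima suggest contact
```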
arXiv Detail & Related papers (2020-08-12T21:00:10Z)
- Long-term Human Motion Prediction with Scene Context [60.096118270451974]
We propose a novel three-stage framework for predicting human motion.
Our method first samples multiple human motion goals, then plans 3D human paths towards each goal, and finally predicts 3D human pose sequences following each path.
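The three-stage structure is easy to see as a pipeline skeleton: sample goals, plan a path to each, then roll out poses along the path. Each stage below is a deliberately trivial stand-in (uniform goal sampling, straight-line paths, position-plus-heading "poses") for the learned networks the paper uses.

```python
import numpy as np

def sample_goals(scene_extent, k=3, seed=0):
    """Stage 1: sample candidate motion goals inside the scene."""
    return np.random.default_rng(seed).uniform(0, scene_extent, size=(k, 2))

def plan_path(start, goal, steps=10):
    """Stage 2: plan a path to the goal (straight line stands in for a planner)."""
    t = np.linspace(0, 1, steps)[:, None]
    return (1 - t) * start + t * goal

def poses_along(path):
    """Stage 3: roll out 'poses' (here just position + heading) along the path."""
    deltas = np.diff(path, axis=0)
    headings = np.arctan2(deltas[:, 1], deltas[:, 0])
    return np.hstack([path[:-1], headings[:, None]])

start = np.array([0.0, 0.0])
for goal in sample_goals(scene_extent=5.0):
    print(poses_along(plan_path(start, goal))[:2].round(2))
```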
arXiv Detail & Related papers (2020-07-07T17:59:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.