Scene Synthesis from Human Motion
- URL: http://arxiv.org/abs/2301.01424v1
- Date: Wed, 4 Jan 2023 03:30:46 GMT
- Title: Scene Synthesis from Human Motion
- Authors: Sifan Ye, Yixing Wang, Jiaman Li, Dennis Park, C. Karen Liu, Huazhe Xu, Jiajun Wu
- Abstract summary: We propose to synthesize diverse, semantically reasonable, and physically plausible scenes based on human motion.
Our framework, Scene Synthesis from HUMan MotiON (SUMMON), includes two steps.
It first uses ContactFormer, our newly introduced contact predictor, to obtain temporally consistent contact labels from human motion.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large-scale capture of human motion with diverse, complex scenes, while
immensely useful, is often considered prohibitively costly. Meanwhile, human
motion alone contains rich information about the scenes humans reside in and
interact with. For example, a sitting human suggests the existence of a chair,
and their leg position further implies the chair's pose. In this paper, we
propose to synthesize diverse, semantically reasonable, and physically
plausible scenes based on human motion. Our framework, Scene Synthesis from
HUMan MotiON (SUMMON), includes two steps. It first uses ContactFormer, our
newly introduced contact predictor, to obtain temporally consistent contact
labels from human motion. Based on these predictions, SUMMON then chooses
interacting objects and optimizes physical plausibility losses; it further
populates the scene with objects that do not interact with humans. Experimental
results demonstrate that SUMMON synthesizes feasible, plausible, and diverse
scenes and has the potential to generate extensive human-scene interaction data
for the community.
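To make the two-step structure concrete, here is a minimal, illustrative sketch of such a pipeline. It is not the authors' implementation: ContactFormer is a learned model in the paper, so a random stub stands in for it, and every function name, tensor shape, loss form, and the grid-search placement below are assumptions.

```python
# Illustrative sketch of the two-step pipeline described in the abstract.
# ContactFormer is a learned model in the paper; a stub stands in for it here.
# All names, shapes, and the loss form are assumptions, not the authors' code.
import numpy as np

def predict_contact_labels(motion, num_classes=8):
    """Stub for ContactFormer: per-vertex, per-frame contact class labels.

    motion: (T, V, 3) array of body vertex positions over T frames.
    Returns (T, V) integer labels (0 = no contact, k > 0 = object category k).
    """
    rng = np.random.default_rng(0)
    return rng.integers(0, num_classes, size=motion.shape[:2])

def temporal_vote(labels):
    """Enforce temporal consistency by majority vote over each vertex's track."""
    T, V = labels.shape
    out = np.empty(V, dtype=int)
    for v in range(V):
        out[v] = np.bincount(labels[:, v]).argmax()
    return out

def contact_loss(obj_vertices, contact_points):
    """Toy physical-plausibility term: contacted body points should touch
    the object surface (here, just be near its nearest vertex)."""
    d = np.linalg.norm(contact_points[:, None] - obj_vertices[None], axis=-1)
    return d.min(axis=1).mean()

# Step 1: contact prediction from motion alone.
motion = np.random.rand(120, 655, 3)           # T frames, V body vertices
per_frame = predict_contact_labels(motion)
per_vertex = temporal_vote(per_frame)          # temporally consistent labels

# Step 2: pick an object for the dominant contact category and search for a
# placement that lowers the plausibility loss (the paper optimizes losses;
# a coarse grid search stands in for that optimization here).
chair_class = np.bincount(per_vertex[per_vertex > 0]).argmax()
contacts = motion[-1][per_vertex == chair_class]
candidate = np.random.rand(500, 3)             # stand-in object point cloud
best = min((contact_loss(candidate + np.array([x, y, 0.0]), contacts), (x, y))
           for x in np.linspace(-1, 1, 11) for y in np.linspace(-1, 1, 11))
print(f"best placement offset {best[1]} with contact loss {best[0]:.3f}")
```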
Related papers
- SynPlay: Importing Real-world Diversity for a Synthetic Human Dataset [19.32308498024933]
We introduce Synthetic Playground (SynPlay), a new synthetic human dataset that aims to capture the diversity of human appearance found in the real world.
We focus on two factors to achieve a level of diversity not seen in previous works: realistic human motions and poses.
We show that using SynPlay in model training leads to enhanced accuracy over existing synthetic datasets for human detection and segmentation.
arXiv Detail & Related papers (2024-08-21T17:58:49Z)
- Revisit Human-Scene Interaction via Space Occupancy [55.67657438543008]
Human-Scene Interaction (HSI) generation is a challenging problem and crucial for various downstream tasks.
In this work, we argue that interaction with a scene is essentially interacting with the space occupancy of the scene from an abstract physical perspective.
By treating pure motion sequences as records of humans interacting with invisible scene occupancy, we can aggregate motion-only data into a large-scale paired human-occupancy interaction database.
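As a toy illustration of this idea (not the paper's method), the sketch below pairs a motion-only sequence with a crude occupancy proxy: voxels swept by the body are marked free, while voxels just below long-stationary joints are marked occupied. The voxel size, velocity threshold, and support heuristic are all assumptions.

```python
# Toy illustration (not the paper's method) of pairing a motion-only sequence
# with a scene-occupancy proxy: voxels the body sweeps through are marked
# free, and voxels just below nearly-static joints (e.g. a pelvis during
# sitting) are marked occupied, yielding (motion, occupancy) training pairs.
import numpy as np

VOX = 0.1  # voxel edge length in meters (assumed)

def to_voxel(p):
    return tuple(np.floor(p / VOX).astype(int))

def motion_to_occupancy(joints, vel_eps=1e-3):
    """joints: (T, J, 3) joint positions. Returns (free, occupied) voxel sets."""
    free, occupied = set(), set()
    speeds = np.linalg.norm(np.diff(joints, axis=0), axis=-1)  # (T-1, J)
    for t in range(joints.shape[0] - 1):
        for j in range(joints.shape[1]):
            free.add(to_voxel(joints[t, j]))
            if speeds[t, j] < vel_eps:                # joint nearly static:
                support = joints[t, j] - [0, 0, VOX]  # assume support below it
                occupied.add(to_voxel(support))
    return free, occupied - free  # occupancy cannot overlap swept space

T, J = 60, 22
motion = np.cumsum(np.random.randn(T, J, 3) * 0.01, axis=0)  # smooth noise
free, occ = motion_to_occupancy(motion)
print(f"{len(free)} free voxels, {len(occ)} inferred-occupied voxels")
```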
arXiv Detail & Related papers (2023-12-05T12:03:00Z)
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions [66.87211993793807]
We present ReMoS, a denoising-diffusion-based model that synthesizes the full-body motion of a person in a two-person interaction scenario.
We demonstrate ReMoS across challenging two-person scenarios such as pair dancing, Ninjutsu, kickboxing, and acrobatics.
We also contribute the ReMoCap dataset for two-person interactions, containing full-body and finger motions.
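For readers unfamiliar with the mechanism, the sketch below shows a generic DDPM-style reverse sampling loop conditioned on the acting person's motion; it is not ReMoS's architecture, noise schedule, or conditioning scheme, and the denoiser is an untrained stub.

```python
# Generic DDPM-style reverse sampling loop, conditioned on an actor's motion,
# to illustrate "reaction synthesis by denoising diffusion". The denoiser is
# a stub; ReMoS's actual network, schedule, and conditioning are not shown.
import numpy as np

T_STEPS = 50
betas = np.linspace(1e-4, 0.02, T_STEPS)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def denoiser(x_t, t, actor_motion):
    """Stub for the learned noise predictor eps_theta(x_t, t | actor)."""
    return 0.1 * x_t + 0.01 * actor_motion  # placeholder, not a trained net

def sample_reaction(actor_motion, rng):
    """Sample a reactive motion with the same shape as the actor's motion."""
    x = rng.standard_normal(actor_motion.shape)  # start from pure noise
    for t in reversed(range(T_STEPS)):
        eps = denoiser(x, t, actor_motion)
        # standard DDPM posterior mean for step t
        x = (x - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            x += np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
actor = rng.standard_normal((120, 24, 3))   # (frames, joints, xyz), assumed
reaction = sample_reaction(actor, rng)
print(reaction.shape)  # (120, 24, 3): the partner's synthesized motion
```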
arXiv Detail & Related papers (2023-11-28T18:59:52Z)
- IMoS: Intent-Driven Full-Body Motion Synthesis for Human-Object Interactions [69.95820880360345]
We present the first framework to synthesize the full-body motion of virtual human characters with 3D objects placed within their reach.
Our system takes as input textual instructions specifying the objects and the associated intentions of the virtual characters.
We show that participants judge our synthesized full-body motions as more realistic in more than 80% of scenarios.
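A hedged sketch of what such an input interface could look like follows; the parsing rule, the Instruction fields, and the generator stub are illustrative assumptions, not IMoS's actual system.

```python
# Illustrative input interface only: how an instruction naming an object and
# an intent might be parsed and routed to a motion synthesizer. The parsing
# scheme and generator are placeholders, not IMoS's actual system.
from dataclasses import dataclass

@dataclass
class Instruction:
    intent: str   # e.g. "drink", "offer", "inspect"  (assumed vocabulary)
    obj: str      # e.g. "cup", "binoculars"

def parse(text: str) -> Instruction:
    verb, _, noun = text.partition(" the ")   # naive split, an assumption
    return Instruction(intent=verb.strip(), obj=noun.strip())

def synthesize_motion(inst: Instruction, horizon: int = 120):
    """Stub: a real system would generate (horizon, joints, 3) full-body poses."""
    return f"<{horizon}-frame motion: character {inst.intent}s the {inst.obj}>"

print(synthesize_motion(parse("drink the cup")))
```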
arXiv Detail & Related papers (2022-12-14T23:59:24Z)
- HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes [54.61610144668777]
We present a novel scene-and-language conditioned generative model that can produce 3D human motions in 3D scenes.
Our experiments demonstrate that our model generates diverse and semantically consistent human motions in 3D scenes.
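The sketch below illustrates only the conditioning pattern: embed scene and language separately, concatenate the embeddings, and decode a motion. Encoders and decoder are random stubs, and none of it reflects HUMANISE's actual architecture or training.

```python
# Minimal interface sketch for scene-and-language conditioning: embed both
# modalities, concatenate, and decode a motion. Encoders/decoder are random
# stubs; HUMANISE's architecture and training are not shown here.
import numpy as np

rng = np.random.default_rng(0)
W_SCENE = rng.standard_normal((3, 32)) * 0.1   # toy scene encoder weights
W_TEXT = rng.standard_normal((64, 32)) * 0.1   # toy text encoder weights

def embed_scene(points):               # (N, 3) point cloud -> (32,)
    return np.tanh(points @ W_SCENE).mean(axis=0)

def embed_text(tokens):                # toy bag-of-hashed-tokens encoding
    bag = np.zeros(64)
    for tok in tokens.split():
        bag[hash(tok) % 64] += 1.0
    return np.tanh(bag @ W_TEXT)       # -> (32,)

def generate_motion(scene, text, frames=60, joints=22):
    cond = np.concatenate([embed_scene(scene), embed_text(text)])  # (64,)
    z = rng.standard_normal(16)                 # latent sample -> diversity
    seed = np.concatenate([cond, z]).sum()      # stand-in for a decoder
    return np.full((frames, joints, 3), seed)

motion = generate_motion(np.random.rand(1000, 3), "sit on the chair near the desk")
print(motion.shape)   # (60, 22, 3)
```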
arXiv Detail & Related papers (2022-10-18T10:14:11Z)
- Contact-aware Human Motion Forecasting [87.04827994793823]
We tackle the task of scene-aware 3D human motion forecasting, which consists of predicting future human poses given a 3D scene and a past human motion.
Our approach outperforms the state-of-the-art human motion forecasting and human synthesis methods on both synthetic and real datasets.
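One concrete form such scene conditioning can take is a per-joint distance-to-scene feature, sketched below with a constant-pose stub forecaster; the shapes, the 5 cm contact threshold, and the feature itself are assumptions rather than the paper's exact representation.

```python
# Sketch of the kind of per-joint scene feature a contact-aware forecaster
# can condition on: each joint's distance to the nearest scene point. The
# exact representation used by the paper may differ; shapes are assumptions.
import numpy as np

def distance_to_scene(joints, scene_points):
    """joints: (T, J, 3); scene_points: (N, 3).
    Returns (T, J) distances from each joint to its nearest scene point."""
    diff = joints[:, :, None, :] - scene_points[None, None, :, :]
    return np.linalg.norm(diff, axis=-1).min(axis=-1)

def forecast(past_motion, scene_points, horizon=25):
    """Stub forecaster: repeats the last pose; a learned model would consume
    [past_motion, distance features] and predict future poses instead."""
    feats = distance_to_scene(past_motion, scene_points)     # (T, J)
    contact = feats < 0.05        # joints within 5 cm count as "in contact"
    future = np.repeat(past_motion[-1:], horizon, axis=0)    # constant-pose stub
    return future, contact

past = np.random.rand(50, 22, 3)        # 50 past frames, 22 joints
scene = np.random.rand(2000, 3)         # scene as a point cloud
future, contact = forecast(past, scene)
print(future.shape, contact.mean())     # (25, 22, 3), fraction near-contact
```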
arXiv Detail & Related papers (2022-10-08T07:53:19Z)
- Towards Diverse and Natural Scene-aware 3D Human Motion Synthesis [117.15586710830489]
We focus on the problem of synthesizing diverse scene-aware human motions under the guidance of target action sequences.
We factorize the problem into separate aspects and, based on this scheme, propose a hierarchical framework, with each sub-module responsible for modeling one aspect.
Experiment results show that the proposed framework remarkably outperforms previous methods in terms of diversity and naturalness.
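A generic sketch of such a factorized, hierarchical pipeline follows; the three stages (anchor placement, path planning, motion infilling) are assumed for illustration and do not reproduce the paper's actual decomposition.

```python
# Generic sketch of a hierarchical, factorized structure: each sub-module
# handles one aspect and feeds the next. The actual decomposition used by
# the paper is not reproduced here; all stages are stubs.
import numpy as np

def place_anchor(scene_points, action):
    """Stage 1 (assumed): pick a location in the scene for the action."""
    return scene_points[np.random.default_rng(0).integers(len(scene_points))]

def plan_path(start, anchor, steps=30):
    """Stage 2 (assumed): a straight-line root trajectory to the anchor."""
    return np.linspace(start, anchor, steps)

def fill_motion(path, joints=22):
    """Stage 3 (assumed): expand the root path into full-body poses (stub)."""
    return path[:, None, :] + np.random.randn(len(path), joints, 3) * 0.01

scene = np.random.rand(1000, 3)
for action in ["walk", "sit"]:                # target action sequence
    anchor = place_anchor(scene, action)
    path = plan_path(np.zeros(3), anchor)
    motion = fill_motion(path)
    print(action, motion.shape)               # (30, 22, 3) per action
```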
arXiv Detail & Related papers (2022-05-25T18:20:01Z)
- Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes [27.443701512923177]
We propose to bridge human motion synthesis and scene affordance reasoning.
We present a hierarchical generative framework to synthesize long-term 3D human motion conditioned on the 3D scene structure.
Our experiments show significant improvements over previous approaches on generating natural and physically plausible human motion in a scene.
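One way a hierarchical framework can reach long horizons, sketched below under assumptions, is to sample sparse subgoal positions in the scene and stitch short motion segments between them; the subgoal sampler and the infilling stub are placeholders, not the paper's modules.

```python
# Sketch of reaching long horizons hierarchically: sample sparse subgoal
# positions in the scene, then synthesize short motion segments between
# consecutive subgoals. Both stages are stubs, not the paper's modules.
import numpy as np

rng = np.random.default_rng(0)

def sample_subgoals(scene_points, k=4):
    """Assumed stage: k candidate subgoal root positions from the scene."""
    return scene_points[rng.choice(len(scene_points), size=k, replace=False)]

def short_term_motion(a, b, frames=40, joints=22):
    """Assumed stage: interpolate the root between subgoals, add a body stub."""
    root = np.linspace(a, b, frames)
    return root[:, None, :] + rng.standard_normal((frames, joints, 3)) * 0.02

scene = np.random.rand(5000, 3)
goals = sample_subgoals(scene)
segments = [short_term_motion(goals[i], goals[i + 1])
            for i in range(len(goals) - 1)]
long_motion = np.concatenate(segments)        # long-term sequence by stitching
print(long_motion.shape)                      # (120, 22, 3)
```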
arXiv Detail & Related papers (2020-12-10T09:09:38Z)