HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes
- URL: http://arxiv.org/abs/2210.09729v1
- Date: Tue, 18 Oct 2022 10:14:11 GMT
- Title: HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes
- Authors: Zan Wang, Yixin Chen, Tengyu Liu, Yixin Zhu, Wei Liang, Siyuan Huang
- Abstract summary: We present a novel scene-and-language conditioned generative model that can produce 3D human motions in 3D scenes.
Our experiments demonstrate that our model generates diverse and semantically consistent human motions in 3D scenes.
- Score: 54.61610144668777
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning to generate diverse scene-aware and goal-oriented human motions in
3D scenes remains challenging due to the shortcomings of existing Human-Scene
Interaction (HSI) datasets, which are limited in scale and quality and lack
semantics. To fill this gap, we propose a large-scale
and semantic-rich synthetic HSI dataset, denoted as HUMANISE, by aligning the
captured human motion sequences with various 3D indoor scenes. We automatically
annotate the aligned motions with language descriptions that depict the action
and the unique interacting objects in the scene; e.g., sit on the armchair near
the desk. HUMANISE thus enables a new generation task, language-conditioned
human motion generation in 3D scenes. The proposed task is challenging as it
requires joint modeling of the 3D scene, human motion, and natural language. To
tackle this task, we present a novel scene-and-language conditioned generative
model that can produce 3D human motions performing the desired action while
interacting with the specified objects. Our experiments demonstrate that our model
generates diverse and semantically consistent human motions in 3D scenes.
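The abstract describes automatic language annotation that pairs an action with the unique interacting object, e.g., "sit on the armchair near the desk." As a rough illustration of what such template-based annotation could look like, here is a minimal Python sketch; the names `SceneObject` and `describe_interaction` are hypothetical and not taken from the HUMANISE codebase.

```python
# Illustrative sketch only; it does not reproduce HUMANISE's annotation pipeline.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class SceneObject:
    category: str                        # e.g. "armchair"
    center: Tuple[float, float, float]   # object center in scene coordinates

def describe_interaction(action: str, target: SceneObject,
                         anchor: Optional[SceneObject] = None) -> str:
    """Compose an action + target-object (+ spatial anchor) description."""
    text = f"{action} the {target.category}"
    if anchor is not None:
        text += f" near the {anchor.category}"   # disambiguate the target object
    return text

# Example matching the abstract: "sit on the armchair near the desk"
armchair = SceneObject("armchair", (1.2, 0.4, 0.0))
desk = SceneObject("desk", (1.6, 0.9, 0.0))
print(describe_interaction("sit on", armchair, anchor=desk))
```

In the actual dataset the anchor object and spatial relation would be derived from the scene geometry; the sketch only shows the templating step.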
Related papers
- Generating Human Motion in 3D Scenes from Text Descriptions [60.04976442328767]
This paper focuses on the task of generating human motions in 3D indoor scenes given text descriptions of the human-scene interactions.
We propose a new approach that decomposes the complex problem into two more manageable sub-problems.
For language grounding of the target object, we leverage the power of large language models; for motion generation, we design an object-centric scene representation.
arXiv Detail & Related papers (2024-05-13T14:30:12Z)
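The entry above describes a two-stage decomposition: a large language model grounds the target object mentioned in the text, and a separate generator produces motion from an object-centric scene representation. Below is a minimal sketch of such a two-stage interface; `ground_target_object` and `generate_motion` are hypothetical stand-ins, and the keyword match is only a placeholder for the LLM-based grounding.

```python
# Illustrative two-stage interface; not the paper's implementation.
from typing import List, Sequence

def ground_target_object(description: str, object_labels: List[str]) -> int:
    """Stage 1: pick the object the description refers to.
    The paper delegates this to a large language model; a naive keyword
    match stands in for it here."""
    text = description.lower()
    for i, label in enumerate(object_labels):
        if label in text:
            return i
    return 0  # fall back to the first object

def generate_motion(description: str, scene_points: Sequence, target_idx: int):
    """Stage 2: generate motion from an object-centric scene representation
    (scene geometry expressed relative to the grounded object)."""
    raise NotImplementedError("placeholder for the learned motion generator")

labels = ["sofa", "armchair", "desk"]
idx = ground_target_object("sit on the armchair", labels)
print(labels[idx])  # -> "armchair"
```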
- LaserHuman: Language-guided Scene-aware Human Motion Generation in Free Environment [27.38638713080283]
We introduce LaserHuman, a dataset designed to advance Scene-Text-to-Motion research.
LaserHuman is distinguished by its inclusion of real human motions captured within 3D environments.
We propose a multi-conditional diffusion model, which is simple but effective, achieving state-of-the-art performance on existing datasets.
arXiv Detail & Related papers (2024-03-20T05:11:10Z)
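LaserHuman's summary mentions a multi-conditional diffusion model. The sketch below shows a generic DDPM-style sampling loop conditioned on scene and text features with a placeholder noise predictor; it is an illustration of conditional diffusion sampling under assumed shapes and schedules, not the LaserHuman architecture.

```python
# Generic conditional diffusion sampling sketch; NOT the LaserHuman model.
import numpy as np

T = 100
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_model(x_t, t, scene_feat, text_feat):
    """Placeholder noise predictor conditioned on scene and text features."""
    return np.zeros_like(x_t)  # a trained network would predict the noise here

def sample_motion(scene_feat, text_feat, motion_shape=(60, 72), seed=0):
    """motion_shape is an assumption: 60 frames of a 72-dim pose vector."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(motion_shape)           # start from Gaussian noise
    for t in reversed(range(T)):
        eps = eps_model(x, t, scene_feat, text_feat)
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        x = (x - coef * eps) / np.sqrt(alphas[t])   # DDPM posterior mean
        if t > 0:
            x += np.sqrt(betas[t]) * rng.standard_normal(motion_shape)
    return x

motion = sample_motion(scene_feat=None, text_feat=None)
print(motion.shape)  # (60, 72)
```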
- ParaHome: Parameterizing Everyday Home Activities Towards 3D Generative Modeling of Human-Object Interactions [11.32229757116179]
We introduce the ParaHome system, designed to capture dynamic 3D movements of humans and objects within a common home environment.
By leveraging the ParaHome system, we collect a novel large-scale dataset of human-object interaction.
arXiv Detail & Related papers (2024-01-18T18:59:58Z)
- Revisit Human-Scene Interaction via Space Occupancy [55.67657438543008]
Human-scene Interaction (HSI) generation is a challenging task and crucial for various downstream tasks.
In this work, we argue that interaction with a scene is essentially interacting with the space occupancy of the scene from an abstract physical perspective.
By treating pure motion sequences as records of humans interacting with invisible scene occupancy, we can aggregate motion-only data into a large-scale paired human-occupancy interaction database.
arXiv Detail & Related papers (2023-12-05T12:03:00Z)
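The occupancy view in the entry above can be made concrete: space the body passes through during a motion must be free, so motion-only data yields a coarse free-space grid that can be paired with the motion. The voxelization below is illustrative rather than the paper's actual representation; the grid size and resolution are arbitrary assumptions.

```python
# Sketch: turn a motion sequence into a coarse free-space occupancy grid.
import numpy as np

def free_space_grid(joints, voxel_size=0.1, grid_dim=64):
    """joints: (frames, num_joints, 3) joint positions in meters.
    Returns a boolean grid marking voxels the body passed through (free)."""
    grid = np.zeros((grid_dim, grid_dim, grid_dim), dtype=bool)
    pts = joints.reshape(-1, 3)
    origin = pts.min(axis=0)                                   # grid corner
    idx = np.floor((pts - origin) / voxel_size).astype(int)
    idx = np.clip(idx, 0, grid_dim - 1)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid

# Toy example: 30 frames of 22 joints scattered along a short path
rng = np.random.default_rng(0)
joints = rng.uniform(0.0, 2.0, size=(30, 22, 3))
print(free_space_grid(joints).sum(), "voxels marked as free space")
```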
- GenZI: Zero-Shot 3D Human-Scene Interaction Generation [39.9039943099911]
We propose GenZI, the first zero-shot approach to generating 3D human-scene interactions.
Key to GenZI is our distillation of interaction priors from large vision-language models (VLMs), which have learned a rich semantic space of 2D human-scene compositions.
In contrast to existing learning-based approaches, GenZI circumvents the conventional need for captured 3D interaction data.
arXiv Detail & Related papers (2023-11-29T15:40:11Z)
- Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations [61.659439423703155]
TOHO: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations.
Our method generates continuous motions that are parameterized only by the temporal coordinate.
This work takes a step further toward general human-scene interaction simulation.
arXiv Detail & Related papers (2023-03-23T09:31:56Z)
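TOHO's summary states that motions are continuous and parameterized only by the temporal coordinate, i.e., an implicit (coordinate-based) representation of motion. The toy network below maps a normalized time value to a pose vector so the motion can be queried at any frame rate; it is a hand-rolled illustration with random weights and an assumed 72-dim pose, not TOHO's model.

```python
# Toy implicit motion representation: pose = f(t), queryable at any time.
import numpy as np

class TimeToPoseMLP:
    def __init__(self, pose_dim=72, hidden=128, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.standard_normal((1, hidden)) * 0.1
        self.b1 = np.zeros(hidden)
        self.w2 = rng.standard_normal((hidden, pose_dim)) * 0.1
        self.b2 = np.zeros(pose_dim)

    def __call__(self, t):
        """t: scalar or (N,) times in [0, 1] -> (N, pose_dim) pose vectors."""
        t = np.atleast_1d(np.asarray(t, dtype=float)).reshape(-1, 1)
        h = np.tanh(t @ self.w1 + self.b1)
        return h @ self.w2 + self.b2

motion = TimeToPoseMLP()
poses = motion(np.linspace(0.0, 1.0, 120))  # sample 120 frames continuously
print(poses.shape)  # (120, 72)
```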
- MIME: Human-Aware 3D Scene Generation [55.30202416702207]
We generate 3D indoor scenes given 3D human motion.
Human movement indicates the free space in a room.
Human contact indicates surfaces or objects that support activities such as sitting, lying or touching.
arXiv Detail & Related papers (2022-12-08T15:56:17Z)
- Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes [27.443701512923177]
We propose to bridge human motion synthesis and scene affordance reasoning.
We present a hierarchical generative framework to synthesize long-term 3D human motion conditioning on the 3D scene structure.
Our experiments show significant improvements over previous approaches on generating natural and physically plausible human motion in a scene.
arXiv Detail & Related papers (2020-12-10T09:09:38Z)
- PLACE: Proximity Learning of Articulation and Contact in 3D Environments [70.50782687884839]
We propose a novel interaction generation method, named PLACE, which explicitly models the proximity between the human body and the 3D scene around it.
Our perceptual study shows that PLACE significantly improves on the state-of-the-art method, approaching the realism of real human-scene interaction.
arXiv Detail & Related papers (2020-08-12T21:00:10Z)
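PLACE's summary emphasizes explicitly modeling the proximity between the human body and the surrounding 3D scene. A minimal proximity feature, the distance from each body point to its nearest scene point, is sketched below; PLACE's actual learned proximity representation is not reproduced, and the point counts are arbitrary.

```python
# Minimal human-scene proximity feature: nearest scene distance per body point.
import numpy as np

def proximity_features(body_points, scene_points):
    """body_points: (B, 3), scene_points: (S, 3) -> (B,) nearest distances."""
    diff = body_points[:, None, :] - scene_points[None, :, :]   # (B, S, 3)
    dists = np.linalg.norm(diff, axis=-1)                       # (B, S)
    return dists.min(axis=1)

rng = np.random.default_rng(0)
body = rng.uniform(0.0, 2.0, size=(655, 3))    # e.g. downsampled body vertices
scene = rng.uniform(0.0, 5.0, size=(4096, 3))  # scene point cloud
feat = proximity_features(body, scene)
print(feat.shape, float(feat.mean()))
```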