Synthesizing Diverse Human Motions in 3D Indoor Scenes
- URL: http://arxiv.org/abs/2305.12411v3
- Date: Mon, 21 Aug 2023 09:07:07 GMT
- Title: Synthesizing Diverse Human Motions in 3D Indoor Scenes
- Authors: Kaifeng Zhao, Yan Zhang, Shaofei Wang, Thabo Beeler, and Siyu Tang
- Abstract summary: We present a novel method for populating 3D indoor scenes with virtual humans that can navigate in the environment and interact with objects in a realistic manner.
Existing approaches rely on training sequences that contain captured human motions and the 3D scenes they interact with.
We propose a reinforcement learning-based approach that enables virtual humans to navigate in 3D scenes and interact with objects realistically and autonomously.
- Score: 16.948649870341782
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel method for populating 3D indoor scenes with virtual humans
that can navigate in the environment and interact with objects in a realistic
manner. Existing approaches rely on training sequences that contain captured
human motions and the 3D scenes they interact with. However, such interaction
data are costly, difficult to capture, and can hardly cover all plausible
human-scene interactions in complex environments. To address these challenges,
we propose a reinforcement learning-based approach that enables virtual humans
to navigate in 3D scenes and interact with objects realistically and
autonomously, driven by learned motion control policies. The motion control
policies employ latent motion action spaces, which correspond to realistic
motion primitives and are learned from large-scale motion capture data using a
powerful generative motion model. For navigation in a 3D environment, we
propose a scene-aware policy with novel state and reward designs for collision
avoidance. Combined with navigation mesh-based path-finding algorithms to
generate intermediate waypoints, our approach enables the synthesis of diverse
human motions navigating in 3D indoor scenes and avoiding obstacles. To
generate fine-grained human-object interactions, we carefully curate
interaction goal guidance using a marker-based body representation and leverage
features based on the signed distance field (SDF) to encode human-scene
proximity relations. Our method can synthesize realistic and diverse
human-object interactions (e.g., sitting on a chair and then getting up) even
for out-of-distribution test scenarios with different object shapes,
orientations, starting body positions, and poses. Experimental results
demonstrate that our approach outperforms state-of-the-art methods in terms of
both motion naturalness and diversity. Code and video results are available at:
https://zkf1997.github.io/DIMOS.
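The central design above is a control policy that acts in a latent motion action space: the policy emits a latent code, and a pretrained generative motion model decodes it into a realistic motion primitive. Below is a minimal PyTorch sketch of this idea; the class name, layer sizes, and the `motion_decoder` / `encode_state` callables are illustrative assumptions, not the DIMOS implementation.

```python
import torch
import torch.nn as nn

class LatentActionPolicy(nn.Module):
    """Policy emitting latent actions for a pretrained motion decoder
    (hypothetical sketch; dimensions and architecture are assumptions)."""

    def __init__(self, state_dim=256, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 512), nn.ReLU(),
            nn.Linear(512, 2 * latent_dim),  # mean and log-variance of z
        )

    def forward(self, state):
        mu, logvar = self.net(state).chunk(2, dim=-1)
        # Sample a latent action with the reparameterization trick.
        return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

def rollout(policy, motion_decoder, encode_state, body, scene, steps=20):
    """Each latent action decodes to one short motion primitive; the rollout
    chains primitives by continuing from the last synthesized frame."""
    motions = []
    for _ in range(steps):
        z = policy(encode_state(body, scene))   # scene-aware state features
        primitive = motion_decoder(z, body)     # pretrained generative model
        motions.append(primitive)
        body = primitive[-1]                    # seed the next primitive
    return motions
```

Because every action decodes through a model trained on large-scale motion capture, even an exploring policy produces plausible movement, which is what makes reinforcement learning tractable in this setting.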
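The abstract mentions novel state and reward designs for collision avoidance without giving formulas. A hedged sketch of one plausible per-step reward of this kind combines progress toward the current waypoint with a penalty for entering a safety margin around obstacles; the weights, margin, and `scene_sdf` interface are assumptions.

```python
import numpy as np

def navigation_reward(pos, goal, scene_sdf, prev_dist,
                      w_goal=1.0, w_coll=5.0, safe_margin=0.2):
    """Hypothetical collision-aware navigation reward.

    pos, goal : (2,) arrays, current pelvis position and active waypoint (m).
    scene_sdf : callable mapping a 2D point to its signed distance to the
                nearest obstacle (positive = free space).
    prev_dist : distance to the waypoint at the previous step.
    """
    dist = float(np.linalg.norm(goal - pos))
    progress = prev_dist - dist                   # > 0 when approaching goal
    collision_pen = max(0.0, safe_margin - scene_sdf(pos))
    return w_goal * progress - w_coll * collision_pen, dist
```

In the pipeline described above, a navigation-mesh path finder supplies the sequence of waypoints, and the policy is rewarded for reaching each one without the body entering the obstacle margin.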
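The human-scene proximity features are described as SDF-based and attached to a marker-based body representation. A minimal sketch, assuming the scene SDF is precomputed on a regular voxel grid and sampled trilinearly at each marker (the grid axis convention and use of `grid_sample` are implementation assumptions):

```python
import torch
import torch.nn.functional as F

def sdf_proximity_features(markers, sdf_grid, grid_min, grid_max):
    """Sample a precomputed scene SDF at each body marker (hypothetical).

    markers            : (M, 3) marker positions in scene coordinates.
    sdf_grid           : (D, H, W) signed distances; axes ordered (z, y, x)
                         to match grid_sample's (x, y, z) convention.
    grid_min, grid_max : (3,) world-space corners of the voxel grid.
    Returns an (M,) tensor of signed distances (negative = inside geometry)
    that can be concatenated to the policy state.
    """
    # Normalize marker coordinates to [-1, 1] as grid_sample expects.
    coords = 2.0 * (markers - grid_min) / (grid_max - grid_min) - 1.0
    coords = coords.view(1, 1, 1, -1, 3)                 # (N, 1, 1, M, 3)
    vol = sdf_grid.view(1, 1, *sdf_grid.shape)           # (N, C, D, H, W)
    sd = F.grid_sample(vol, coords, align_corners=True)  # trilinear lookup
    return sd.view(-1)
```

Per-marker signed distances tell the policy how close every body part is to scene geometry, which is what supports fine-grained contacts such as sitting down and getting up.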
Related papers
- Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes [83.55301458112672]
Sitcom-Crafter is a system for human motion generation in 3D space.
Central to its motion generation modules is a novel 3D scene-aware human-human interaction module.
Augmentation modules handle plot comprehension for command generation and motion synchronization for the seamless integration of different motion types.
arXiv Detail & Related papers (2024-10-14T17:56:19Z) - Human-Aware 3D Scene Generation with Spatially-constrained Diffusion Models [16.259040755335885]
Previous autoregressive 3D scene generation methods have struggled to accurately capture the joint distribution of multiple objects and input humans.
We introduce two spatial collision guidance mechanisms: human-object collision avoidance and object-room boundary constraints.
Our framework can generate more natural and plausible 3D scenes with precise human-scene interactions.
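The summary names the two guidance mechanisms but not their form. One common way to impose such spatial constraints is to steer each denoising step with the gradient of a differentiable collision penalty; the sketch below is that generic pattern, with the margin, guidance scale, and function names as assumptions rather than the paper's formulation.

```python
import torch

def collision_guided_step(x_t, denoise_fn, t, human_points, scale=0.1):
    """One guided denoising step for object placement (hypothetical sketch).

    x_t          : (N, 3) noisy object positions at diffusion step t.
    denoise_fn   : model predicting the clean sample x0 from (x_t, t).
    human_points : (P, 3) points occupied by the input human.
    """
    x_t = x_t.detach().requires_grad_(True)
    x0 = denoise_fn(x_t, t)
    # Penalize predicted object centers within 0.5 m of any human point.
    d = torch.cdist(x0, human_points)
    penalty = torch.clamp(0.5 - d, min=0.0).pow(2).sum()
    grad = torch.autograd.grad(penalty, x_t)[0]
    return x0 - scale * grad  # nudge the prediction out of collision
```

An analogous penalty on distances to the room boundary would give the second, object-room constraint.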
arXiv Detail & Related papers (2024-06-26T08:18:39Z) - Generating Human Interaction Motions in Scenes with Text Control [66.74298145999909]
We present TeSMo, a method for text-controlled scene-aware motion generation based on denoising diffusion models.
Our approach begins with pre-training a scene-agnostic text-to-motion diffusion model.
To facilitate training, we embed annotated navigation and interaction motions within scenes.
arXiv Detail & Related papers (2024-04-16T16:04:38Z) - Controllable Human-Object Interaction Synthesis [77.56877961681462]
We propose Controllable Human-Object Interaction Synthesis (CHOIS) to generate synchronized object motion and human motion in 3D scenes.
Here, language descriptions inform style and intent, and waypoints, which can be effectively extracted from high-level planning, ground the motion in the scene.
Our module seamlessly integrates with a path planning module, enabling the generation of long-term interactions in 3D environments.
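Since CHOIS consumes waypoints extracted from high-level planning, one simple illustration is to resample a dense planner path at a roughly fixed arc-length spacing; this is a generic sketch, not the paper's own extraction scheme.

```python
import numpy as np

def extract_waypoints(path, spacing=1.0):
    """Resample a dense 2D planner path into sparse waypoints (hypothetical).

    path    : (T, 2) array of points from, e.g., A* or a navigation mesh.
    spacing : approximate arc length between consecutive waypoints (m).
    """
    waypoints, acc = [path[0]], 0.0
    for a, b in zip(path[:-1], path[1:]):
        acc += float(np.linalg.norm(b - a))
        if acc >= spacing:
            waypoints.append(b)
            acc = 0.0
    if not np.allclose(waypoints[-1], path[-1]):
        waypoints.append(path[-1])   # always keep the goal point
    return np.stack(waypoints)
```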
arXiv Detail & Related papers (2023-12-06T21:14:20Z) - Revisit Human-Scene Interaction via Space Occupancy [55.67657438543008]
Human-scene Interaction (HSI) generation is a challenging task and crucial for various downstream tasks.
In this work, we argue that interaction with a scene is essentially interacting with the space occupancy of the scene from an abstract physical perspective.
By treating pure motion sequences as records of humans interacting with invisible scene occupancy, we can aggregate motion-only data into a large-scale paired human-occupancy interaction database.
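The pairing trick can be illustrated concretely: every voxel the body's joints pass through is necessarily free space, so a motion sequence alone labels part of the surrounding occupancy. This minimal sketch assumes a fixed bounding box and ignores body volume, unlike the paper's actual construction.

```python
import numpy as np

def motion_to_free_space(joint_traj, grid_shape=(64, 64, 64),
                         bounds_min=(-2.0, -2.0, 0.0),
                         bounds_max=(2.0, 2.0, 2.0)):
    """Mark voxels swept by the body as known free space (hypothetical).

    joint_traj : (T, J, 3) joint positions over a motion sequence.
    Returns a boolean (D, H, W) grid, True where the space must be free.
    """
    lo, hi = np.asarray(bounds_min), np.asarray(bounds_max)
    shape = np.asarray(grid_shape)
    pts = joint_traj.reshape(-1, 3)
    idx = np.clip(((pts - lo) / (hi - lo) * shape).astype(int), 0, shape - 1)
    free = np.zeros(grid_shape, dtype=bool)
    free[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return free
```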
arXiv Detail & Related papers (2023-12-05T12:03:00Z) - Synthesizing Physically Plausible Human Motions in 3D Scenes [41.1310197485928]
We present a framework that enables physically simulated characters to perform long-term interaction tasks in diverse, cluttered, and unseen scenes.
Specifically, the framework's InterCon component contains two complementary policies that enable characters to enter and leave the interacting state.
To generate interaction with objects at different places, we further design NavCon, a trajectory following policy, to keep characters' motions in the free space of 3D scenes.
arXiv Detail & Related papers (2023-08-17T15:17:49Z) - CIRCLE: Capture In Rich Contextual Environments [69.97976304918149]
We propose a novel motion acquisition system in which the actor perceives and operates in a highly contextual virtual world.
We present CIRCLE, a dataset containing 10 hours of full-body reaching motion from 5 subjects across nine scenes.
We use this dataset to train a model that generates human motion conditioned on scene information.
arXiv Detail & Related papers (2023-03-31T09:18:12Z) - Locomotion-Action-Manipulation: Synthesizing Human-Scene Interactions in
Complex 3D Environments [11.87902527509297]
We present LAMA, Locomotion-Action-MAnipulation, to synthesize natural and plausible long-term human movements in complex indoor environments.
Unlike existing methods that require motion data "paired" with scanned 3D scenes for supervision, we formulate the problem as a test-time optimization by using human motion capture data only for synthesis.
arXiv Detail & Related papers (2023-01-09T18:59:16Z) - Stochastic Scene-Aware Motion Prediction [41.6104600038666]
We present a novel data-driven, stochastic motion synthesis method that models different styles of performing a given action with a target object.
Our method, called SAMP, for Scene-Aware Motion Prediction, generalizes to target objects of various geometries while enabling the character to navigate in cluttered scenes.
arXiv Detail & Related papers (2021-08-18T17:56:17Z) - PLACE: Proximity Learning of Articulation and Contact in 3D Environments [70.50782687884839]
We propose a novel interaction generation method, named PLACE, which explicitly models the proximity between the human body and the 3D scene around it.
Our perceptual study shows that PLACE significantly improves on the state-of-the-art method, approaching the realism of real human-scene interaction.
arXiv Detail & Related papers (2020-08-12T21:00:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.