Scene-aware Human Motion Forecasting via Mutual Distance Prediction
- URL: http://arxiv.org/abs/2310.00615v3
- Date: Thu, 4 Apr 2024 09:18:50 GMT
- Title: Scene-aware Human Motion Forecasting via Mutual Distance Prediction
- Authors: Chaoyue Xing, Wei Mao, Miaomiao Liu,
- Abstract summary: We propose to model the human-scene interaction with the mutual distance between the human body and the scene.
Such mutual distances constrain both the local and global human motion, resulting in a whole-body motion constrained prediction.
We develop a pipeline with two sequential steps: predicting the future mutual distances first, followed by forecasting future human motion.
- Score: 13.067687949642641
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we tackle the problem of scene-aware 3D human motion forecasting. A key challenge of this task is to predict future human motions that are consistent with the scene by modeling the human-scene interactions. While recent works have demonstrated that explicit constraints on human-scene interactions can prevent the occurrence of ghost motion, they only provide constraints on partial human motion e.g., the global motion of the human or a few joints contacting the scene, leaving the rest of the motion unconstrained. To address this limitation, we propose to model the human-scene interaction with the mutual distance between the human body and the scene. Such mutual distances constrain both the local and global human motion, resulting in a whole-body motion constrained prediction. In particular, mutual distance constraints consist of two components, the signed distance of each vertex on the human mesh to the scene surface and the distance of basis scene points to the human mesh. We further introduce a global scene representation learned from a signed distance function (SDF) volume to ensure coherence between the global scene representation and the explicit constraint from the mutual distance. We develop a pipeline with two sequential steps: predicting the future mutual distances first, followed by forecasting future human motion. During training, we explicitly encourage consistency between predicted poses and mutual distances. Extensive evaluations on the existing synthetic and real datasets demonstrate that our approach consistently outperforms the state-of-the-art methods.
Related papers
- Multimodal Sense-Informed Prediction of 3D Human Motions [16.71099574742631]
This work introduces a novel multi-modal sense-informed motion prediction approach, which conditions high-fidelity generation on two modal information.
The gaze information is regarded as the human intention, and combined with both motion and scene features, we construct a ternary intention-aware attention to supervise the generation.
On two real-world benchmarks, the proposed method achieves state-of-the-art performance both in 3D human pose and trajectory prediction.
arXiv Detail & Related papers (2024-05-05T12:38:10Z) - Staged Contact-Aware Global Human Motion Forecasting [7.930326095134298]
Scene-aware global human motion forecasting is critical for manifold applications, including virtual reality, robotics, and sports.
We propose a STAGed contact-aware global human motion forecasting STAG, a novel three-stage pipeline for predicting global human motion in a 3D environment.
STAG achieves a 1.8% and 16.2% overall improvement in pose and trajectory prediction, respectively, on the scene-aware GTA-IM dataset.
arXiv Detail & Related papers (2023-09-16T10:47:48Z) - Task-Oriented Human-Object Interactions Generation with Implicit Neural
Representations [61.659439423703155]
TOHO: Task-Oriented Human-Object Interactions Generation with Implicit Neural Representations.
Our method generates continuous motions that are parameterized only by the temporal coordinate.
This work takes a step further toward general human-scene interaction simulation.
arXiv Detail & Related papers (2023-03-23T09:31:56Z) - Contact-aware Human Motion Forecasting [87.04827994793823]
We tackle the task of scene-aware 3D human motion forecasting, which consists of predicting future human poses given a 3D scene and a past human motion.
Our approach outperforms the state-of-the-art human motion forecasting and human synthesis methods on both synthetic and real datasets.
arXiv Detail & Related papers (2022-10-08T07:53:19Z) - GIMO: Gaze-Informed Human Motion Prediction in Context [75.52839760700833]
We propose a large-scale human motion dataset that delivers high-quality body pose sequences, scene scans, and ego-centric views with eye gaze.
Our data collection is not tied to specific scenes, which further boosts the motion dynamics observed from our subjects.
To realize the full potential of gaze, we propose a novel network architecture that enables bidirectional communication between the gaze and motion branches.
arXiv Detail & Related papers (2022-04-20T13:17:39Z) - Dyadic Human Motion Prediction [119.3376964777803]
We introduce a motion prediction framework that explicitly reasons about the interactions of two observed subjects.
Specifically, we achieve this by introducing a pairwise attention mechanism that models the mutual dependencies in the motion history of the two subjects.
This allows us to preserve the long-term motion dynamics in a more realistic way and more robustly predict unusual and fast-paced movements.
arXiv Detail & Related papers (2021-12-01T10:30:40Z) - Scene-aware Generative Network for Human Motion Synthesis [125.21079898942347]
We propose a new framework, with the interaction between the scene and the human motion taken into account.
Considering the uncertainty of human motion, we formulate this task as a generative task.
We derive a GAN based learning approach, with discriminators to enforce the compatibility between the human motion and the contextual scene.
arXiv Detail & Related papers (2021-05-31T09:05:50Z) - Long-term Human Motion Prediction with Scene Context [60.096118270451974]
We propose a novel three-stage framework for predicting human motion.
Our method first samples multiple human motion goals, then plans 3D human paths towards each goal, and finally predicts 3D human pose sequences following each path.
arXiv Detail & Related papers (2020-07-07T17:59:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.