Multi-Person Extreme Motion Prediction with Cross-Interaction Attention
- URL: http://arxiv.org/abs/2105.08825v2
- Date: Thu, 20 May 2021 11:46:15 GMT
- Title: Multi-Person Extreme Motion Prediction with Cross-Interaction Attention
- Authors: Wen Guo, Xiaoyu Bie, Xavier Alameda-Pineda, Francesc Moreno-Noguer
- Abstract summary: Human motion prediction aims to forecast future human poses given a sequence of past 3D skeletons.
We assume that the inputs of our system are two sequences of past skeletons for two interacting persons.
We devise a novel cross-interaction attention mechanism that learns to predict cross dependencies between self poses and the poses of the other person.
- Score: 44.35977105396732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human motion prediction aims to forecast future human poses given a sequence
of past 3D skeletons. While this problem has recently received increasing
attention, it has mostly been tackled for single humans in isolation. In this
paper we explore this problem from a novel perspective, involving humans
performing collaborative tasks. We assume that the inputs of our system are two
sequences of past skeletons for two interacting persons, and we aim to predict
the future motion for each of them. For this purpose, we devise a novel
cross-interaction attention mechanism that exploits historical information of both
persons and learns to predict cross dependencies between self poses and the
poses of the other person in spite of their spatial or temporal distance. Since
no dataset to train such interactive situations is available, we have captured
ExPI (Extreme Pose Interaction), a new lab-based person interaction dataset of
professional dancers performing acrobatics. ExPI contains 115 sequences with
30k frames and 60k instances with annotated 3D body poses and shapes. We
thoroughly evaluate our cross-interaction network on this dataset and show that
both in short-term and long-term predictions, it consistently outperforms
baselines that independently reason for each person. We plan to release our
code jointly with the dataset and the train/test splits to spur future research
on the topic.
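The listing does not spell the mechanism out, so the following is only a minimal PyTorch sketch of a cross-attention layer in the spirit of the described cross-interaction attention, in which one person's pose features act as queries while the partner's past features act as keys and values. The module name, dimensions, and residual layout are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch only: a generic cross-attention layer used in a
# "cross-interaction" configuration, not the paper's actual code.
import torch
import torch.nn as nn


class CrossInteractionAttention(nn.Module):
    """One person's pose features attend to the interacting partner's history."""

    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, self_feats: torch.Tensor, other_feats: torch.Tensor) -> torch.Tensor:
        # self_feats:  (batch, T, dim) embedded past poses of the person to predict
        # other_feats: (batch, T, dim) embedded past poses of the other person
        out, _ = self.attn(query=self_feats, key=other_feats, value=other_feats)
        return self.norm(self_feats + out)  # residual connection


# Toy usage: two interacting persons, 50 past frames, 64-dim pose embeddings.
person_a = torch.randn(2, 50, 64)
person_b = torch.randn(2, 50, 64)
layer = CrossInteractionAttention()
a_out = layer(person_a, person_b)  # A attends to B's history
b_out = layer(person_b, person_a)  # B attends to A's history
print(a_out.shape)  # torch.Size([2, 50, 64])
```

Applying the layer symmetrically, as in the toy usage above, mirrors the idea of modeling cross dependencies in both directions between the two interacting persons.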
Related papers
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- The MI-Motion Dataset and Benchmark for 3D Multi-Person Motion Prediction [13.177817435234449]
3D multi-person motion prediction is a challenging task that involves modeling individual behaviors and interactions between people.
We introduce the Multi-Person Interaction Motion (MI-Motion) dataset, which includes skeleton sequences of multiple individuals collected by motion capture systems.
The dataset contains 167k frames of interacting people's skeleton poses and is categorized into 5 different activity scenes.
arXiv Detail & Related papers (2023-06-23T15:38:22Z)
- Multi-Graph Convolution Network for Pose Forecasting [0.8057006406834467]
We propose a novel approach called the multi-graph convolution network (MGCN) for 3D human pose forecasting.
MGCN simultaneously captures spatial and temporal information by introducing an augmented graph for pose sequences.
In our evaluation, MGCN outperforms the state-of-the-art in pose prediction.
arXiv Detail & Related papers (2023-04-11T03:59:43Z)
- Multi-level Motion Attention for Human Motion Prediction [132.29963836262394]
We study the use of different types of attention, computed at joint, body part, and full pose levels.
Our experiments on Human3.6M, AMASS and 3DPW validate the benefits of our approach for both periodical and non-periodical actions.
arXiv Detail & Related papers (2021-06-17T08:08:11Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art methods specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- History Repeats Itself: Human Motion Prediction via Motion Attention [81.94175022575966]
We introduce an attention-based feed-forward network that explicitly leverages the observation that human motion tends to repeat itself.
In particular, we propose to extract motion attention to capture the similarity between the current motion context and the historical motion sub-sequences.
Our experiments on Human3.6M, AMASS and 3DPW evidence the benefits of our approach for both periodical and non-periodical actions.
arXiv Detail & Related papers (2020-07-23T02:12:27Z)
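The motion-attention idea in the last entry, scoring historical motion sub-sequences by their similarity to the current motion context and aggregating what followed them, can be illustrated with a small sketch. The version below uses a plain dot-product similarity over flattened poses and assumed window lengths; the published method additionally operates on DCT representations with a learned predictor, which are omitted here.

```python
# Hedged sketch of the motion-attention step: compare the current motion
# context with every historical sub-sequence and blend the frames that
# followed the best matches. Window lengths and the dot-product scoring
# are illustrative assumptions, not the paper's exact formulation.
import torch


def motion_attention(past: torch.Tensor, context_len: int = 10, sub_len: int = 10) -> torch.Tensor:
    # past: (T, D) flattened 3D poses of the observed sequence
    T, D = past.shape
    assert T >= context_len + sub_len, "need enough history for at least one key/value pair"
    query = past[-context_len:].reshape(-1)  # current motion context
    keys, values = [], []
    for start in range(T - context_len - sub_len + 1):
        keys.append(past[start:start + context_len].reshape(-1))  # a past context window
        values.append(past[start + context_len:start + context_len + sub_len].reshape(-1))  # what came next
    keys, values = torch.stack(keys), torch.stack(values)
    scores = torch.softmax(keys @ query / query.numel() ** 0.5, dim=0)  # similarity to current context
    return (scores @ values).reshape(sub_len, D)  # attention-weighted summary of "what history did next"


# Toy usage: 100 observed frames of 66-dim poses (22 joints x 3 coordinates).
attended = motion_attention(torch.randn(100, 66))
print(attended.shape)  # torch.Size([10, 66])
```

In the full model, such an attended summary is combined with the latest observed motion and fed to a predictor; only the attention step is sketched here.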