The MI-Motion Dataset and Benchmark for 3D Multi-Person Motion
Prediction
- URL: http://arxiv.org/abs/2306.13566v2
- Date: Mon, 26 Jun 2023 15:13:31 GMT
- Title: The MI-Motion Dataset and Benchmark for 3D Multi-Person Motion
Prediction
- Authors: Xiaogang Peng, Xiao Zhou, Yikai Luo, Hao Wen, Yu Ding, Zizhao Wu
- Abstract summary: 3D multi-person motion prediction is a challenging task that involves modeling individual behaviors and interactions between people.
We introduce the Multi-Person Interaction Motion (MI-Motion) dataset, which includes skeleton sequences of multiple individuals collected by motion capture systems.
The dataset contains 167k frames of interacting people's skeleton poses and is categorized into 5 different activity scenes.
- Score: 13.177817435234449
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D multi-person motion prediction is a challenging task that involves
modeling individual behaviors and interactions between people. Despite the
emergence of approaches for this task, comparing them is difficult due to the
lack of standardized training settings and benchmark datasets. In this paper,
we introduce the Multi-Person Interaction Motion (MI-Motion) Dataset, which
includes skeleton sequences of multiple individuals collected by motion capture
systems and refined and synthesized using a game engine. The dataset contains
167k frames of interacting people's skeleton poses and is categorized into 5
different activity scenes. To facilitate research in multi-person motion
prediction, we also provide benchmarks to evaluate the performance of
prediction methods in three settings: short-term, long-term, and
ultra-long-term prediction. Additionally, we introduce a novel baseline
approach that leverages graph and temporal convolutional networks, which has
demonstrated competitive results in multi-person motion prediction. We believe
that the proposed MI-Motion benchmark dataset and baseline will facilitate
future research in this area, ultimately leading to better understanding and
modeling of multi-person interactions.
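As a rough illustration of what evaluating a predictor at several horizons could look like, the sketch below computes the mean per-joint position error (MPJPE) over prediction windows of different lengths. The abstract does not name the benchmark's metrics, so MPJPE is an assumption here (it is a common choice in motion prediction), and the array layout, person and joint counts, and horizon lengths are hypothetical values rather than the MI-Motion settings.

```python
import numpy as np


def mpjpe(pred, target):
    """Mean per-joint position error.

    pred, target: arrays of shape (people, frames, joints, 3).
    Returns the Euclidean joint error averaged over people, frames and joints.
    """
    return float(np.linalg.norm(pred - target, axis=-1).mean())


def mpjpe_at_horizons(pred, target, horizons):
    """MPJPE over the first h predicted frames, for each h in horizons."""
    return {h: mpjpe(pred[:, :h], target[:, :h]) for h in horizons}


# Toy data (hypothetical sizes): 3 people, 50 future frames, 20 joints.
rng = np.random.default_rng(0)
target = rng.normal(size=(3, 50, 20, 3))
# A stand-in "prediction": ground truth shifted by 0.1 on every coordinate.
pred = target + 0.1

# Hypothetical horizons standing in for short, long and ultra-long term.
errors = mpjpe_at_horizons(pred, target, horizons=(10, 25, 50))
for horizon, err in errors.items():
    print(f"first {horizon:2d} frames: MPJPE = {err:.4f}")
```

Because the stand-in prediction is offset by the same amount on every coordinate, the error is identical at every horizon; a real predictor would typically show the error growing with the horizon.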
Related papers
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction [68.87010221355223]
Multi-Transmotion is an innovative transformer-based model designed for cross-modality pre-training.
Our methodology demonstrates competitive performance across various datasets on several downstream tasks.
arXiv Detail & Related papers (2024-11-04T23:15:21Z)
- Scaling Up Dynamic Human-Scene Interaction Modeling [58.032368564071895]
TRUMANS is the most comprehensive motion-captured HSI dataset currently available.
It intricately captures whole-body human motions and part-level object dynamics.
We devise a diffusion-based autoregressive model that efficiently generates HSI sequences of any length.
arXiv Detail & Related papers (2024-03-13T15:45:04Z)
- Stochastic Multi-Person 3D Motion Forecasting [21.915057426589744]
We address real-world complexities that prior work on human motion forecasting has ignored.
Our framework is general; we instantiate it with different generative models.
Our approach produces diverse and accurate multi-person predictions, significantly outperforming the state of the art.
arXiv Detail & Related papers (2023-06-08T17:59:09Z)
- Mutual Information-Based Temporal Difference Learning for Human Pose Estimation in Video [16.32910684198013]
We present a novel multi-frame human pose estimation framework, which employs temporal differences across frames to model dynamic contexts.
To be specific, we design multi-stage entangled learning sequences conditioned on multi-stage differences to derive informative motion representation sequences.
These results rank us No. 1 in the Crowd Pose Estimation in Complex Events Challenge on the HiEve benchmark.
arXiv Detail & Related papers (2023-03-15T09:29:03Z)
- Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation Tracking and Forecasting on a Video Snippet [24.852728097115744]
Multi-person pose understanding from RGB involves three complex tasks: pose estimation, tracking and motion forecasting.
Most existing works either focus on a single task or employ multi-stage approaches to solving multiple tasks separately.
We propose Snipper, a unified framework to perform multi-person 3D pose estimation, tracking, and motion forecasting simultaneously in a single stage.
arXiv Detail & Related papers (2022-07-09T18:42:14Z)
- Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images [79.70127290464514]
We decompose the task into two stages, i.e. person localization and pose estimation.
We propose three task-specific graph neural networks for effective message passing.
Our approach achieves state-of-the-art performance on CMU Panoptic and Shelf datasets.
arXiv Detail & Related papers (2021-09-13T11:44:07Z)
- Multi-level Motion Attention for Human Motion Prediction [132.29963836262394]
We study the use of different types of attention, computed at joint, body part, and full pose levels.
Our experiments on Human3.6M, AMASS and 3DPW validate the benefits of our approach for both periodical and non-periodical actions.
arXiv Detail & Related papers (2021-06-17T08:08:11Z)
- Large Scale Interactive Motion Forecasting for Autonomous Driving : The Waymo Open Motion Dataset [84.3946567650148]
With over 100,000 scenes, each 20 seconds long at 10 Hz, our new dataset contains more than 570 hours of unique data over 1750 km of roadways.
We use a high-accuracy 3D auto-labeling system to generate high quality 3D bounding boxes for each road agent.
We introduce a new set of metrics that provides a comprehensive evaluation of both single agent and joint agent interaction motion forecasting models.
arXiv Detail & Related papers (2021-04-20T17:19:05Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work, including state-of-the-art methods designed specifically for either the trajectory or the pose forecasting task.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- Learning Multiscale Correlations for Human Motion Prediction [10.335804615372629]
We propose a novel multiscale graph convolution network (MGCN) to capture the correlations among human body components.
We evaluate our approach on two standard benchmark datasets for human motion prediction.
arXiv Detail & Related papers (2021-03-19T07:58:16Z)