Multiscale Spatio-Temporal Graph Neural Networks for 3D Skeleton-Based
Motion Prediction
- URL: http://arxiv.org/abs/2108.11244v1
- Date: Wed, 25 Aug 2021 14:05:37 GMT
- Title: Multiscale Spatio-Temporal Graph Neural Networks for 3D Skeleton-Based
Motion Prediction
- Authors: Maosen Li, Siheng Chen, Yangheng Zhao, Ya Zhang, Yanfeng Wang, Qi Tian
- Abstract summary: We propose a multiscale spatio-temporal graph neural network (MST-GNN) to predict future 3D skeleton-based human poses.
The MST-GNN outperforms state-of-the-art methods in both short and long-term motion prediction.
- Score: 92.16318571149553
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a multiscale spatio-temporal graph neural network (MST-GNN) to
predict the future 3D skeleton-based human poses in an action-category-agnostic
manner. The core of MST-GNN is a multiscale spatio-temporal graph that
explicitly models the relations in motions at various spatial and temporal
scales. Different from many previous hierarchical structures, our multiscale
spatio-temporal graph is built in a data-adaptive fashion, which captures
nonphysical, yet motion-based relations. The key module of MST-GNN is a
multiscale spatio-temporal graph computational unit (MST-GCU) based on the
trainable graph structure. MST-GCU embeds underlying features at individual
scales and then fuses features across scales to obtain a comprehensive
representation. The overall architecture of MST-GNN follows an encoder-decoder
framework, where the encoder consists of a sequence of MST-GCUs to learn the
spatial and temporal features of motions, and the decoder uses a graph-based
attention gate recurrent unit (GA-GRU) to generate future poses. Extensive
experiments are conducted to show that the proposed MST-GNN outperforms
state-of-the-art methods in both short and long-term motion prediction on the
datasets of Human 3.6M, CMU Mocap and 3DPW. In terms of average mean angle
error, MST-GNN outperforms previous works by 5.33% and 3.67% for short-term and
long-term prediction on Human 3.6M, by 11.84% and 4.71% for short-term and
long-term prediction on CMU Mocap, and by 1.13% on 3DPW. We further investigate the
learned multiscale graphs for interpretability.
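The abstract describes a unit that embeds features at each scale and then fuses them across scales. The following is a minimal, spatial-only sketch of that idea on a single frame, not the authors' implementation. The function names, the averaging-based pooling of joints into body parts, the row normalization of the trainable adjacency, and the additive fusion are all assumptions made for illustration; temporal scales and the GA-GRU decoder are omitted.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_normalize(adj):
    """Make each row non-negative and sum to ~1 (epsilon guards all-zero rows)."""
    out = []
    for row in adj:
        mags = [abs(v) for v in row]
        total = sum(mags) + 1e-8
        out.append([v / total for v in mags])
    return out


def graph_conv(x, adj, weight):
    """Aggregate node features over the graph, mix channels, apply tanh."""
    mixed = matmul(matmul(row_normalize(adj), x), weight)
    return [[math.tanh(v) for v in row] for row in mixed]


def multiscale_unit(x_fine, groups, adj_fine, adj_coarse, w_fine, w_coarse):
    """Hypothetical two-scale unit: embed per scale, then fuse (assumed design)."""
    channels = len(x_fine[0])
    # 1) Pool joints into body parts by averaging (assumed pooling rule).
    x_coarse = [
        [sum(x_fine[j][c] for j in group) / len(group) for c in range(channels)]
        for group in groups
    ]
    # 2) Embed features at each scale with that scale's own (trainable) graph.
    h_fine = graph_conv(x_fine, adj_fine, w_fine)
    h_coarse = graph_conv(x_coarse, adj_coarse, w_coarse)
    # 3) Fuse across scales: add each part's feature back onto its joints.
    fused = [row[:] for row in h_fine]
    for part, group in enumerate(groups):
        for j in group:
            fused[j] = [a + b for a, b in zip(fused[j], h_coarse[part])]
    return fused


def random_matrix(rows, cols, rng):
    """Stand-in for learned parameters or input features."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
num_joints, in_dim, out_dim = 6, 3, 4
groups = [[0, 1, 2], [3, 4, 5]]  # toy joint-to-part assignment
x = random_matrix(num_joints, in_dim, rng)
fused = multiscale_unit(
    x,
    groups,
    random_matrix(num_joints, num_joints, rng),    # joint-scale adjacency
    random_matrix(len(groups), len(groups), rng),  # part-scale adjacency
    random_matrix(in_dim, out_dim, rng),
    random_matrix(in_dim, out_dim, rng),
)
print(len(fused), len(fused[0]))  # 6 4
```

In a trained model the adjacency and weight matrices would be learned from data rather than sampled at random, which is what lets the graph capture relations between joints that are not physically connected.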
Related papers
- STGFormer: Spatio-Temporal GraphFormer for 3D Human Pose Estimation in Video [7.345621536750547]
This paper presents a graph-based framework for 3D human pose estimation in video.
Specifically, we develop a graph-based attention mechanism, integrating graph information directly into the respective attention layers.
We demonstrate that our method achieves state-of-the-art performance in 3D human pose estimation.
arXiv Detail & Related papers (2024-07-14T06:45:27Z)
- FourierGNN: Rethinking Multivariate Time Series Forecasting from a Pure Graph Perspective [48.00240550685946]
Current state-of-the-art graph neural network (GNN)-based forecasting methods usually require both graph networks (e.g., GCN) and temporal networks (e.g., LSTM) to capture inter-series (spatial) dynamics and intra-series (temporal) dependencies, respectively.
We propose a novel Fourier Graph Neural Network (FourierGNN) by stacking our proposed Fourier Graph Operator (FGO) to perform matrix multiplications in Fourier space.
Our experiments on seven datasets have demonstrated superior performance with higher efficiency and fewer parameters compared with state-of-the-
arXiv Detail & Related papers (2023-11-10T17:13:26Z)
- Multi-Graph Convolution Network for Pose Forecasting [0.8057006406834467]
We propose a novel approach called the multi-graph convolution network (MGCN) for 3D human pose forecasting.
MGCN simultaneously captures spatial and temporal information by introducing an augmented graph for pose sequences.
In our evaluation, MGCN outperforms the state-of-the-art in pose prediction.
arXiv Detail & Related papers (2023-04-11T03:59:43Z)
- Back to MLP: A Simple Baseline for Human Motion Prediction [59.18776744541904]
This paper tackles the problem of human motion prediction, consisting in forecasting future body poses from historically observed sequences.
We show that the performance of these approaches can be surpassed by a light-weight, purely MLP-based architecture with only 0.14M parameters.
An exhaustive evaluation on Human3.6M, AMASS and 3DPW datasets shows that our method, which we dub siMLPe, consistently outperforms all other approaches.
arXiv Detail & Related papers (2022-07-04T16:35:58Z)
- Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs [65.18780403244178]
We propose a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE).
Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures.
Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing.
arXiv Detail & Related papers (2022-02-17T02:17:31Z)
- DMS-GCN: Dynamic Mutiscale Spatiotemporal Graph Convolutional Networks for Human Motion Prediction [8.142947808507365]
We propose a feed-forward deep neural network for motion prediction.
The entire model is suitable for all actions and follows a framework of encoder-decoder.
Our approach outperforms SOTA methods on the datasets of Human3.6M and CMU Mocap.
arXiv Detail & Related papers (2021-12-20T07:07:03Z)
- Space-Time Graph Neural Networks [104.55175325870195]
We introduce space-time graph neural network (ST-GNN) to jointly process the underlying space-time topology of time-varying network data.
Our analysis shows that small variations in the network topology and time evolution of a system do not significantly affect the performance of ST-GNNs.
arXiv Detail & Related papers (2021-10-06T16:08:44Z)
- Spatio-Temporal Graph Scattering Transform [54.52797775999124]
Graph neural networks may be impractical in some real-world scenarios due to a lack of sufficient high-quality training data.
We put forth a novel mathematically designed framework to analyze spatio-temporal data.
arXiv Detail & Related papers (2020-12-06T19:49:55Z)
- Dynamic Multiscale Graph Neural Networks for 3D Skeleton-Based Human Motion Prediction [102.9787019197379]
We propose novel dynamic multiscale graph neural networks (DMGNN) to predict 3D skeleton-based human motions.
The model is action-category-agnostic and follows an encoder-decoder framework.
The proposed DMGNN outperforms state-of-the-art methods in both short and long-term predictions.
arXiv Detail & Related papers (2020-03-17T02:49:51Z)