Space-Time-Separable Graph Convolutional Network for Pose Forecasting
- URL: http://arxiv.org/abs/2110.04573v1
- Date: Sat, 9 Oct 2021 13:59:30 GMT
- Title: Space-Time-Separable Graph Convolutional Network for Pose Forecasting
- Authors: Theodoros Sofianos, Alessio Sampieri, Luca Franco and Fabio Galasso
- Abstract summary: STS-GCN models the human pose dynamics only with a graph convolutional network (GCN).
The space-time graph connectivity is factored into space and time affinity, which bottlenecks the space-time cross-talk, while enabling full joint-joint and time-time correlations.
- Score: 3.6417475195085602
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human pose forecasting is a complex structured-data sequence-modelling task,
which has received increasing attention, also due to numerous potential
applications. Research has mainly addressed the temporal dimension as time
series and the interaction of human body joints with a kinematic tree or by a
graph. This has decoupled the two aspects and leveraged progress from the
relevant fields, but it has also limited the understanding of the complex
structural joint spatio-temporal dynamics of the human pose. Here we propose a
novel Space-Time-Separable Graph Convolutional Network (STS-GCN) for pose
forecasting. For the first time, STS-GCN models the human pose dynamics only
with a graph convolutional network (GCN), including the temporal evolution and
the spatial joint interaction within a single-graph framework, which allows the
cross-talk of motion and spatial correlations. Concurrently, STS-GCN is the
first space-time-separable GCN: the space-time graph connectivity is factored
into space and time affinity matrices, which bottlenecks the space-time
cross-talk, while enabling full joint-joint and time-time correlations. Both
affinity matrices are learnt end-to-end, which results in connections
substantially deviating from the standard kinematic tree and the linear-time
time series. In experimental evaluation on three complex, recent and
large-scale benchmarks, Human3.6M [Ionescu et al. TPAMI'14], AMASS [Mahmood et
al. ICCV'19] and 3DPW [Von Marcard et al. ECCV'18], STS-GCN outperforms the
state-of-the-art, surpassing the current best technique [Mao et al. ECCV'20] by
over 32% on average at the most difficult long-term predictions, while only
requiring 1.7% of its parameters. We explain the results qualitatively and
illustrate the graph interactions by the factored joint-joint and time-time
learnt graph connections.
Our source code is available at: https://github.com/FraLuca/STSGCN
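The factoring described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single space affinity matrix shared across frames and a single time affinity matrix shared across joints, omits the non-linearity, and uses random values in place of learnt weights. The function name `separable_graph_conv` and the sizes `T`, `V`, `C_in`, `C_out` are hypothetical and chosen only for illustration.

```python
import numpy as np

def separable_graph_conv(x, a_space, a_time, w):
    """One space-time-separable graph convolution (illustrative only).

    x:       (T, V, C_in) pose features, T frames and V joints
    a_space: (V, V) joint-joint affinity (assumed shared over frames)
    a_time:  (T, T) time-time affinity (assumed shared over joints)
    w:       (C_in, C_out) channel projection
    """
    # Mix features across joints within each frame.
    h = np.einsum("vu,tuc->tvc", a_space, x)
    # Mix features across frames for each joint.
    h = np.einsum("ts,svc->tvc", a_time, h)
    # Project channels; result has shape (T, V, C_out).
    return h @ w

# Illustrative sizes (assumptions, not taken from the text above).
T, V, C_in, C_out = 10, 22, 3, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(T, V, C_in))
a_space = rng.normal(size=(V, V))
a_time = rng.normal(size=(T, T))
w = rng.normal(size=(C_in, C_out))

out = separable_graph_conv(x, a_space, a_time, w)

# The factored form equals a single space-time graph whose (T*V, T*V)
# adjacency is the Kronecker product of the two affinity matrices.
a_full = np.kron(a_time, a_space)
out_full = (a_full @ x.reshape(T * V, C_in) @ w).reshape(T, V, C_out)
assert np.allclose(out, out_full)

# Adjacency entries: unconstrained space-time graph vs. factored form.
full_entries = (T * V) ** 2
factored_entries = V * V + T * T
```

Under these assumptions, the factored adjacency needs `V*V + T*T` entries instead of `(T*V)**2`, while every joint pair and every frame pair can still interact. Space-time cross-talk passes only through the product of the two matrices, which is the bottleneck the abstract refers to.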
Related papers
- Mending of Spatio-Temporal Dependencies in Block Adjacency Matrix [3.529869282529924]
We propose a novel end-to-end learning architecture designed to mend the temporal dependencies, resulting in a well-connected graph.
Our methodology demonstrates superior performance on benchmark datasets, such as SurgVisDom and C2D2.
arXiv Detail & Related papers (2023-10-04T06:42:33Z)
- Multi-Graph Convolution Network for Pose Forecasting [0.8057006406834467]
We propose a novel approach called the multi-graph convolution network (MGCN) for 3D human pose forecasting.
MGCN simultaneously captures spatial and temporal information by introducing an augmented graph for pose sequences.
In our evaluation, MGCN outperforms the state-of-the-art in pose prediction.
arXiv Detail & Related papers (2023-04-11T03:59:43Z)
- Spatial-Temporal Gating-Adjacency GCN for Human Motion Prediction [14.42671575251554]
We propose the Spatial-Temporal Gating-Adjacency GCN to learn the complex spatial-temporal dependencies over diverse action types.
GAGCN achieves state-of-the-art performance in both short-term and long-term predictions.
arXiv Detail & Related papers (2022-03-03T01:20:24Z)
- Multivariate Time Series Forecasting with Dynamic Graph Neural ODEs [65.18780403244178]
We propose a continuous model to forecast Multivariate Time series with dynamic Graph neural Ordinary Differential Equations (MTGODE).
Specifically, we first abstract multivariate time series into dynamic graphs with time-evolving node features and unknown graph structures.
Then, we design and solve a neural ODE to complement missing graph topologies and unify both spatial and temporal message passing.
arXiv Detail & Related papers (2022-02-17T02:17:31Z)
- Spatio-Temporal Joint Graph Convolutional Networks for Traffic Forecasting [75.10017445699532]
Recent studies have shifted their focus towards formulating traffic forecasting as a spatio-temporal graph modeling problem.
We propose a novel approach for accurate traffic forecasting on road networks over multiple future time steps.
arXiv Detail & Related papers (2021-11-25T08:45:14Z)
- Multiscale Spatio-Temporal Graph Neural Networks for 3D Skeleton-Based Motion Prediction [92.16318571149553]
We propose a multiscale spatio-temporal graph neural network (MST-GNN) to predict future 3D skeleton-based human poses.
The MST-GNN outperforms state-of-the-art methods in both short and long-term motion prediction.
arXiv Detail & Related papers (2021-08-25T14:05:37Z)
- Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos [78.45050529204701]
We propose a novel framework to pursue discriminative and robust representation by modeling cross-scale spatial-temporal correlation.
CTL utilizes a CNN backbone and a key-points estimator to extract semantic local features from the human body.
It explores a context-reinforced topology to construct multi-scale graphs by considering both global contextual information and physical connections of the human body.
arXiv Detail & Related papers (2021-04-15T14:32:12Z)
- On the spatial attention in Spatio-Temporal Graph Convolutional Networks for skeleton-based human action recognition [97.14064057840089]
Graph convolutional networks (GCNs) achieve promising performance in skeleton-based human action recognition by modeling a sequence of skeletons as a graph.
Most of the recently proposed GCN-based methods improve the performance by learning the graph structure at each layer of the network.
arXiv Detail & Related papers (2020-11-07T19:03:04Z)
- Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition [79.33539539956186]
We propose a simple method to disentangle multi-scale graph convolutions and a unified spatial-temporal graph convolutional operator named G3D.
By coupling these proposals, we develop a powerful feature extractor named MS-G3D based on which our model outperforms previous state-of-the-art methods on three large-scale datasets.
arXiv Detail & Related papers (2020-03-31T11:28:25Z)
- A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video [7.647599484103065]
We improve the learning of constraints in the human skeleton by modeling local and global spatial information via attention mechanisms.
Our approach effectively mitigates depth ambiguity and self-occlusion, generalizes to half upper body estimation, and achieves competitive performance on 2D-to-3D video pose estimation.
arXiv Detail & Related papers (2020-03-11T14:54:40Z)