GPRAR: Graph Convolutional Network based Pose Reconstruction and Action
Recognition for Human Trajectory Prediction
- URL: http://arxiv.org/abs/2103.14113v1
- Date: Thu, 25 Mar 2021 20:12:14 GMT
- Title: GPRAR: Graph Convolutional Network based Pose Reconstruction and Action
Recognition for Human Trajectory Prediction
- Authors: Manh Huynh, Gita Alaghband
- Abstract summary: Existing prediction models are prone to errors in real-world settings where observations are often noisy.
We introduce GPRAR, a graph convolutional network based pose reconstruction and action recognition model for human trajectory prediction.
We show that GPRAR improves the prediction accuracy by up to 22% and 50% under noisy observations on the JAAD and TITAN datasets, respectively.
- Score: 1.2891210250935146
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prediction with high accuracy is essential for various applications such as
autonomous driving. Existing prediction models are prone to errors in
real-world settings where observations (e.g. human poses and locations) are
often noisy. To address this problem, we introduce GPRAR, a graph convolutional
network based pose reconstruction and action recognition for human trajectory
prediction. The key idea of GPRAR is to generate robust features: human poses
and actions, under noisy scenarios. To this end, we design GPRAR using two
novel sub-networks: PRAR (Pose Reconstruction and Action Recognition) and FA
(Feature Aggregator). PRAR aims to simultaneously reconstruct human poses and
action features from the coherent and structural properties of human skeletons.
It is a network of an encoder and two decoders, each of which comprises
multiple layers of spatiotemporal graph convolutional networks. Moreover, we
propose a Feature Aggregator (FA) to channel-wise aggregate the learned
features: human poses, actions, locations, and camera motion using
encoder-decoder based temporal convolutional neural networks to predict future
locations. Extensive experiments on the commonly used datasets JAAD [13] and
TITAN [19] show accuracy improvements of GPRAR over state-of-the-art models.
Specifically, GPRAR improves the prediction accuracy by up to 22% and 50% under
noisy observations on the JAAD and TITAN datasets, respectively.
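As a rough illustration of the Feature Aggregator idea described in the abstract, the sketch below fuses several per-frame feature streams along the channel axis and then applies a single temporal convolution. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the channel-wise aggregation is performed, so plain concatenation is assumed here, and the function names (`concat_channels`, `temporal_conv`), feature sizes, and kernel weights are all hypothetical.

```python
from typing import List

Frames = List[List[float]]  # layout: [time][channels]


def concat_channels(streams: List[Frames]) -> Frames:
    """Concatenate per-frame feature vectors from several streams along the channel axis.

    Assumption: "channel-wise aggregation" is modelled as plain concatenation.
    """
    num_frames = len(streams[0])
    if any(len(s) != num_frames for s in streams):
        raise ValueError("all streams must cover the same number of frames")
    return [[v for s in streams for v in s[t]] for t in range(num_frames)]


def temporal_conv(x: Frames, kernel: List[List[List[float]]]) -> Frames:
    """Valid (unpadded) 1D convolution over time.

    kernel layout: [kernel_size][in_channels][out_channels].
    """
    k = len(kernel)
    out_channels = len(kernel[0][0])
    out = []
    for t in range(len(x) - k + 1):
        frame = [0.0] * out_channels
        for dt in range(k):
            for c_in, value in enumerate(x[t + dt]):
                for c_out in range(out_channels):
                    frame[c_out] += value * kernel[dt][c_in][c_out]
        out.append(frame)
    return out


# Toy inputs: 4 observed frames per stream (all values are hypothetical).
pose = [[0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]          # 2 pose channels
action = [[1.0], [1.0], [0.0], [0.0]]                            # 1 action channel
location = [[10.0, 5.0], [11.0, 5.0], [12.0, 5.0], [13.0, 5.0]]  # 2 location channels
camera = [[0.5], [0.5], [0.5], [0.5]]                            # 1 camera-motion channel

fused = concat_channels([pose, action, location, camera])  # 4 frames x 6 channels

# Averaging kernel: size 2 over time, 6 input channels, 1 output channel.
kernel = [[[1.0 / 12.0] for _ in range(6)] for _ in range(2)]
features = temporal_conv(fused, kernel)  # 3 frames x 1 channel
```

In a trained model the kernel weights would be learned and several such layers would be stacked in an encoder-decoder arrangement; the fixed averaging kernel here only makes the data flow easy to follow.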
Related papers
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- BTranspose: Bottleneck Transformers for Human Pose Estimation with Self-Supervised Pre-Training [0.304585143845864]
In this paper, we consider the recently proposed Bottleneck Transformers, which combine CNN and multi-head self attention (MHSA) layers effectively.
We consider different backbone architectures and pre-train them using the DINO self-supervised learning method.
Experiments show that our model achieves an AP of 76.4, which is competitive with other methods such as [1] and has fewer network parameters.
arXiv Detail & Related papers (2022-04-21T15:45:05Z)
- Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study on various pose representations with a focus on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms the state-of-the-art methods in short-term prediction and achieves much enhanced long-term prediction proficiency.
arXiv Detail & Related papers (2021-12-30T10:45:22Z)
- A Variational Graph Autoencoder for Manipulation Action Recognition and Prediction [1.1816942730023883]
We introduce a deep graph autoencoder to jointly learn recognition and prediction of manipulation tasks from symbolic scene graphs.
Our network has a variational autoencoder structure with two branches: one for identifying the input graph type and one for predicting the future graphs.
We benchmark our new model against different state-of-the-art methods on two different datasets, MANIAC and MSRC-9, and show that our proposed model can achieve better performance.
arXiv Detail & Related papers (2021-10-25T21:40:42Z)
- MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction [34.565986275769745]
We propose a novel Multi-Scale Residual Graph Convolution Network (MSR-GCN) for the human pose prediction task.
Our proposed approach is evaluated on two standard benchmark datasets, i.e., the Human3.6M dataset and the CMU Mocap dataset.
arXiv Detail & Related papers (2021-08-16T15:26:23Z)
- Development of Human Motion Prediction Strategy using Inception Residual Block [1.0705399532413613]
We propose an Inception Residual Block (IRB) to detect temporal features in human poses.
Our main contribution is to propose a residual connection between input and the output of the inception block to have a continuity between the previously observed pose and the next predicted pose.
With this architecture, the model learns prior knowledge about human poses much better, and we achieve much higher prediction accuracy, as detailed in the paper.
arXiv Detail & Related papers (2021-08-09T12:49:48Z)
- An Adversarial Human Pose Estimation Network Injected with Graph Structure [75.08618278188209]
In this paper, we design a novel generative adversarial network (GAN) to improve the localization accuracy of visible joints when some joints are invisible.
The network consists of two simple but efficient modules: Cascade Feature Network (CFN) and Graph Structure Network (GSN).
arXiv Detail & Related papers (2021-03-29T12:07:08Z)
- Spatio-Temporal Inception Graph Convolutional Networks for Skeleton-Based Action Recognition [126.51241919472356]
We design a simple and highly modularized graph convolutional network architecture for skeleton-based action recognition.
Our network is constructed by repeating a building block that aggregates multi-granularity information from both the spatial and temporal paths.
arXiv Detail & Related papers (2020-11-26T14:43:04Z)
- Dynamic Multiscale Graph Neural Networks for 3D Skeleton-Based Human Motion Prediction [102.9787019197379]
We propose novel dynamic multiscale graph neural networks (DMGNN) to predict 3D skeleton-based human motions.
The model is action-category-agnostic and follows an encoder-decoder framework.
The proposed DMGNN outperforms state-of-the-art methods in both short and long-term predictions.
arXiv Detail & Related papers (2020-03-17T02:49:51Z)
- Anatomy-aware 3D Human Pose Estimation with Bone-based Pose Decomposition [92.99291528676021]
Instead of directly regressing the 3D joint locations, we decompose the task into bone direction prediction and bone length prediction.
Our motivation is the fact that the bone lengths of a human skeleton remain consistent across time.
Our full model outperforms the previous best results on Human3.6M and MPI-INF-3DHP datasets.
arXiv Detail & Related papers (2020-02-24T15:49:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.