Development of Human Motion Prediction Strategy using Inception Residual
Block
- URL: http://arxiv.org/abs/2108.04001v1
- Date: Mon, 9 Aug 2021 12:49:48 GMT
- Title: Development of Human Motion Prediction Strategy using Inception Residual
Block
- Authors: Shekhar Gupta, Gaurav Kumar Yadav, G. C. Nandi
- Abstract summary: We propose an Inception Residual Block (IRB) to detect temporal features in human poses.
Our main contribution is a residual connection between the input and the output of the inception block, which maintains continuity between the previously observed pose and the next predicted pose.
With this architecture, the model learns prior knowledge about human poses much better, and we achieve much higher prediction accuracy, as detailed in the paper.
- Score: 1.0705399532413613
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human motion prediction is a crucial task in computer vision and robotics. It
has many potential applications, including human-robot
interaction, human action tracking for airport security systems, autonomous
car navigation, and computer gaming. However, predicting human motion
based on past actions is an extremely challenging task due to the difficulties
in detecting spatial and temporal features correctly. To detect temporal
features in human poses, we propose an Inception Residual Block (IRB), chosen for
its inherent ability to apply multiple kernels and capture salient
features. We use multiple 1-D Convolutional Neural Networks (CNNs)
with different kernel sizes and input sequence lengths and concatenate their outputs to
obtain a suitable embedding. As the kernels stride over different receptive fields, they
detect smaller and larger salient features at multiple temporal scales. Our
main contribution is a residual connection between the input and the
output of the inception block, which maintains continuity between the previously
observed pose and the next predicted pose. With this architecture, the model
learns prior knowledge about human poses much better, and we achieve much higher
prediction accuracy, as detailed in the paper. We then
feed the output of the inception residual block into a Graph
Convolutional Neural Network (GCN) because of its stronger spatial feature learning
capability. We perform a parametric analysis to guide the design of our model,
then evaluate our approach on the Human3.6M dataset and
compare our short-term and long-term predictions with state-of-the-art
methods. Our model achieves better results for most poses, for
reasons elaborated in the paper.
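The temporal part of the abstract (several 1-D convolutions with different kernel sizes and input lengths, concatenated, plus a residual link to the observed pose) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. The function names, the fixed (unlearned) kernels, the single-coordinate trajectory, and the linear projection `proj` that maps the embedding to an offset are all assumptions made for illustration. The GCN stage is omitted.

```python
def conv1d_valid(seq, kernel):
    """Slide `kernel` over `seq` without padding (cross-correlation, as in CNN layers)."""
    k = len(kernel)
    return [sum(seq[i + j] * kernel[j] for j in range(k))
            for i in range(len(seq) - k + 1)]


def inception_embedding(traj, branches):
    """Concatenate the outputs of several conv branches.

    Each branch is a hypothetical (input_length, kernel) pair: it looks only at
    the last `input_length` frames of `traj` and applies its own kernel.
    """
    embedding = []
    for length, kernel in branches:
        if length > len(traj) or len(kernel) > length:
            raise ValueError("branch window or kernel is too long")
        window = traj[-length:]
        embedding.extend(conv1d_valid(window, kernel))
    return embedding


def inception_residual_block(traj, branches, proj):
    """Predict the next value as: last observed value + projected embedding.

    `proj` is an assumed linear projection (one weight per embedding entry).
    The addition of traj[-1] is the residual connection: the block only has
    to model the change relative to the last observed pose.
    """
    emb = inception_embedding(traj, branches)
    if len(proj) != len(emb):
        raise ValueError("projection size must match embedding size")
    offset = sum(e * w for e, w in zip(emb, proj))
    return traj[-1] + offset


# One joint coordinate observed over 10 frames (made-up values).
traj = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

# Short window with a small kernel, long window with a larger kernel.
branches = [
    (5, [0.5, 0.5]),                  # 5 frames, kernel size 2 -> 4 features
    (10, [0.25, 0.25, 0.25, 0.25]),   # 10 frames, kernel size 4 -> 7 features
]

emb = inception_embedding(traj, branches)
print(len(emb))  # 4 + 7 = 11 features

# With an all-zero projection the block returns the last observed value.
print(inception_residual_block(traj, branches, [0.0] * len(emb)))
```

In this sketch, a zero projection reproduces the last observed pose exactly, which is one way to read the continuity the abstract attributes to the residual connection. In a trained model the kernels and projection would be learned, and the same block would be applied to every joint coordinate.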
Related papers
- Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption [64.07607726562841]
Existing multi-person human reconstruction approaches mainly focus on recovering accurate poses or avoiding penetration.
In this work, we tackle the task of reconstructing closely interactive humans from a monocular video.
We propose to leverage knowledge from proxemic behavior and physics to compensate for the lack of visual information.
arXiv Detail & Related papers (2024-04-17T11:55:45Z)
- Social-Transmotion: Promptable Human Trajectory Prediction [65.80068316170613]
Social-Transmotion is a generic Transformer-based model that exploits diverse and numerous visual cues to predict human behavior.
Our approach is validated on multiple datasets, including JTA, JRDB, Pedestrians and Cyclists in Road Traffic, and ETH-UCY.
arXiv Detail & Related papers (2023-12-26T18:56:49Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- Using Features at Multiple Temporal and Spatial Resolutions to Predict Human Behavior in Real Time [2.955419572714387]
We present an approach for integrating high and low-resolution spatial and temporal information to predict human behavior in real time.
Our model composes neural networks for high and low-resolution feature extraction with a neural network for behavior prediction, with all three networks trained simultaneously.
arXiv Detail & Related papers (2022-11-12T18:41:33Z)
- Investigating Pose Representations and Motion Contexts Modeling for 3D Motion Prediction [63.62263239934777]
We conduct an in-depth study on various pose representations with a focus on their effects on the motion prediction task.
We propose a novel RNN architecture termed AHMR (Attentive Hierarchical Motion Recurrent network) for motion prediction.
Our approach outperforms the state-of-the-art methods in short-term prediction and achieves much enhanced long-term prediction proficiency.
arXiv Detail & Related papers (2021-12-30T10:45:22Z)
- Learning to Predict Diverse Human Motions from a Single Image via Mixture Density Networks [9.06677862854201]
We propose a novel approach to predict future human motions from a single image, with mixture density networks (MDN) modeling.
Contrary to most existing deep human motion prediction approaches, the multimodal nature of MDN enables the generation of diverse future motion hypotheses.
Our trained model directly takes an image as input and generates multiple plausible motions that satisfy the given condition.
arXiv Detail & Related papers (2021-09-13T08:49:33Z)
- TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [77.59069361196404]
TRiPOD is a novel method for predicting body dynamics based on graph attentional networks.
To incorporate a real-world challenge, we learn an indicator representing whether an estimated body joint is visible/invisible at each frame.
Our evaluation shows that TRiPOD outperforms all prior work and state-of-the-art specifically designed for each of the trajectory and pose forecasting tasks.
arXiv Detail & Related papers (2021-04-08T20:01:00Z)
- 3D Human motion anticipation and classification [8.069283749930594]
We propose a novel sequence-to-sequence model for human motion prediction and feature learning.
Our model learns to predict multiple future sequences of human poses from the same input sequence.
We show that it takes less than half the number of epochs to train an activity recognition network by using the feature learned from the discriminator.
arXiv Detail & Related papers (2020-12-31T00:19:39Z)
- Motion Prediction Using Temporal Inception Module [96.76721173517895]
We propose a Temporal Inception Module (TIM) to encode human motion.
Our framework produces input embeddings using convolutional layers, by using different kernel sizes for different input lengths.
The experimental results on standard motion prediction benchmark datasets Human3.6M and CMU motion capture dataset show that our approach consistently outperforms the state of the art methods.
arXiv Detail & Related papers (2020-10-06T20:26:01Z)
- AC-VRNN: Attentive Conditional-VRNN for Multi-Future Trajectory Prediction [30.61190086847564]
We propose a generative architecture for multi-future trajectory predictions based on Conditional Variational Recurrent Neural Networks (C-VRNNs).
Human interactions are modeled with a graph-based attention mechanism enabling an online attentive hidden state refinement of the recurrent estimation.
arXiv Detail & Related papers (2020-05-17T17:21:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.