Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge
for Human Motion Prediction
- URL: http://arxiv.org/abs/2208.01302v1
- Date: Tue, 2 Aug 2022 08:13:43 GMT
- Title: Overlooked Poses Actually Make Sense: Distilling Privileged Knowledge
for Human Motion Prediction
- Authors: Xiaoning Sun, Qiongjie Cui, Huaijiang Sun, Bin Li, Weiqing Li and
Jianfeng Lu
- Abstract summary: Previous works on human motion prediction build a mapping between the observed sequence and the one to be predicted.
We present a new prediction pattern that introduces previously overlooked human poses into the prediction task.
These poses occur after the predicted sequence and form the privileged sequence.
- Score: 26.25110973770013
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous works on human motion prediction build a mapping between the
observed sequence and the one to be predicted. However, due to the inherent
complexity of multivariate time series data, finding the extrapolation relation
between motion sequences remains a challenge. In this paper, we present a new
prediction pattern that introduces previously overlooked human poses and treats
the prediction task as interpolation. These poses occur after the predicted
sequence and form the privileged sequence. Specifically, we first propose an
InTerPolation learning Network (ITP-Network) that encodes both the observed
sequence and the privileged sequence to interpolate the in-between predicted
sequence, while the embedded Privileged-sequence-Encoder (Priv-Encoder)
simultaneously learns the privileged knowledge (PK). We then propose a Final
Prediction Network (FP-Network), for which the privileged sequence is not
observable, but which is equipped with a novel PK-Simulator that distills the PK
learned by the previous network. This simulator takes the observed sequence as
input and approximates the behavior of the Priv-Encoder, enabling the FP-Network
to imitate the interpolation process. Extensive experiments demonstrate that our
prediction pattern achieves state-of-the-art performance on the H3.6M, CMU-Mocap
and 3DPW benchmarks in both short-term and long-term prediction.
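To make the two-stage pattern concrete, here is a minimal PyTorch sketch of the idea as described in the abstract, assuming simplified GRU encoders and illustrative module names and dimensions; it is a sketch of the pattern, not the authors' implementation:

```python
import torch
import torch.nn as nn

class ITPNetwork(nn.Module):
    """Stage 1: interpolate the predicted sequence from the observed
    sequence and the privileged sequence (poses after the horizon)."""
    def __init__(self, pose_dim=66, hidden=256, pred_len=25):
        super().__init__()
        self.obs_encoder = nn.GRU(pose_dim, hidden, batch_first=True)
        self.priv_encoder = nn.GRU(pose_dim, hidden, batch_first=True)  # Priv-Encoder
        self.decoder = nn.Linear(2 * hidden, pred_len * pose_dim)
        self.pred_len, self.pose_dim = pred_len, pose_dim

    def forward(self, observed, privileged):
        _, h_obs = self.obs_encoder(observed)
        _, h_priv = self.priv_encoder(privileged)      # privileged knowledge (PK)
        fused = torch.cat([h_obs[-1], h_priv[-1]], dim=-1)
        pred = self.decoder(fused).view(-1, self.pred_len, self.pose_dim)
        return pred, h_priv[-1]

class FPNetwork(nn.Module):
    """Stage 2: the privileged sequence is unavailable; a PK-Simulator
    approximates the Priv-Encoder's output from the observed sequence."""
    def __init__(self, pose_dim=66, hidden=256, pred_len=25):
        super().__init__()
        self.obs_encoder = nn.GRU(pose_dim, hidden, batch_first=True)
        self.pk_simulator = nn.GRU(pose_dim, hidden, batch_first=True)  # PK-Simulator
        self.decoder = nn.Linear(2 * hidden, pred_len * pose_dim)
        self.pred_len, self.pose_dim = pred_len, pose_dim

    def forward(self, observed):
        _, h_obs = self.obs_encoder(observed)
        _, h_sim = self.pk_simulator(observed)         # simulated PK
        fused = torch.cat([h_obs[-1], h_sim[-1]], dim=-1)
        pred = self.decoder(fused).view(-1, self.pred_len, self.pose_dim)
        return pred, h_sim[-1]

def fp_loss(fp_net, itp_net, observed, target, privileged, alpha=0.5):
    """Stage-2 loss: prediction error plus a distillation term that pulls
    the simulated PK toward the (frozen) Priv-Encoder's PK."""
    pred, pk_sim = fp_net(observed)
    with torch.no_grad():                              # ITP-Network is frozen here
        _, pk_true = itp_net(observed, privileged)
    return nn.functional.mse_loss(pred, target) + \
           alpha * nn.functional.mse_loss(pk_sim, pk_true)
```

In this reading, the weight alpha (an assumed hyperparameter) trades off imitating the privileged knowledge against the direct prediction loss; at test time only FP-Network is used, with no privileged sequence required.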
Related papers
- Sparse Prototype Network for Explainable Pedestrian Behavior Prediction [60.80524827122901]
We present Sparse Prototype Network (SPN), an explainable method designed to simultaneously predict a pedestrian's future action, trajectory, and pose.
Regularized by mono-semanticity and clustering constraints, the prototypes learn consistent and human-understandable features.
arXiv Detail & Related papers (2024-10-16T03:33:40Z)
- AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving [59.94343412438211]
We introduce GPT-style next-token prediction into motion prediction.
Unlike language data, which is composed of homogeneous units (words), the elements in a driving scene can have complex spatial-temporal and semantic relations.
We adopt three factorized attention modules with different neighborhoods for information aggregation and different position-encoding styles to capture these relations.
arXiv Detail & Related papers (2024-03-20T06:22:37Z)
- DeFeeNet: Consecutive 3D Human Motion Prediction with Deviation Feedback [23.687223152464988]
We propose DeFeeNet, a simple yet effective network that can be added to existing one-off prediction models.
We show that it improves consecutive human motion prediction performance regardless of the underlying base model.
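A minimal sketch of the deviation-feedback idea, assuming a generic wrapper around an arbitrary one-off predictor; the correction module, pooling, and frame alignment below are illustrative simplifications, not the DeFeeNet architecture:

```python
import torch
import torch.nn as nn

class DeviationFeedback(nn.Module):
    """Wraps a one-off predictor and corrects its next prediction using the
    deviation between the previous prediction and the frames actually
    observed since then (illustrative alignment: the previously predicted
    window is assumed to have been fully observed)."""
    def __init__(self, base_predictor: nn.Module, pose_dim=66, hidden=128):
        super().__init__()
        self.base = base_predictor
        self.correct = nn.Sequential(
            nn.Linear(pose_dim, hidden), nn.ReLU(), nn.Linear(hidden, pose_dim))

    def forward(self, observed, prev_pred=None):
        pred = self.base(observed)              # (B, T_pred, pose_dim)
        if prev_pred is not None:
            # Pool the per-frame deviation over the overlap between the
            # previous prediction and the newly observed frames.
            overlap = prev_pred.shape[1]
            deviation = (observed[:, -overlap:] - prev_pred).mean(dim=1)
            pred = pred + self.correct(deviation).unsqueeze(1)
        return pred
```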
arXiv Detail & Related papers (2023-04-10T10:18:23Z)
- Towards Out-of-Distribution Sequential Event Prediction: A Causal Treatment [72.50906475214457]
The goal of sequential event prediction is to estimate the next event based on a sequence of historical events.
In practice, next-event prediction models are trained on sequential data collected at a single point in time.
We propose a framework with hierarchical branching structures for learning context-specific representations.
arXiv Detail & Related papers (2022-10-24T07:54:13Z)
- Pose Transformers (POTR): Human Motion Prediction with Non-Autoregressive Transformers [24.36592204215444]
We propose to leverage Transformer architectures for non-autoregressive human motion prediction.
Our approach decodes elements in parallel from a query sequence, instead of conditioning on previous predictions.
We show that despite its simplicity, our approach achieves competitive results on two public datasets.
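A minimal sketch of non-autoregressive decoding as summarized above, assuming learned per-frame query embeddings decoded in parallel; layer counts, sizes, and names are illustrative, not the exact POTR architecture:

```python
import torch
import torch.nn as nn

class NonAutoregressivePredictor(nn.Module):
    """Encodes the observed motion, then decodes ALL future frames in one
    shot from a learned query sequence, instead of conditioning each frame
    on previously predicted ones."""
    def __init__(self, pose_dim=54, d_model=128, pred_len=20):
        super().__init__()
        self.embed = nn.Linear(pose_dim, d_model)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), 2)
        # One learned query per future frame; all frames decode in parallel.
        self.queries = nn.Parameter(torch.randn(pred_len, d_model))
        self.decoder = nn.TransformerDecoder(
            nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True), 2)
        self.head = nn.Linear(d_model, pose_dim)

    def forward(self, observed):                 # (B, T_obs, pose_dim)
        memory = self.encoder(self.embed(observed))
        q = self.queries.unsqueeze(0).expand(observed.size(0), -1, -1)
        return self.head(self.decoder(q, memory))  # (B, pred_len, pose_dim)
```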
arXiv Detail & Related papers (2021-09-15T18:55:15Z)
- Aligned Contrastive Predictive Coding [10.521845940927163]
We investigate the possibility of forcing a self-supervised model trained using a contrastive predictive loss to extract slowly varying latent representations.
Rather than producing one prediction for each future representation, the model emits a shorter sequence of predictions that is then aligned to the upcoming representations.
arXiv Detail & Related papers (2021-04-24T13:07:22Z)
- Predicting Temporal Sets with Deep Neural Networks [50.53727580527024]
We propose an integrated solution based on deep neural networks for temporal sets prediction.
A unique perspective is to learn element relationships by constructing a set-level co-occurrence graph.
We design an attention-based module to adaptively learn the temporal dependency of elements and sets.
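As a small illustration of the set-level co-occurrence graph mentioned above, one can count how often two elements appear in the same set; the basket data below is hypothetical and the construction is a simplification of the paper's graph:

```python
from collections import defaultdict
from itertools import combinations

def cooccurrence_graph(sequences):
    """sequences: a list of per-user set sequences (e.g. shopping baskets).
    Returns edge weights: how often each element pair shares a set."""
    weights = defaultdict(int)
    for seq in sequences:
        for s in seq:
            for a, b in combinations(sorted(s), 2):
                weights[(a, b)] += 1
    return dict(weights)

baskets = [[{"milk", "bread"}, {"milk", "eggs", "bread"}]]
print(cooccurrence_graph(baskets))
# {('bread', 'milk'): 2, ('bread', 'eggs'): 1, ('eggs', 'milk'): 1}
```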
arXiv Detail & Related papers (2020-06-20T03:29:02Z)
- TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation [46.28067541184604]
Video action anticipation aims to predict future action categories from observed frames.
Current state-of-the-art approaches mainly resort to recurrent neural networks to encode history information into hidden states.
This paper proposes a simple yet efficient Temporal Transformer with Progressive Prediction framework.
arXiv Detail & Related papers (2020-03-07T07:59:42Z)
- ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training [85.35910219651572]
We present a new sequence-to-sequence pre-training model called ProphetNet.
It introduces a novel self-supervised objective named future n-gram prediction.
We conduct experiments on CNN/DailyMail, Gigaword, and SQuAD 1.1 benchmarks for abstractive summarization and question generation tasks.
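A minimal sketch of a future n-gram objective in the spirit of the summary above: at each position the model predicts the next n tokens rather than only the next one. This simplification uses one projection head per offset (an assumption for illustration), not ProphetNet's n-stream self-attention:

```python
import torch
import torch.nn as nn

def future_ngram_loss(hidden, targets, heads, n=2):
    """hidden: (B, T, D) decoder states; targets: (B, T) token ids;
    heads: list of n nn.Linear(D, vocab_size) projections, one per offset.
    The state at position t predicts the tokens at t+1 .. t+n."""
    loss = 0.0
    for k, head in enumerate(heads[:n], start=1):
        logits = head(hidden[:, :-k])            # predict the token at t+k
        loss = loss + nn.functional.cross_entropy(
            logits.reshape(-1, logits.size(-1)),
            targets[:, k:].reshape(-1))
    return loss / n
```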
arXiv Detail & Related papers (2020-01-13T05:12:38Z)