Under the Hood of Transformer Networks for Trajectory Forecasting
- URL: http://arxiv.org/abs/2203.11878v1
- Date: Tue, 22 Mar 2022 16:56:05 GMT
- Title: Under the Hood of Transformer Networks for Trajectory Forecasting
- Authors: Luca Franco, Leonardo Placidi, Francesco Giuliari, Irtiza Hasan, Marco
Cristani, Fabio Galasso
- Abstract summary: Transformer Networks have established themselves as the de facto state of the art for trajectory forecasting.
This paper proposes the first in-depth study of Transformer Networks (TF) and Bidirectional Transformers (BERT) for the forecasting of the individual motion of people.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer Networks have established themselves as the de facto
state of the art for trajectory forecasting, but there is currently no
systematic study of their capability to model the motion patterns of
individual people, i.e., without interactions with other individuals or the
social context. This paper proposes the first in-depth study of Transformer
Networks (TF) and Bidirectional Transformers (BERT) for forecasting the
individual motion of people, without bells and whistles. We conduct an
exhaustive evaluation of input/output representations, problem formulations
and sequence modeling, including a novel analysis of their capability to
predict multi-modal futures. In a comparative evaluation on the ETH+UCY
benchmark, both TF and BERT are top performers at predicting individual
motions, clearly outperforming RNNs and LSTMs, and they remain within a narrow
margin of more complex techniques that also model social interactions and
scene context. Source code will be released for all conducted experiments.
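As a rough illustration of the setup studied here, the sketch below is a minimal PyTorch Transformer that forecasts 12 future (x, y) offsets of a single pedestrian from 8 observed ones, the usual ETH+UCY protocol, decoded autoregressively. All module names and hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumed, not the authors' code): an autoregressive
# encoder-decoder Transformer for single-pedestrian trajectory forecasting.
import torch
import torch.nn as nn

class TrajectoryTransformer(nn.Module):
    def __init__(self, d_model=128, nhead=8, num_layers=3, max_len=32):
        super().__init__()
        self.embed = nn.Linear(2, d_model)            # (x, y) offset -> token
        self.pos = nn.Parameter(torch.randn(1, max_len, d_model) * 0.02)
        self.tf = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True,
        )
        self.head = nn.Linear(d_model, 2)             # token -> (x, y) offset

    def forward(self, obs, tgt):
        # obs: (B, 8, 2) observed offsets; tgt: (B, T, 2) shifted targets.
        src = self.embed(obs) + self.pos[:, :obs.size(1)]
        dec = self.embed(tgt) + self.pos[:, :tgt.size(1)]
        mask = self.tf.generate_square_subsequent_mask(tgt.size(1)).to(obs.device)
        return self.head(self.tf(src, dec, tgt_mask=mask))

    @torch.no_grad()
    def roll_out(self, obs, horizon=12):
        # Greedy autoregressive decoding: feed each predicted offset back in.
        tgt = obs[:, -1:, :]                          # seed with last observed offset
        for _ in range(horizon):
            nxt = self.forward(obs, tgt)[:, -1:, :]
            tgt = torch.cat([tgt, nxt], dim=1)
        return tgt[:, 1:, :]                          # (B, 12, 2) predicted offsets
```

The BERT-style formulation the paper also evaluates would instead feed the full 20-step sequence with the 12 future steps replaced by mask tokens and predict them all in a single pass, rather than rolling out step by step.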
Related papers
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction (arXiv, 2024-11-04)
  Multi-Transmotion is a transformer-based model designed for cross-modality pre-training. The method demonstrates competitive performance across various datasets on several downstream tasks.
- Sparse Prototype Network for Explainable Pedestrian Behavior Prediction (arXiv, 2024-10-16)
  Sparse Prototype Network (SPN) is an explainable method that simultaneously predicts a pedestrian's future action, trajectory and pose. Regularized by mono-semanticity and clustering constraints, its prototypes learn consistent, human-understandable features.
- In-Context Convergence of Transformers (arXiv, 2023-10-08)
  Studies the learning dynamics of a one-layer transformer with softmax attention trained via gradient descent. For data with imbalanced features, the learning dynamics follow a stage-wise convergence process.
- Optimizing Non-Autoregressive Transformers with Contrastive Learning (arXiv, 2023-05-23)
  Non-autoregressive Transformers (NATs) reduce the inference latency of autoregressive Transformers (ATs) by predicting all words at once rather than sequentially. The paper proposes easing the difficulty of modality learning by sampling from the model distribution instead of the data distribution.
- Safety-compliant Generative Adversarial Networks for Human Trajectory Forecasting (arXiv, 2022-09-25)
  Human trajectory forecasting in crowds poses the dual challenge of modelling social interactions and outputting collision-free multimodal distributions. SGANv2 is an improved safety-compliant SGAN architecture equipped with motion-temporal interaction modelling and a transformer-based discriminator.
- Back to MLP: A Simple Baseline for Human Motion Prediction (arXiv, 2022-07-04)
  Tackles human motion prediction, i.e. forecasting future body poses from historically observed sequences. Shows that elaborate approaches can be surpassed by a lightweight, purely MLP-based architecture with only 0.14M parameters: on the Human3.6M, AMASS and 3DPW datasets, the proposed siMLPe consistently outperforms all other approaches.
- SFMGNet: A Physics-based Neural Network To Predict Pedestrian Trajectories (arXiv, 2022-02-06)
  A physics-based neural network for pedestrian trajectory prediction, evaluated quantitatively and qualitatively with respect to realism, prediction performance and interpretability. Initial results suggest that the model, even when trained solely on a synthetic dataset, can predict realistic and interpretable trajectories with better-than-state-of-the-art accuracy.
- AC-VRNN: Attentive Conditional-VRNN for Multi-Future Trajectory Prediction (arXiv, 2020-05-17)
  A generative architecture for multi-future trajectory prediction based on Conditional Variational Recurrent Neural Networks (C-VRNNs). Human interactions are modelled with a graph-based attention mechanism that enables online attentive refinement of the recurrent hidden state.
- Transformer Networks for Trajectory Forecasting (arXiv, 2020-03-18)
  Proposes the novel use of Transformer Networks for trajectory forecasting: a fundamental switch from the sequential, step-by-step processing of LSTMs to the attention-only memory mechanisms of Transformers.
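Most of the papers above, like this one, report results on ETH+UCY with the standard displacement metrics. As a reference, here is a minimal sketch of ADE/FDE and of the best-of-K protocol commonly used to score multi-modal predictions; the definitions follow the usual convention and the function names are mine, not from any of the papers.

```python
# Sketch of the standard trajectory metrics (assumed common definitions).
import torch

def ade_fde(pred, gt):
    # pred, gt: (B, T, 2) predicted and ground-truth future positions.
    dist = (pred - gt).norm(dim=-1)                  # (B, T) per-step L2 error
    return dist.mean().item(), dist[:, -1].mean().item()

def best_of_k_ade_fde(preds, gt):
    # preds: (K, B, T, 2) sampled futures; keep, per track, the sample with
    # the lowest ADE (the usual multi-modal, best-of-K evaluation).
    dist = (preds - gt.unsqueeze(0)).norm(dim=-1)    # (K, B, T)
    best = dist.mean(dim=-1).argmin(dim=0)           # (B,) index of best sample
    idx = best[None, :, None, None].expand(1, *preds.shape[1:])
    return ade_fde(preds.gather(0, idx).squeeze(0), gt)
```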
This list is automatically generated from the titles and abstracts of the papers on this site.