Under the Hood of Transformer Networks for Trajectory Forecasting
- URL: http://arxiv.org/abs/2203.11878v1
- Date: Tue, 22 Mar 2022 16:56:05 GMT
- Title: Under the Hood of Transformer Networks for Trajectory Forecasting
- Authors: Luca Franco, Leonardo Placidi, Francesco Giuliari, Irtiza Hasan, Marco
Cristani, Fabio Galasso
- Abstract summary: Transformer Networks have established themselves as the de facto state of the art for trajectory forecasting.
This paper proposes the first in-depth study of Transformer Networks (TF) and Bidirectional Transformers (BERT) for the forecasting of the individual motion of people.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer Networks have established themselves as the de facto
state of the art for trajectory forecasting, but there is currently no
systematic study of their capability to model the motion patterns of
individual people, i.e., without interactions with other individuals or the
social context. This paper proposes the first in-depth study of Transformer
Networks (TF) and Bidirectional Transformers (BERT) for forecasting the
individual motion of people, without bells and whistles. We conduct an
exhaustive evaluation of input/output representations, problem formulations
and sequence modeling, including a novel analysis of their capability to
predict multi-modal futures. In a comparative evaluation on the ETH+UCY
benchmark, both TF and BERT are top performers at predicting individual
motions, clearly outperforming RNNs and LSTMs, and they remain within a narrow
margin of more complex techniques that also model social interactions and
scene context. Source code will be released for all conducted experiments.
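As a rough illustration of the setup studied here, the sketch below is a minimal PyTorch Transformer that forecasts 12 future (x, y) offsets of a single pedestrian from 8 observed ones, the usual ETH+UCY protocol, decoded autoregressively. All module names and hyperparameters are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumed, not the authors' code): an autoregressive
# encoder-decoder Transformer for single-pedestrian trajectory forecasting.
import torch
import torch.nn as nn

class TrajectoryTransformer(nn.Module):
    def __init__(self, d_model=128, nhead=8, num_layers=3, max_len=32):
        super().__init__()
        self.embed = nn.Linear(2, d_model)            # (x, y) offset -> token
        self.pos = nn.Parameter(torch.randn(1, max_len, d_model) * 0.02)
        self.tf = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True,
        )
        self.head = nn.Linear(d_model, 2)             # token -> (x, y) offset

    def forward(self, obs, tgt):
        # obs: (B, 8, 2) observed offsets; tgt: (B, T, 2) shifted targets.
        src = self.embed(obs) + self.pos[:, :obs.size(1)]
        dec = self.embed(tgt) + self.pos[:, :tgt.size(1)]
        mask = self.tf.generate_square_subsequent_mask(tgt.size(1)).to(obs.device)
        return self.head(self.tf(src, dec, tgt_mask=mask))

    @torch.no_grad()
    def roll_out(self, obs, horizon=12):
        # Greedy autoregressive decoding: feed each predicted offset back in.
        tgt = obs[:, -1:, :]                          # seed with last observed offset
        for _ in range(horizon):
            nxt = self.forward(obs, tgt)[:, -1:, :]
            tgt = torch.cat([tgt, nxt], dim=1)
        return tgt[:, 1:, :]                          # (B, 12, 2) predicted offsets
```

The BERT-style formulation the paper also evaluates would instead feed the full 20-step sequence with the 12 future steps replaced by mask tokens and predict them all in a single pass, rather than rolling out step by step.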
Related papers
- Multi-Transmotion: Pre-trained Model for Human Motion Prediction (arXiv, 2024-11-04)
  Multi-Transmotion is a transformer-based model designed for cross-modality pre-training. The method demonstrates competitive performance across various datasets on several downstream tasks.
- Sparse Prototype Network for Explainable Pedestrian Behavior Prediction (arXiv, 2024-10-16)
  Sparse Prototype Network (SPN) is an explainable method that simultaneously predicts a pedestrian's future action, trajectory and pose. Regularized by mono-semanticity and clustering constraints, its prototypes learn consistent, human-understandable features.
- In-Context Convergence of Transformers (arXiv, 2023-10-08)
  Studies the learning dynamics of a one-layer transformer with softmax attention trained via gradient descent. For data with imbalanced features, the learning dynamics follow a stage-wise convergence process.
- Optimizing Non-Autoregressive Transformers with Contrastive Learning (arXiv, 2023-05-23)
  Non-autoregressive Transformers (NATs) reduce the inference latency of autoregressive Transformers (ATs) by predicting all words at once rather than sequentially. The paper proposes easing the difficulty of modality learning by sampling from the model distribution instead of the data distribution.
- Safety-compliant Generative Adversarial Networks for Human Trajectory Forecasting (arXiv, 2022-09-25)
  Human trajectory forecasting in crowds poses the dual challenge of modelling social interactions and outputting collision-free multimodal distributions. SGANv2 is an improved safety-compliant SGAN architecture equipped with motion-temporal interaction modelling and a transformer-based discriminator.
- Back to MLP: A Simple Baseline for Human Motion Prediction (arXiv, 2022-07-04)
  Tackles human motion prediction, i.e. forecasting future body poses from historically observed sequences. Shows that elaborate approaches can be surpassed by a lightweight, purely MLP-based architecture with only 0.14M parameters: on the Human3.6M, AMASS and 3DPW datasets, the proposed siMLPe consistently outperforms all other approaches.
- SFMGNet: A Physics-based Neural Network To Predict Pedestrian Trajectories (arXiv, 2022-02-06)
  A physics-based neural network for pedestrian trajectory prediction, evaluated quantitatively and qualitatively with respect to realism, prediction performance and interpretability. Initial results suggest that the model, even when trained solely on a synthetic dataset, can predict realistic and interpretable trajectories with better-than-state-of-the-art accuracy.
- AC-VRNN: Attentive Conditional-VRNN for Multi-Future Trajectory Prediction (arXiv, 2020-05-17)
  A generative architecture for multi-future trajectory prediction based on Conditional Variational Recurrent Neural Networks (C-VRNNs). Human interactions are modelled with a graph-based attention mechanism that enables online attentive refinement of the recurrent hidden state.
- Transformer Networks for Trajectory Forecasting (arXiv, 2020-03-18)
  Proposes the novel use of Transformer Networks for trajectory forecasting: a fundamental switch from the sequential, step-by-step processing of LSTMs to the attention-only memory mechanisms of Transformers.
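Most of the papers above, like this one, report results on ETH+UCY with the standard displacement metrics. As a reference, here is a minimal sketch of ADE/FDE and of the best-of-K protocol commonly used to score multi-modal predictions; the definitions follow the usual convention and the function names are mine, not from any of the papers.

```python
# Sketch of the standard trajectory metrics (assumed common definitions).
import torch

def ade_fde(pred, gt):
    # pred, gt: (B, T, 2) predicted and ground-truth future positions.
    dist = (pred - gt).norm(dim=-1)                  # (B, T) per-step L2 error
    return dist.mean().item(), dist[:, -1].mean().item()

def best_of_k_ade_fde(preds, gt):
    # preds: (K, B, T, 2) sampled futures; keep, per track, the sample with
    # the lowest ADE (the usual multi-modal, best-of-K evaluation).
    dist = (preds - gt.unsqueeze(0)).norm(dim=-1)    # (K, B, T)
    best = dist.mean(dim=-1).argmin(dim=0)           # (B,) index of best sample
    idx = best[None, :, None, None].expand(1, *preds.shape[1:])
    return ade_fde(preds.gather(0, idx).squeeze(0), gt)
```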
This list is automatically generated from the titles and abstracts of the papers on this site.