Is attention to bounding boxes all you need for pedestrian action
prediction?
- URL: http://arxiv.org/abs/2107.08031v1
- Date: Fri, 16 Jul 2021 17:47:32 GMT
- Title: Is attention to bounding boxes all you need for pedestrian action
prediction?
- Authors: Lina Achaji, Julien Moreau, Thibault Fouqueray, Francois Aioun,
Francois Charpillet
- Abstract summary: We present a framework based on multiple variations of the Transformer models to reason attentively about the dynamic evolution of the pedestrians' past trajectory.
We prove that using only bounding boxes as input to our model can outperform the previous state-of-the-art models.
Our model has similarly reached high accuracy (91 and F1-score (0.91) on this dataset.
- Score: 1.3999481573773074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The human driver is no longer the only one concerned with the complexity of
the driving scenarios. Autonomous vehicles (AV) are similarly becoming involved
in the process. Nowadays, the development of AV in urban places underpins
essential safety concerns for vulnerable road users (VRUs) such as pedestrians.
Therefore, to make the roads safer, it is critical to classify and predict
their future behavior. In this paper, we present a framework based on multiple
variations of the Transformer models to reason attentively about the dynamic
evolution of the pedestrians' past trajectory and predict its future actions of
crossing or not crossing the street. We proved that using only bounding boxes
as input to our model can outperform the previous state-of-the-art models and
reach a prediction accuracy of 91 % and an F1-score of 0.83 on the PIE dataset
up to two seconds ahead in the future. In addition, we introduced a large-size
simulated dataset (CP2A) using CARLA for action prediction. Our model has
similarly reached high accuracy (91 %) and F1-score (0.91) on this dataset.
Interestingly, we showed that pre-training our Transformer model on the
simulated dataset and then fine-tuning it on the real dataset can be very
effective for the action prediction task.
Related papers
- Planning with Adaptive World Models for Autonomous Driving [50.4439896514353]
Motion planners (MPs) are crucial for safe navigation in complex urban environments.
nuPlan, a recently released MP benchmark, addresses this limitation by augmenting real-world driving logs with closed-loop simulation logic.
We present AdaptiveDriver, a model-predictive control (MPC) based planner that unrolls different world models conditioned on BehaviorNet's predictions.
arXiv Detail & Related papers (2024-06-15T18:53:45Z) - Pre-training on Synthetic Driving Data for Trajectory Prediction [61.520225216107306]
We propose a pipeline-level solution to mitigate the issue of data scarcity in trajectory forecasting.
We adopt HD map augmentation and trajectory synthesis for generating driving data, and then we learn representations by pre-training on them.
We conduct extensive experiments to demonstrate the effectiveness of our data expansion and pre-training strategies.
arXiv Detail & Related papers (2023-09-18T19:49:22Z) - Learning Pedestrian Actions to Ensure Safe Autonomous Driving [12.440017892152417]
It is critical for Autonomous Vehicles to have the ability to predict pedestrians' short-term and immediate actions in real-time.
In this work, a novel multi-task sequence to sequence Transformer encoders-decoders (TF-ed) architecture is proposed for pedestrian action and trajectory prediction.
The proposed approach is compared against an existing LSTM encoders decoders (LSTM-ed) architecture for action and trajectory prediction.
arXiv Detail & Related papers (2023-05-22T14:03:38Z) - AdvDO: Realistic Adversarial Attacks for Trajectory Prediction [87.96767885419423]
Trajectory prediction is essential for autonomous vehicles to plan correct and safe driving behaviors.
We devise an optimization-based adversarial attack framework to generate realistic adversarial trajectories.
Our attack can lead an AV to drive off road or collide into other vehicles in simulation.
arXiv Detail & Related papers (2022-09-19T03:34:59Z) - Transforming Model Prediction for Tracking [109.08417327309937]
Transformers capture global relations with little inductive bias, allowing it to learn the prediction of more powerful target models.
We train the proposed tracker end-to-end and validate its performance by conducting comprehensive experiments on multiple tracking datasets.
Our tracker sets a new state of the art on three benchmarks, achieving an AUC of 68.5% on the challenging LaSOT dataset.
arXiv Detail & Related papers (2022-03-21T17:59:40Z) - PreTR: Spatio-Temporal Non-Autoregressive Trajectory Prediction
Transformer [0.9786690381850356]
We introduce a model called PRediction Transformer (PReTR) that extracts features from the multi-agent scenes by employing a factorized-temporal attention module.
It shows less computational needs than previously studied models with empirically better results.
We leverage encoder-decoder Transformer networks for parallel decoding a set of learned object queries.
arXiv Detail & Related papers (2022-03-17T12:52:23Z) - Pedestrian Trajectory Prediction via Spatial Interaction Transformer
Network [7.150832716115448]
In traffic scenes, when encountering with oncoming people, pedestrians may make sudden turns or stop immediately.
To predict such unpredictable trajectories, we can gain insights into the interaction between pedestrians.
We present a novel generative method named Spatial Interaction Transformer (SIT), which learns the correlation of pedestrian trajectories through attention mechanisms.
arXiv Detail & Related papers (2021-12-13T13:08:04Z) - Injecting Knowledge in Data-driven Vehicle Trajectory Predictors [82.91398970736391]
Vehicle trajectory prediction tasks have been commonly tackled from two perspectives: knowledge-driven or data-driven.
In this paper, we propose to learn a "Realistic Residual Block" (RRB) which effectively connects these two perspectives.
Our proposed method outputs realistic predictions by confining the residual range and taking into account its uncertainty.
arXiv Detail & Related papers (2021-03-08T16:03:09Z) - PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction
in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z) - Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a framework on graph to uncover relationships in different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.