TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions
and U-GRUs for skeletal pedestrian crossing prediction
- URL: http://arxiv.org/abs/2109.00953v1
- Date: Thu, 2 Sep 2021 13:54:02 GMT
- Title: TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions
and U-GRUs for skeletal pedestrian crossing prediction
- Authors: Joseph Gesnouin, Steve Pechberti, Bogdan Stanciulescu and Fabien
Moutarde
- Abstract summary: We address pedestrian crossing prediction in urban traffic environments by linking the dynamics of a pedestrian's skeleton to a binary crossing intention.
We introduce TrouSPI-Net: a context-free, lightweight, multi-branch predictor.
We evaluate TrouSPI-Net and analyze its performance.
- Score: 1.911678487931003
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the behaviors and intentions of pedestrians is still one of the
main challenges for vehicle autonomy, as accurate predictions of their
intentions can guarantee their safety and the driving comfort of vehicles. In this
paper, we address pedestrian crossing prediction in urban traffic environments
by linking the dynamics of a pedestrian's skeleton to a binary crossing
intention. We introduce TrouSPI-Net: a context-free, lightweight, multi-branch
predictor. TrouSPI-Net extracts spatio-temporal features for different time
resolutions by encoding pseudo-image sequences of skeletal joints' positions
and processes them with parallel attention modules and atrous convolutions. The
proposed approach is then enhanced by processing features such as relative
distances of skeletal joints, bounding box positions, or ego-vehicle speed with
U-GRUs. Using the newly proposed evaluation procedures for two large public
naturalistic data sets for studying pedestrian behavior in traffic: JAAD and
PIE, we evaluate TrouSPI-Net and analyze its performance. Experimental results
show that TrouSPI-Net achieved an F1 score of 0.76 on JAAD and 0.80 on PIE,
outperforming the current state of the art while remaining lightweight and
context-free.
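The abstract only describes the architecture in words. As a rough illustration (not the authors' code), the PyTorch sketch below shows one way the described ingredients could be wired together: joint coordinates are stacked into a 2-channel pseudo-image over time and joints, fed through parallel atrous (dilated) convolutions each followed by a simple attention module, while a recurrent branch processes per-frame 1D features (relative joint distances, bounding box position, ego-vehicle speed) before both streams are fused into a binary crossing logit. All class names, dilation rates, layer widths, the squeeze-and-excitation-style attention, and the plain bidirectional GRU standing in for the paper's U-GRUs are assumptions made for this sketch.

```python
# Hedged sketch of a TrouSPI-Net-style model (illustrative only, not the authors' code).
# Assumptions: 2D poses with J joints over T frames become a 2-channel (x, y) pseudo-image
# of size T x J; the paper's U-GRU is approximated by a plain bidirectional GRU.
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style attention (stand-in for the paper's attention modules)."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                       # x: (B, C, T, J)
        w = self.fc(x.mean(dim=(2, 3)))         # global average pool -> (B, C)
        return x * w[:, :, None, None]


class AtrousBranch(nn.Module):
    """One parallel branch: dilated convolution over the pose pseudo-image + attention."""
    def __init__(self, in_ch: int, out_ch: int, dilation: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=dilation, dilation=dilation),
            nn.BatchNorm2d(out_ch), nn.ReLU(),
        )
        self.att = ChannelAttention(out_ch)

    def forward(self, x):                       # x: (B, 2, T, J)
        return self.att(self.conv(x)).mean(dim=(2, 3))   # pooled to (B, out_ch)


class TrouSPINetSketch(nn.Module):
    def __init__(self, n_joints: int = 17, aux_dim: int = 5, hidden: int = 64):
        super().__init__()
        # Parallel atrous convolutions: each dilation sees a different time resolution.
        self.branches = nn.ModuleList([AtrousBranch(2, hidden, d) for d in (1, 2, 4)])
        # Stand-in for the U-GRU branch on 1D features
        # (relative joint distances, bounding box position, ego-vehicle speed).
        self.gru = nn.GRU(aux_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Sequential(
            nn.Linear(hidden * len(self.branches) + 2 * hidden, hidden),
            nn.ReLU(), nn.Linear(hidden, 1),
        )

    def forward(self, poses, aux):
        # poses: (B, T, J, 2) joint coordinates; aux: (B, T, aux_dim) per-frame features.
        img = poses.permute(0, 3, 1, 2)                        # pseudo-image (B, 2, T, J)
        spatial = torch.cat([b(img) for b in self.branches], dim=1)
        _, h = self.gru(aux)                                   # h: (2, B, hidden)
        temporal = torch.cat([h[0], h[1]], dim=1)              # (B, 2*hidden)
        return self.head(torch.cat([spatial, temporal], dim=1))  # crossing logit


if __name__ == "__main__":
    model = TrouSPINetSketch()
    logit = model(torch.randn(4, 32, 17, 2), torch.randn(4, 32, 5))
    print(torch.sigmoid(logit).shape)           # (4, 1) crossing probabilities
```

The parallel dilation rates give each branch a different effective temporal receptive field, which is the intuition behind the "different time resolutions" mentioned in the abstract.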
Related papers
- Interactive Autonomous Navigation with Internal State Inference and Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- Learning Pedestrian Actions to Ensure Safe Autonomous Driving [12.440017892152417]
It is critical for Autonomous Vehicles to have the ability to predict pedestrians' short-term and immediate actions in real-time.
In this work, a novel multi-task sequence-to-sequence Transformer encoder-decoder (TF-ed) architecture is proposed for pedestrian action and trajectory prediction.
The proposed approach is compared against an existing LSTM encoder-decoder (LSTM-ed) architecture for action and trajectory prediction.
arXiv Detail & Related papers (2023-05-22T14:03:38Z)
- Local and Global Contextual Features Fusion for Pedestrian Intention Prediction [2.203209457340481]
We analyse visual features of both pedestrian and traffic contexts.
To understand the global context, we utilise location, motion, and environmental information.
These multi-modality features are intelligently fused for effective intention learning.
arXiv Detail & Related papers (2023-05-01T22:37:31Z)
- Pedestrian Stop and Go Forecasting with Hybrid Feature Fusion [87.77727495366702]
We introduce the new task of pedestrian stop and go forecasting.
Considering the lack of suitable existing datasets for it, we release TRANS, a benchmark for explicitly studying the stop and go behaviors of pedestrians in urban traffic.
We build it from several existing datasets annotated with pedestrians' walking motions, in order to have various scenarios and behaviors.
arXiv Detail & Related papers (2022-03-04T18:39:31Z)
- Pedestrian Trajectory Prediction via Spatial Interaction Transformer Network [7.150832716115448]
In traffic scenes, when encountering oncoming people, pedestrians may make sudden turns or stop immediately.
Predicting such unpredictable trajectories requires insight into the interactions between pedestrians.
We present a novel generative method named Spatial Interaction Transformer (SIT), which learns the correlation of pedestrian trajectories through attention mechanisms.
arXiv Detail & Related papers (2021-12-13T13:08:04Z)
- Real Time Monocular Vehicle Velocity Estimation using Synthetic Data [78.85123603488664]
We look at the problem of estimating the velocity of road vehicles from a camera mounted on a moving car.
We propose a two-step approach where first an off-the-shelf tracker is used to extract vehicle bounding boxes and then a small neural network is used to regress the vehicle velocity.
arXiv Detail & Related papers (2021-09-16T13:10:27Z)
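The two-step pipeline summarized in the entry above (an off-the-shelf tracker provides vehicle bounding boxes, then a small network regresses the velocity) can be illustrated with a minimal sketch. The GRU over box sequences, the (x, y, w, h) box encoding, and the scalar velocity output are illustrative assumptions, not the authors' design.

```python
# Hedged sketch of the two-step idea from the velocity-estimation entry above (not the
# authors' code): boxes from any off-the-shelf tracker are assumed given; a small
# network regresses a scalar velocity from the box sequence.
import torch
import torch.nn as nn


class BoxVelocityRegressor(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.gru = nn.GRU(4, hidden, batch_first=True)    # per-frame (x, y, w, h) boxes
        self.head = nn.Linear(hidden, 1)                   # scalar velocity (e.g. m/s)

    def forward(self, boxes):                              # boxes: (B, T, 4)
        _, h = self.gru(boxes)
        return self.head(h[-1])                            # (B, 1)


if __name__ == "__main__":
    tracked_boxes = torch.rand(2, 20, 4)                   # stand-in for tracker output
    print(BoxVelocityRegressor()(tracked_boxes).shape)     # (2, 1)
```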
- Predicting Pedestrian Crossing Intention with Feature Fusion and Spatio-Temporal Attention [0.0]
Pedestrian crossing intention should be recognized in real-time for urban driving.
Recent works have shown the potential of using vision-based deep neural network models for this task.
This work introduces a neural network architecture to fuse inherently different spatio-temporal features for pedestrian crossing intention prediction.
arXiv Detail & Related papers (2021-04-12T14:10:25Z)
- Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting [91.69900691029908]
We advocate for predicting both the individual motions as well as the scene occupancy map.
We propose a Scene-Actor Graph Neural Network (SA-GNN) which preserves the relative spatial information of pedestrians.
On two large-scale real-world datasets, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods.
arXiv Detail & Related papers (2021-01-07T06:08:21Z)
- PnPNet: End-to-End Perception and Prediction with Tracking in the Loop [82.97006521937101]
We tackle the problem of joint perception and motion forecasting in the context of self-driving vehicles.
We propose PnPNet, an end-to-end model that takes sensor data as input and outputs, at each time step, object tracks and their future trajectories.
arXiv Detail & Related papers (2020-05-29T17:57:25Z)
- FuSSI-Net: Fusion of Spatio-temporal Skeletons for Intention Prediction Network [3.5581822321535785]
We develop an end-to-end pedestrian intention framework that performs well on day- and night-time scenarios.
Our framework relies on object detection bounding boxes combined with skeletal features of human pose.
arXiv Detail & Related papers (2020-05-15T21:52:42Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework to uncover relationships among different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)