Predicting Pedestrian Crossing Intention with Feature Fusion and
Spatio-Temporal Attention
- URL: http://arxiv.org/abs/2104.05485v1
- Date: Mon, 12 Apr 2021 14:10:25 GMT
- Title: Predicting Pedestrian Crossing Intention with Feature Fusion and
Spatio-Temporal Attention
- Authors: Dongfang Yang, Haolin Zhang, Ekim Yurtsever, Keith Redmill, Ümit Özgüner
- Abstract summary: Pedestrian crossing intention should be recognized in real-time for urban driving.
Recent works have shown the potential of using vision-based deep neural network models for this task.
This work introduces a novel neural network architecture to fuse inherently different spatio-temporal features for pedestrian crossing intention prediction.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predicting vulnerable road user behavior is an essential prerequisite for
deploying Automated Driving Systems (ADS) in the real-world. Pedestrian
crossing intention should be recognized in real-time, especially for urban
driving. Recent works have shown the potential of using vision-based deep
neural network models for this task. However, these models are not robust and
certain issues still need to be resolved. First, the global spatio-temporal
context that accounts for the interaction between the target pedestrian and the
scene has not been properly utilized. Second, the optimum strategy for fusing
different sensor data has not been thoroughly investigated. This work addresses
the above limitations by introducing a novel neural network architecture to
fuse inherently different spatio-temporal features for pedestrian crossing
intention prediction. We fuse different phenomena such as sequences of RGB
imagery, semantic segmentation masks, and ego-vehicle speed in an optimum way
using attention mechanisms and a stack of recurrent neural networks. The
optimum architecture was obtained through exhaustive ablation and comparison
studies. Extensive comparative experiments on the JAAD pedestrian action
prediction benchmark demonstrate the effectiveness of the proposed method,
where state-of-the-art performance was achieved. Our code is open-source and
publicly available.
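As a rough illustration of the fusion idea described in the abstract, the sketch below summarizes each input sequence with temporal attention and concatenates the results. It is not the authors' implementation: the abstract describes learned attention mechanisms and a stack of recurrent neural networks, whereas this sketch assumes a parameter-free dot-product attention and plain concatenation. The function names (`temporal_attention`, `fuse`) and the toy feature values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(sequence):
    """Summarize a sequence of feature vectors as one context vector.

    Assumption: each time step is scored by its dot product with the
    last time step. A trained model would use learned projections.
    """
    query = sequence[-1]
    scores = [sum(q * h for q, h in zip(query, step)) for step in sequence]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * step[i] for w, step in zip(weights, sequence))
        for i in range(dim)
    ]
    return context, weights


def fuse(modalities):
    """Concatenate the attended summary of every modality."""
    fused = []
    for sequence in modalities.values():
        context, _ = temporal_attention(sequence)
        fused.extend(context)
    return fused


# Hypothetical per-frame features for three time steps.
features = {
    "rgb": [[0.1, 0.2], [0.3, 0.1], [0.5, 0.4]],
    "segmentation": [[1.0, 0.0], [0.8, 0.2], [0.6, 0.4]],
    "speed": [[0.9], [0.7], [0.4]],
}
fused = fuse(features)
```

In a full model, the fused vector would feed a classifier that outputs the crossing / not-crossing decision; the order and manner in which modalities are combined is the kind of design choice the abstract says was settled through ablation studies.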
Related papers
- Interactive Autonomous Navigation with Internal State Inference and
Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in
Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory forecasting of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
- Learning Pedestrian Actions to Ensure Safe Autonomous Driving [12.440017892152417]
It is critical for Autonomous Vehicles to have the ability to predict pedestrians' short-term and immediate actions in real-time.
In this work, a novel multi-task sequence to sequence Transformer encoders-decoders (TF-ed) architecture is proposed for pedestrian action and trajectory prediction.
The proposed approach is compared against an existing LSTM encoders-decoders (LSTM-ed) architecture for action and trajectory prediction.
arXiv Detail & Related papers (2023-05-22T14:03:38Z)
- Pedestrian Trajectory Prediction via Spatial Interaction Transformer
Network [7.150832716115448]
In traffic scenes, when encountering oncoming people, pedestrians may make sudden turns or stop immediately.
To predict such unpredictable trajectories, we can gain insights into the interaction between pedestrians.
We present a novel generative method named Spatial Interaction Transformer (SIT), which learns the correlation of pedestrian trajectories through attention mechanisms.
arXiv Detail & Related papers (2021-12-13T13:08:04Z)
- TrouSPI-Net: Spatio-temporal attention on parallel atrous convolutions
and U-GRUs for skeletal pedestrian crossing prediction [1.911678487931003]
We address pedestrian crossing prediction in urban traffic environments by linking the dynamics of a pedestrian's skeleton to a binary crossing intention.
We introduce TrouSPI-Net: a context-free, lightweight predictor.
We evaluate TrouSPI-Net and analyze its performance.
arXiv Detail & Related papers (2021-09-02T13:54:02Z)
- SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory
Prediction [64.16212996247943]
We present a Sparse Graph Convolution Network (SGCN) for pedestrian trajectory prediction.
Specifically, the SGCN explicitly models the sparse directed interaction with a sparse directed spatial graph to capture adaptive interactions between pedestrians.
Visualizations indicate that our method can capture adaptive interactions between pedestrians and their effective motion tendencies.
arXiv Detail & Related papers (2021-04-04T03:17:42Z)
- Spatio-Temporal Look-Ahead Trajectory Prediction using Memory Neural
Network [6.065344547161387]
This paper attempts to solve the problem of Spatio-temporal look-ahead trajectory prediction using a novel recurrent neural network called the Memory Neuron Network.
The proposed model is computationally less intensive and has a simple architecture as compared to other deep learning models that utilize LSTMs and GRUs.
arXiv Detail & Related papers (2021-02-24T05:02:19Z)
- IntentNet: Learning to Predict Intention from Raw Sensor Data [86.74403297781039]
In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor and dynamic maps of the environment.
Our multi-task model achieves better accuracy than the respective separate modules while saving computation, which is critical to reducing reaction time in self-driving applications.
arXiv Detail & Related papers (2021-01-20T00:31:52Z)
- Pedestrian Action Anticipation using Contextual Feature Fusion in
Stacked RNNs [19.13270454742958]
We propose a solution for the problem of pedestrian action anticipation at the point of crossing.
Our approach uses a novel stacked RNN architecture in which information collected from various sources, both scene dynamics and visual features, is gradually fused into the network.
arXiv Detail & Related papers (2020-05-13T20:59:37Z)
- A Spatial-Temporal Attentive Network with Spatial Continuity for
Trajectory Prediction [74.00750936752418]
We propose a novel model named spatial-temporal attentive network with spatial continuity (STAN-SC).
First, a spatial-temporal attention mechanism is presented to explore the most useful and important information.
Second, we construct a joint feature sequence based on the sequence and instant state information so that the generated trajectories keep spatial continuity.
arXiv Detail & Related papers (2020-03-13T04:35:50Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework to uncover relationships between different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.