Context-empowered Visual Attention Prediction in Pedestrian Scenarios
- URL: http://arxiv.org/abs/2210.16933v1
- Date: Sun, 30 Oct 2022 19:38:17 GMT
- Title: Context-empowered Visual Attention Prediction in Pedestrian Scenarios
- Authors: Igor Vozniak, Philipp Mueller, Lorena Hell, Nils Lipp, Ahmed
Abouelazm, Christian Mueller
- Abstract summary: We present Context-SalNET, a novel encoder-decoder architecture that addresses three key challenges of visual attention prediction in pedestrians.
First, Context-SalNET explicitly models the context factors urgency and safety preference in the latent space of the encoder-decoder model.
Second, we propose the exponentially weighted mean squared error loss (ew-MSE) that is able to better cope with the fact that only a small part of the ground truth saliency maps consist of non-zero entries.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Effective and flexible allocation of visual attention is key for pedestrians
who have to navigate to a desired goal under different conditions of urgency
and safety preferences. While automatic modelling of pedestrian attention holds
great promise to improve simulations of pedestrian behavior, current saliency
prediction approaches mostly focus on generic free-viewing scenarios and do not
reflect the specific challenges present in pedestrian attention prediction. In
this paper, we present Context-SalNET, a novel encoder-decoder architecture
that explicitly addresses three key challenges of visual attention prediction
in pedestrians: First, Context-SalNET explicitly models the context factors
urgency and safety preference in the latent space of the encoder-decoder model.
Second, we propose the exponentially weighted mean squared error loss (ew-MSE)
that is able to better cope with the fact that only a small part of the ground
truth saliency maps consist of non-zero entries. Third, we explicitly model
epistemic uncertainty to account for the fact that training data for pedestrian
attention prediction is limited. To evaluate Context-SalNET, we recorded the
first dataset of pedestrian visual attention in VR that includes explicit
variation of the context factors urgency and safety preference. Context-SalNET
achieves clear improvements over state-of-the-art saliency prediction
approaches as well as over ablations. Our novel dataset will be made fully
available and can serve as a valuable resource for further research on
pedestrian attention prediction.
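The abstract names the ew-MSE loss but does not give its exact formulation. Below is a minimal PyTorch sketch of one plausible reading, assuming per-pixel weights that grow exponentially with the ground-truth saliency so that the sparse non-zero regions dominate the loss; the weighting function and the alpha parameter are assumptions, not taken from the paper, and the epistemic-uncertainty component (e.g., Monte Carlo dropout) is omitted here.
```python
# Minimal sketch of an exponentially weighted MSE (ew-MSE) loss as motivated in
# the abstract. The exact formulation is not given there, so this assumes
# per-pixel weights that grow exponentially with the ground-truth saliency,
# emphasizing the sparse non-zero regions of the target maps.
import torch

def ew_mse(pred: torch.Tensor, target: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """pred, target: saliency maps in [0, 1], shape (B, 1, H, W).
    alpha is a hypothetical sharpness parameter, not taken from the paper."""
    weights = torch.exp(alpha * target)        # large where ground-truth saliency is non-zero
    weights = weights / weights.mean()         # keep the loss scale comparable to plain MSE
    return (weights * (pred - target) ** 2).mean()

# Usage example with random tensors standing in for predicted and ground-truth maps.
if __name__ == "__main__":
    pred = torch.rand(2, 1, 64, 64)
    target = torch.zeros(2, 1, 64, 64)
    target[:, :, 28:36, 28:36] = 1.0           # sparse non-zero region, as in fixation maps
    print(ew_mse(pred, target).item())
```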
Related papers
- Feature Importance in Pedestrian Intention Prediction: A Context-Aware Review [9.475536008455133]
Recent advancements in predicting pedestrian crossing intentions for Autonomous Vehicles using Computer Vision and Deep Neural Networks are promising.
We introduce Context-aware Permutation Feature Importance (CAPFI), a novel approach tailored for pedestrian intention prediction.
CAPFI enables more interpretability and reliable assessments of feature importance by leveraging subdivided scenario contexts.
arXiv Detail & Related papers (2024-09-11T22:13:01Z)
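Permutation feature importance is a standard technique; the summary's key point is that CAPFI evaluates it separately within subdivided scenario contexts. The following is a hedged sketch of that idea with illustrative names (`capfi`, `score_fn`, a scikit-learn-style `model.predict`) that are not taken from the paper.
```python
# Hedged sketch of context-aware permutation feature importance: importance is
# computed per scenario context rather than over the pooled test set.
import numpy as np

def capfi(model, X: np.ndarray, y: np.ndarray, contexts: np.ndarray,
          score_fn, n_repeats: int = 10, rng=None):
    """X: (N, D) features, y: (N,) labels, contexts: (N,) context id per sample.
    Returns {context_id: per-feature importance scores (mean score drop)}."""
    rng = rng or np.random.default_rng(0)
    results = {}
    for c in np.unique(contexts):
        idx = contexts == c
        Xc, yc = X[idx], y[idx]
        base = score_fn(yc, model.predict(Xc))          # baseline score within this context
        drops = np.zeros(X.shape[1])
        for d in range(X.shape[1]):
            for _ in range(n_repeats):
                Xp = Xc.copy()
                perm = rng.permutation(Xp.shape[0])
                Xp[:, d] = Xp[perm, d]                  # break the link between feature d and y
                drops[d] += base - score_fn(yc, model.predict(Xp))
        results[c] = drops / n_repeats
    return results
```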
- OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising [49.86409475232849]
Trajectory prediction is fundamental in computer vision and autonomous driving.
Existing approaches in this field often assume precise and complete observational data.
We present a novel method for out-of-sight trajectory prediction that leverages a vision-positioning technique.
arXiv Detail & Related papers (2024-04-02T18:30:29Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
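A minimal sketch of the implicit idea described above: a single network is queried at continuous (x, y, t) points and returns occupancy and flow, instead of predicting a dense grid. The layer sizes and scene encoding below are placeholders, not the paper's architecture.
```python
# Hedged sketch of an implicit occupancy-flow query network: an MLP maps a scene
# encoding plus a continuous spatio-temporal query point to occupancy and flow.
import torch
import torch.nn as nn

class OccupancyFlowField(nn.Module):
    def __init__(self, scene_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(scene_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),            # 1 occupancy logit + 2 flow components
        )

    def forward(self, scene_feat: torch.Tensor, query: torch.Tensor):
        """scene_feat: (B, scene_dim) encoding of sensor data; query: (B, Q, 3) points (x, y, t)."""
        feat = scene_feat.unsqueeze(1).expand(-1, query.shape[1], -1)
        out = self.mlp(torch.cat([feat, query], dim=-1))
        occupancy = torch.sigmoid(out[..., :1])   # probability the point is occupied at time t
        flow = out[..., 1:]                       # flow vector at that point
        return occupancy, flow

# Query the field at arbitrary continuous points instead of a dense grid.
field = OccupancyFlowField()
occ, flow = field(torch.randn(2, 128), torch.rand(2, 100, 3))
```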
- Interpretable Self-Aware Neural Networks for Robust Trajectory Prediction [50.79827516897913]
We introduce an interpretable paradigm for trajectory prediction that distributes the uncertainty among semantic concepts.
We validate our approach on real-world autonomous driving data, demonstrating superior performance over state-of-the-art baselines.
arXiv Detail & Related papers (2022-11-16T06:28:20Z)
- Pedestrian 3D Bounding Box Prediction [83.7135926821794]
We focus on 3D bounding boxes, which are reasonable estimates of humans without modeling complex motion details for autonomous vehicles.
We suggest this new problem and present a simple yet effective model for pedestrians' 3D bounding box prediction.
This method follows an encoder-decoder architecture based on recurrent neural networks.
arXiv Detail & Related papers (2022-06-28T17:59:45Z)
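The summary above states that the model follows an RNN-based encoder-decoder. Below is a hedged sketch of such a design, assuming a simple GRU encoder and decoder with a residual box update; the box parameterization and sizes are illustrative, not the paper's.
```python
# Hedged sketch of an RNN encoder-decoder for pedestrian 3D bounding box
# prediction: a GRU encodes observed boxes, a GRU cell rolls out future ones.
import torch
import torch.nn as nn

class BBoxSeq2Seq(nn.Module):
    def __init__(self, box_dim: int = 10, hidden: int = 128):
        super().__init__()
        self.encoder = nn.GRU(box_dim, hidden, batch_first=True)
        self.decoder = nn.GRUCell(box_dim, hidden)
        self.head = nn.Linear(hidden, box_dim)

    def forward(self, observed: torch.Tensor, horizon: int):
        """observed: (B, T_obs, box_dim) past 3D boxes; returns (B, horizon, box_dim)."""
        _, h = self.encoder(observed)
        h = h.squeeze(0)
        box = observed[:, -1]                  # start decoding from the last observed box
        preds = []
        for _ in range(horizon):
            h = self.decoder(box, h)
            box = box + self.head(h)           # predict a residual update to the previous box
            preds.append(box)
        return torch.stack(preds, dim=1)

model = BBoxSeq2Seq()
future = model(torch.randn(4, 8, 10), horizon=12)   # 8 observed steps -> 12 predicted steps
```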
- Predicting Pedestrian Crossing Intention with Feature Fusion and Spatio-Temporal Attention [0.0]
Pedestrian crossing intention should be recognized in real-time for urban driving.
Recent works have shown the potential of using vision-based deep neural network models for this task.
This work introduces a neural network architecture to fuse inherently different spatio-temporal features for pedestrian crossing intention prediction.
arXiv Detail & Related papers (2021-04-12T14:10:25Z)
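A rough sketch of fusing heterogeneous feature streams with temporal attention for crossing-intention prediction; the choice of modalities, dimensions, and additive fusion below are assumptions for illustration, not the paper's exact architecture.
```python
# Hedged sketch: project each modality to a shared space, fuse additively, and
# apply attention over time steps before classifying crossing intention.
import torch
import torch.nn as nn

class IntentionFusion(nn.Module):
    def __init__(self, dims=(4, 36, 1), hidden: int = 64):
        super().__init__()
        # One projection per modality, e.g. bounding boxes, 2D pose, ego-vehicle speed.
        self.proj = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.cls = nn.Linear(hidden, 1)

    def forward(self, streams):
        """streams: list of (B, T, d_i) tensors, one per modality, same T."""
        fused = sum(p(s) for p, s in zip(self.proj, streams))   # simple additive fusion
        attended, _ = self.attn(fused, fused, fused)             # attention over time steps
        return torch.sigmoid(self.cls(attended[:, -1]))          # P(crossing) from last step

model = IntentionFusion()
p_cross = model([torch.randn(2, 16, 4), torch.randn(2, 16, 36), torch.randn(2, 16, 1)])
```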
- Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting [91.69900691029908]
We advocate for predicting both the individual motions and the scene occupancy map.
We propose a Scene-Actor Graph Neural Network (SA-GNN) which preserves the relative spatial information of pedestrians.
On two large-scale real-world datasets, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods.
arXiv Detail & Related papers (2021-01-07T06:08:21Z)
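A hedged sketch of message passing over a scene-actor graph that keeps relative spatial information, in the spirit of the SA-GNN description above; the names, dimensions, and fully connected graph are illustrative, not the paper's.
```python
# Hedged sketch: each pedestrian node aggregates neighbor features together with
# the relative offset between their positions, then updates its state.
import torch
import torch.nn as nn

class RelativeMessagePassing(nn.Module):
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * feat_dim + 2, feat_dim), nn.ReLU())
        self.upd = nn.GRUCell(feat_dim, feat_dim)

    def forward(self, feats: torch.Tensor, pos: torch.Tensor):
        """feats: (N, feat_dim) per-actor features; pos: (N, 2) positions in metres."""
        n = feats.shape[0]
        src, dst = torch.meshgrid(torch.arange(n), torch.arange(n), indexing="ij")
        src, dst = src.flatten(), dst.flatten()
        rel = pos[src] - pos[dst]                            # relative offsets, kept explicitly
        m = self.msg(torch.cat([feats[src], feats[dst], rel], dim=-1))
        agg = torch.zeros_like(feats).index_add_(0, dst, m)  # sum messages into receiving nodes
        return self.upd(agg, feats)                          # GRU-style node update

layer = RelativeMessagePassing()
updated = layer(torch.randn(5, 64), torch.rand(5, 2) * 20.0)
```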
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- A Real-Time Predictive Pedestrian Collision Warning Service for Cooperative Intelligent Transportation Systems Using 3D Pose Estimation [10.652350454373531]
We propose a real-time predictive pedestrian collision warning service (P2CWS) for two tasks: pedestrian orientation recognition (100.53 FPS) and intention prediction (35.76 FPS).
Our framework achieves satisfactory generalization across multiple sites thanks to the proposed site-independent features.
The proposed vision framework realizes 89.3% accuracy in the behavior recognition task on the TUD dataset without any training process.
arXiv Detail & Related papers (2020-09-23T00:55:12Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework to uncover relationships between different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.