Context-empowered Visual Attention Prediction in Pedestrian Scenarios
- URL: http://arxiv.org/abs/2210.16933v1
- Date: Sun, 30 Oct 2022 19:38:17 GMT
- Title: Context-empowered Visual Attention Prediction in Pedestrian Scenarios
- Authors: Igor Vozniak, Philipp Mueller, Lorena Hell, Nils Lipp, Ahmed
Abouelazm, Christian Mueller
- Abstract summary: We present Context-SalNET, a novel encoder-decoder architecture that addresses three key challenges of visual attention prediction in pedestrians.
First, Context-SalNET explicitly models the context factors urgency and safety preference in the latent space of the encoder-decoder model.
Second, we propose the exponentially weighted mean squared error loss (ew-MSE) that is able to better cope with the fact that only a small part of the ground truth saliency maps consist of non-zero entries.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Effective and flexible allocation of visual attention is key for pedestrians
who have to navigate to a desired goal under different conditions of urgency
and safety preferences. While automatic modelling of pedestrian attention holds
great promise to improve simulations of pedestrian behavior, current saliency
prediction approaches mostly focus on generic free-viewing scenarios and do not
reflect the specific challenges present in pedestrian attention prediction. In
this paper, we present Context-SalNET, a novel encoder-decoder architecture
that explicitly addresses three key challenges of visual attention prediction
in pedestrians: First, Context-SalNET explicitly models the context factors
urgency and safety preference in the latent space of the encoder-decoder model.
Second, we propose the exponentially weighted mean squared error loss (ew-MSE)
that is able to better cope with the fact that only a small part of the ground
truth saliency maps consist of non-zero entries. Third, we explicitly model
epistemic uncertainty to account for the fact that training data for pedestrian
attention prediction is limited. To evaluate Context-SalNET, we recorded the
first dataset of pedestrian visual attention in VR that includes explicit
variation of the context factors urgency and safety preference. Context-SalNET
achieves clear improvements over state-of-the-art saliency prediction
approaches as well as over ablations. Our novel dataset will be made fully
available and can serve as a valuable resource for further research on
pedestrian attention prediction.
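The abstract names the ew-MSE loss but does not give its exact formulation. Below is a minimal PyTorch sketch of one plausible reading, assuming per-pixel weights that grow exponentially with the ground-truth saliency so that the sparse non-zero regions dominate the loss; the weighting function and the alpha parameter are assumptions, not taken from the paper, and the epistemic-uncertainty component (e.g., Monte Carlo dropout) is omitted here.
```python
# Minimal sketch of an exponentially weighted MSE (ew-MSE) loss as motivated in
# the abstract. The exact formulation is not given there, so this assumes
# per-pixel weights that grow exponentially with the ground-truth saliency,
# emphasizing the sparse non-zero regions of the target maps.
import torch

def ew_mse(pred: torch.Tensor, target: torch.Tensor, alpha: float = 5.0) -> torch.Tensor:
    """pred, target: saliency maps in [0, 1], shape (B, 1, H, W).
    alpha is a hypothetical sharpness parameter, not taken from the paper."""
    weights = torch.exp(alpha * target)        # large where ground-truth saliency is non-zero
    weights = weights / weights.mean()         # keep the loss scale comparable to plain MSE
    return (weights * (pred - target) ** 2).mean()

# Usage example with random tensors standing in for predicted and ground-truth maps.
if __name__ == "__main__":
    pred = torch.rand(2, 1, 64, 64)
    target = torch.zeros(2, 1, 64, 64)
    target[:, :, 28:36, 28:36] = 1.0           # sparse non-zero region, as in fixation maps
    print(ew_mse(pred, target).item())
```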
Related papers
- Feature Importance in Pedestrian Intention Prediction: A Context-Aware Review [9.475536008455133]
Recent advancements in predicting pedestrian crossing intentions for Autonomous Vehicles using Computer Vision and Deep Neural Networks are promising.
We introduce Context-aware Permutation Feature Importance (CAPFI), a novel approach tailored for pedestrian intention prediction.
CAPFI enables more interpretability and reliable assessments of feature importance by leveraging subdivided scenario contexts.
arXiv Detail & Related papers (2024-09-11T22:13:01Z)
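Permutation feature importance is a standard technique; the summary's key point is that CAPFI evaluates it separately within subdivided scenario contexts. The following is a hedged sketch of that idea with illustrative names (`capfi`, `score_fn`, a scikit-learn-style `model.predict`) that are not taken from the paper.
```python
# Hedged sketch of context-aware permutation feature importance: importance is
# computed per scenario context rather than over the pooled test set.
import numpy as np

def capfi(model, X: np.ndarray, y: np.ndarray, contexts: np.ndarray,
          score_fn, n_repeats: int = 10, rng=None):
    """X: (N, D) features, y: (N,) labels, contexts: (N,) context id per sample.
    Returns {context_id: per-feature importance scores (mean score drop)}."""
    rng = rng or np.random.default_rng(0)
    results = {}
    for c in np.unique(contexts):
        idx = contexts == c
        Xc, yc = X[idx], y[idx]
        base = score_fn(yc, model.predict(Xc))          # baseline score within this context
        drops = np.zeros(X.shape[1])
        for d in range(X.shape[1]):
            for _ in range(n_repeats):
                Xp = Xc.copy()
                perm = rng.permutation(Xp.shape[0])
                Xp[:, d] = Xp[perm, d]                  # break the link between feature d and y
                drops[d] += base - score_fn(yc, model.predict(Xp))
        results[c] = drops / n_repeats
    return results
```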
- OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising [49.86409475232849]
Trajectory prediction is fundamental in computer vision and autonomous driving.
Existing approaches in this field often assume precise and complete observational data.
We present a novel method for out-of-sight trajectory prediction that leverages a vision-positioning technique.
arXiv Detail & Related papers (2024-04-02T18:30:29Z)
- Implicit Occupancy Flow Fields for Perception and Prediction in Self-Driving [68.95178518732965]
A self-driving vehicle (SDV) must be able to perceive its surroundings and predict the future behavior of other traffic participants.
Existing works either perform object detection followed by trajectory prediction of the detected objects, or predict dense occupancy and flow grids for the whole scene.
This motivates our unified approach to perception and future prediction that implicitly represents occupancy and flow over time with a single neural network.
arXiv Detail & Related papers (2023-08-02T23:39:24Z)
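A minimal sketch of the implicit idea described above: a single network is queried at continuous (x, y, t) points and returns occupancy and flow, instead of predicting a dense grid. The layer sizes and scene encoding below are placeholders, not the paper's architecture.
```python
# Hedged sketch of an implicit occupancy-flow query network: an MLP maps a scene
# encoding plus a continuous spatio-temporal query point to occupancy and flow.
import torch
import torch.nn as nn

class OccupancyFlowField(nn.Module):
    def __init__(self, scene_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(scene_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),            # 1 occupancy logit + 2 flow components
        )

    def forward(self, scene_feat: torch.Tensor, query: torch.Tensor):
        """scene_feat: (B, scene_dim) encoding of sensor data; query: (B, Q, 3) points (x, y, t)."""
        feat = scene_feat.unsqueeze(1).expand(-1, query.shape[1], -1)
        out = self.mlp(torch.cat([feat, query], dim=-1))
        occupancy = torch.sigmoid(out[..., :1])   # probability the point is occupied at time t
        flow = out[..., 1:]                       # flow vector at that point
        return occupancy, flow

# Query the field at arbitrary continuous points instead of a dense grid.
field = OccupancyFlowField()
occ, flow = field(torch.randn(2, 128), torch.rand(2, 100, 3))
```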
- Interpretable Self-Aware Neural Networks for Robust Trajectory Prediction [50.79827516897913]
We introduce an interpretable paradigm for trajectory prediction that distributes the uncertainty among semantic concepts.
We validate our approach on real-world autonomous driving data, demonstrating superior performance over state-of-the-art baselines.
arXiv Detail & Related papers (2022-11-16T06:28:20Z)
- Pedestrian 3D Bounding Box Prediction [83.7135926821794]
We focus on 3D bounding boxes, which are reasonable estimates of humans without modeling complex motion details for autonomous vehicles.
We suggest this new problem and present a simple yet effective model for pedestrians' 3D bounding box prediction.
This method follows an encoder-decoder architecture based on recurrent neural networks.
arXiv Detail & Related papers (2022-06-28T17:59:45Z)
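The summary above states that the model follows an RNN-based encoder-decoder. Below is a hedged sketch of such a design, assuming a simple GRU encoder and decoder with a residual box update; the box parameterization and sizes are illustrative, not the paper's.
```python
# Hedged sketch of an RNN encoder-decoder for pedestrian 3D bounding box
# prediction: a GRU encodes observed boxes, a GRU cell rolls out future ones.
import torch
import torch.nn as nn

class BBoxSeq2Seq(nn.Module):
    def __init__(self, box_dim: int = 10, hidden: int = 128):
        super().__init__()
        self.encoder = nn.GRU(box_dim, hidden, batch_first=True)
        self.decoder = nn.GRUCell(box_dim, hidden)
        self.head = nn.Linear(hidden, box_dim)

    def forward(self, observed: torch.Tensor, horizon: int):
        """observed: (B, T_obs, box_dim) past 3D boxes; returns (B, horizon, box_dim)."""
        _, h = self.encoder(observed)
        h = h.squeeze(0)
        box = observed[:, -1]                  # start decoding from the last observed box
        preds = []
        for _ in range(horizon):
            h = self.decoder(box, h)
            box = box + self.head(h)           # predict a residual update to the previous box
            preds.append(box)
        return torch.stack(preds, dim=1)

model = BBoxSeq2Seq()
future = model(torch.randn(4, 8, 10), horizon=12)   # 8 observed steps -> 12 predicted steps
```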
- Predicting Pedestrian Crossing Intention with Feature Fusion and Spatio-Temporal Attention [0.0]
Pedestrian crossing intention should be recognized in real-time for urban driving.
Recent works have shown the potential of using vision-based deep neural network models for this task.
This work introduces a neural network architecture to fuse inherently different spatio-temporal features for pedestrian crossing intention prediction.
arXiv Detail & Related papers (2021-04-12T14:10:25Z)
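A rough sketch of fusing heterogeneous feature streams with temporal attention for crossing-intention prediction; the choice of modalities, dimensions, and additive fusion below are assumptions for illustration, not the paper's exact architecture.
```python
# Hedged sketch: project each modality to a shared space, fuse additively, and
# apply attention over time steps before classifying crossing intention.
import torch
import torch.nn as nn

class IntentionFusion(nn.Module):
    def __init__(self, dims=(4, 36, 1), hidden: int = 64):
        super().__init__()
        # One projection per modality, e.g. bounding boxes, 2D pose, ego-vehicle speed.
        self.proj = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.cls = nn.Linear(hidden, 1)

    def forward(self, streams):
        """streams: list of (B, T, d_i) tensors, one per modality, same T."""
        fused = sum(p(s) for p, s in zip(self.proj, streams))   # simple additive fusion
        attended, _ = self.attn(fused, fused, fused)             # attention over time steps
        return torch.sigmoid(self.cls(attended[:, -1]))          # P(crossing) from last step

model = IntentionFusion()
p_cross = model([torch.randn(2, 16, 4), torch.randn(2, 16, 36), torch.randn(2, 16, 1)])
```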
- Safety-Oriented Pedestrian Motion and Scene Occupancy Forecasting [91.69900691029908]
We advocate for predicting both the individual motions and the scene occupancy map.
We propose a Scene-Actor Graph Neural Network (SA-GNN) which preserves the relative spatial information of pedestrians.
On two large-scale real-world datasets, we showcase that our scene-occupancy predictions are more accurate and better calibrated than those from state-of-the-art motion forecasting methods.
arXiv Detail & Related papers (2021-01-07T06:08:21Z)
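A hedged sketch of message passing over a scene-actor graph that keeps relative spatial information, in the spirit of the SA-GNN description above; the names, dimensions, and fully connected graph are illustrative, not the paper's.
```python
# Hedged sketch: each pedestrian node aggregates neighbor features together with
# the relative offset between their positions, then updates its state.
import torch
import torch.nn as nn

class RelativeMessagePassing(nn.Module):
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.msg = nn.Sequential(nn.Linear(2 * feat_dim + 2, feat_dim), nn.ReLU())
        self.upd = nn.GRUCell(feat_dim, feat_dim)

    def forward(self, feats: torch.Tensor, pos: torch.Tensor):
        """feats: (N, feat_dim) per-actor features; pos: (N, 2) positions in metres."""
        n = feats.shape[0]
        src, dst = torch.meshgrid(torch.arange(n), torch.arange(n), indexing="ij")
        src, dst = src.flatten(), dst.flatten()
        rel = pos[src] - pos[dst]                            # relative offsets, kept explicitly
        m = self.msg(torch.cat([feats[src], feats[dst], rel], dim=-1))
        agg = torch.zeros_like(feats).index_add_(0, dst, m)  # sum messages into receiving nodes
        return self.upd(agg, feats)                          # GRU-style node update

layer = RelativeMessagePassing()
updated = layer(torch.randn(5, 64), torch.rand(5, 2) * 20.0)
```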
- PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D [10.580548257913843]
We propose a new pedestrian action prediction dataset created by adding per-frame 2D/3D bounding box and behavioral annotations to nuScenes.
In addition, we propose a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.
arXiv Detail & Related papers (2020-12-14T18:13:44Z)
- A Real-Time Predictive Pedestrian Collision Warning Service for Cooperative Intelligent Transportation Systems Using 3D Pose Estimation [10.652350454373531]
We propose a real-time predictive pedestrian collision warning service (P2CWS) for two tasks: pedestrian orientation recognition (100.53 FPS) and intention prediction (35.76 FPS).
Our framework achieves satisfactory generalization across multiple sites thanks to the proposed site-independent features.
The proposed vision framework realizes 89.3% accuracy in the behavior recognition task on the TUD dataset without any training process.
arXiv Detail & Related papers (2020-09-23T00:55:12Z)
- Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction [57.56466850377598]
Reasoning over visual data is a desirable capability for robotics and vision-based applications.
In this paper, we present a graph-based framework to uncover relationships between different objects in the scene for reasoning about pedestrian intent.
Pedestrian intent, defined as the future action of crossing or not-crossing the street, is a very crucial piece of information for autonomous vehicles.
arXiv Detail & Related papers (2020-02-20T18:50:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.