DriverGaze360: OmniDirectional Driver Attention with Object-Level Guidance
- URL: http://arxiv.org/abs/2512.14266v1
- Date: Tue, 16 Dec 2025 10:23:00 GMT
- Title: DriverGaze360: OmniDirectional Driver Attention with Object-Level Guidance
- Authors: Shreedhar Govil, Didier Stricker, Jason Rambach
- Abstract summary: DriverGaze360 is a large-scale 360$^\circ$ field-of-view driver attention dataset. DriverGaze360-Net jointly learns attention maps and attended objects by employing an auxiliary semantic segmentation head.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Predicting driver attention is a critical problem for developing explainable autonomous driving systems and understanding driver behavior in mixed human-autonomous vehicle traffic scenarios. Although significant progress has been made through large-scale driver attention datasets and deep learning architectures, existing works are constrained by narrow frontal field-of-view and limited driving diversity. Consequently, they fail to capture the full spatial context of driving environments, especially during lane changes, turns, and interactions involving peripheral objects such as pedestrians or cyclists. In this paper, we introduce DriverGaze360, a large-scale 360$^\circ$ field of view driver attention dataset, containing $\sim$1 million gaze-labeled frames collected from 19 human drivers, enabling comprehensive omnidirectional modeling of driver gaze behavior. Moreover, our panoramic attention prediction approach, DriverGaze360-Net, jointly learns attention maps and attended objects by employing an auxiliary semantic segmentation head. This improves spatial awareness and attention prediction across wide panoramic inputs. Extensive experiments demonstrate that DriverGaze360-Net achieves state-of-the-art attention prediction performance on multiple metrics on panoramic driving images. Dataset and method available at https://av.dfki.de/drivergaze360.
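The abstract describes a joint objective: an attention-map prediction loss combined with an auxiliary semantic segmentation loss. The paper's exact loss formulation is not given here, so the following is a minimal NumPy sketch of one plausible combination, using KL divergence for the attention maps (a common saliency loss) and per-pixel cross-entropy for the segmentation head; the function names and the `seg_weight` parameter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def kl_attention_loss(pred, gt, eps=1e-8):
    """KL divergence between attention maps, each normalized to a distribution."""
    p = pred / (pred.sum() + eps)
    q = gt / (gt.sum() + eps)
    return float(np.sum(q * np.log((q + eps) / (p + eps))))

def seg_cross_entropy(logits, labels):
    """Per-pixel cross-entropy for an auxiliary segmentation head.

    logits: (H, W, C) raw class scores; labels: (H, W) integer class ids.
    """
    z = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    h, w = labels.shape
    return float(-log_probs[np.arange(h)[:, None], np.arange(w), labels].mean())

def joint_loss(att_pred, att_gt, seg_logits, seg_labels, seg_weight=0.5):
    """Joint objective: attention loss plus a weighted auxiliary segmentation term."""
    return (kl_attention_loss(att_pred, att_gt)
            + seg_weight * seg_cross_entropy(seg_logits, seg_labels))
```

The auxiliary term acts as a regularizer: gradients from the segmentation head shape the shared features toward object-aware representations, which is one standard way such a head can improve spatial awareness over wide panoramic inputs.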
Related papers
- Where, What, Why: Towards Explainable Driver Attention Prediction [28.677786362573638]
We introduce Explainable Driver Attention Prediction, a novel task paradigm that jointly predicts spatial attention regions (where), parses attended semantics (what), and provides cognitive reasoning for attention allocation (why). We propose LLada, a Large Language model-driven framework for driver attention prediction, which unifies pixel modeling, semantic parsing, and cognitive reasoning within an end-to-end architecture. This work serves as a key step toward a deeper understanding of driver attention mechanisms, with significant implications for autonomous driving, intelligent driver training, and human-computer interaction.
arXiv Detail & Related papers (2025-06-29T04:59:39Z)
- AGC-Drive: A Large-Scale Dataset for Real-World Aerial-Ground Collaboration in Driving Scenarios [68.84774511206797]
We present AGC-Drive, the first large-scale real-world dataset for Aerial-Ground Cooperative 3D perception. AGC-Drive contains 350 scenes, each with approximately 100 frames and fully annotated 3D bounding boxes covering 13 object categories. We provide benchmarks for two 3D perception tasks: vehicle-to-vehicle collaborative perception and vehicle-to-ground collaborative perception.
arXiv Detail & Related papers (2025-06-19T14:48:43Z)
- Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention [61.3281618482513]
We present CogDriving, a novel network designed for synthesizing high-quality multi-view driving videos. CogDriving leverages a Diffusion Transformer architecture with holistic-4D attention modules, enabling simultaneous associations across the spatial, temporal, and view dimensions. CogDriving demonstrates strong performance on the nuScenes validation set, achieving an FVD score of 37.8, highlighting its ability to generate realistic driving videos.
arXiv Detail & Related papers (2024-12-04T18:02:49Z)
- Towards Infusing Auxiliary Knowledge for Distracted Driver Detection [11.816566371802802]
Distracted driving is a leading cause of road accidents globally.
We propose KiD3, a novel method for distracted driver detection (DDD) by infusing auxiliary knowledge about semantic relations between entities in a scene and the structural configuration of the driver's pose.
Specifically, we construct a unified framework that integrates scene graphs and driver pose information with the visual cues in video frames to create a holistic representation of the driver's actions.
arXiv Detail & Related papers (2024-08-29T15:28:42Z)
- Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view. To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
- FBLNet: FeedBack Loop Network for Driver Attention Prediction [50.936478241688114]
Nonobjective driving experience is difficult to model, so a mechanism simulating the driver experience accumulation procedure is absent in existing methods. We propose a FeedBack Loop Network (FBLNet), which attempts to model the driving experience accumulation procedure. Our model exhibits a solid advantage over existing methods, achieving an outstanding performance improvement on two driver attention benchmark datasets.
arXiv Detail & Related papers (2022-12-05T08:25:09Z)
- CoCAtt: A Cognitive-Conditioned Driver Attention Dataset (Supplementary Material) [31.888206001447625]
Driver attention prediction can play an instrumental role in mitigating and preventing high-risk events.
We present a new driver attention dataset, CoCAtt.
CoCAtt is the largest and the most diverse driver attention dataset in terms of autonomy levels, eye tracker resolutions, and driving scenarios.
arXiv Detail & Related papers (2022-07-08T17:35:17Z)
- CoCAtt: A Cognitive-Conditioned Driver Attention Dataset [16.177399201198636]
Driver attention prediction can play an instrumental role in mitigating and preventing high-risk events.
We present a new driver attention dataset, CoCAtt.
CoCAtt is the largest and the most diverse driver attention dataset in terms of autonomy levels, eye tracker resolutions, and driving scenarios.
arXiv Detail & Related papers (2021-11-19T02:42:34Z)
- The Multimodal Driver Monitoring Database: A Naturalistic Corpus to Study Driver Attention [44.94118128276982]
A smart vehicle should be able to monitor the actions and behaviors of the human driver to provide critical warnings or intervene when necessary.
Recent advancements in deep learning and computer vision have shown great promise in monitoring human behaviors and activities.
A vast amount of in-domain data is required to train models that provide high performance in predicting driving related tasks.
arXiv Detail & Related papers (2020-12-23T16:37:17Z)
- Learning Accurate and Human-Like Driving using Semantic Maps and Attention [152.48143666881418]
This paper investigates how end-to-end driving models can be improved to drive more accurately and human-like.
We exploit semantic and visual maps from HERE Technologies and augment the existing Drive360 dataset with these maps.
Our models are trained and evaluated on the Drive360 + HERE dataset, which features 60 hours and 3000 km of real-world driving data.
arXiv Detail & Related papers (2020-07-10T22:25:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.