Driver-Net: Multi-Camera Fusion for Assessing Driver Take-Over Readiness in Automated Vehicles
- URL: http://arxiv.org/abs/2507.04139v1
- Date: Sat, 05 Jul 2025 19:27:03 GMT
- Title: Driver-Net: Multi-Camera Fusion for Assessing Driver Take-Over Readiness in Automated Vehicles
- Authors: Mahdi Rezaei, Mohsen Azarmi
- Abstract summary: Driver-Net is a novel deep learning framework that fuses multi-camera inputs to estimate driver take-over readiness. It captures synchronised visual cues from the driver's head, hands, and body posture through a triple-camera setup. The proposed method achieves an accuracy of up to 95.8% in driver readiness classification.
- Score: 3.637162892228131
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ensuring safe transition of control in automated vehicles requires an accurate and timely assessment of driver readiness. This paper introduces Driver-Net, a novel deep learning framework that fuses multi-camera inputs to estimate driver take-over readiness. Unlike conventional vision-based driver monitoring systems that focus on head pose or eye gaze, Driver-Net captures synchronised visual cues from the driver's head, hands, and body posture through a triple-camera setup. The model integrates spatio-temporal data using a dual-path architecture, comprising a Context Block and a Feature Block, followed by a cross-modal fusion strategy to enhance prediction accuracy. Evaluated on a diverse dataset collected from the University of Leeds Driving Simulator, the proposed method achieves an accuracy of up to 95.8% in driver readiness classification. This performance significantly surpasses that of existing approaches and highlights the importance of multimodal and multi-view fusion. As a real-time, non-intrusive solution, Driver-Net contributes meaningfully to the development of safer and more reliable automated vehicles and aligns with new regulatory mandates and upcoming safety standards.
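The abstract names a triple-camera setup, a dual-path Context Block / Feature Block architecture, and a cross-modal fusion stage, but gives no layer details. Below is a minimal sketch of that kind of design; the backbones, dimensions, and attention-based fusion are assumptions, not the paper's actual implementation.

```python
# Hedged sketch of a triple-camera, dual-path fusion model in the spirit of
# Driver-Net. Module internals are invented; only the block names and the
# three-view fusion idea come from the abstract.
import torch
import torch.nn as nn


class ContextBlock(nn.Module):
    """Assumed: a small 3D-conv path capturing spatio-temporal context."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.proj = nn.Linear(16, out_dim)

    def forward(self, clip):                 # clip: (B, 3, T, H, W)
        return self.proj(self.conv(clip).flatten(1))


class FeatureBlock(nn.Module):
    """Assumed: a per-frame 2D-conv path, averaged over time."""
    def __init__(self, out_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(16, out_dim)

    def forward(self, clip):                 # clip: (B, 3, T, H, W)
        b, c, t, h, w = clip.shape
        frames = clip.transpose(1, 2).reshape(b * t, c, h, w)
        feats = self.conv(frames).flatten(1).reshape(b, t, -1).mean(1)
        return self.proj(feats)


class DriverNetSketch(nn.Module):
    """Fuses head, hand, and body camera streams; binary readiness output."""
    def __init__(self, dim=128):
        super().__init__()
        self.context = ContextBlock(dim)
        self.feature = FeatureBlock(dim)
        # Cross-modal fusion approximated here by attention over the three views.
        self.attn = nn.MultiheadAttention(2 * dim, num_heads=4, batch_first=True)
        self.head = nn.Linear(2 * dim, 2)    # ready / not-ready

    def forward(self, head_cam, hand_cam, body_cam):
        views = [torch.cat([self.context(c), self.feature(c)], dim=-1)
                 for c in (head_cam, hand_cam, body_cam)]
        tokens = torch.stack(views, dim=1)   # (B, 3 views, 2*dim)
        fused, _ = self.attn(tokens, tokens, tokens)
        return self.head(fused.mean(1))


if __name__ == "__main__":
    model = DriverNetSketch()
    clips = [torch.randn(2, 3, 8, 64, 64) for _ in range(3)]  # 8-frame clips
    print(model(*clips).shape)               # torch.Size([2, 2])
```

Treating each camera view as one token and letting self-attention mix them is one plausible reading of "cross-modal fusion"; the paper may fuse at a finer granularity.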
Related papers
- SafeAuto: Knowledge-Enhanced Safe Autonomous Driving with Multimodal Foundation Models [63.71984266104757]
We propose SafeAuto, a framework that enhances MLLM-based autonomous driving by incorporating both unstructured and structured knowledge.
To explicitly integrate safety knowledge, we develop a reasoning component that translates traffic rules into first-order logic.
Our Multimodal Retrieval-Augmented Generation model leverages video, control signals, and environmental attributes to learn from past driving experiences.
arXiv Detail & Related papers (2025-02-28T21:53:47Z)
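The entry above mentions translating traffic rules into first-order logic. Here is a toy illustration of that encoding style; the predicates, the rule, and the checking function are invented for illustration and do not come from SafeAuto.

```python
# Toy example of a traffic rule as first-order logic, checked against a
# planned action. SafeAuto's actual rule set and solver are not specified
# in the abstract; everything here is a hypothetical stand-in.
from dataclasses import dataclass


@dataclass
class State:
    light: str          # "red" | "green" | "yellow"
    action: str         # "proceed" | "stop" | "turn_right"


def red_light(s: State) -> bool:
    return s.light == "red"


def proceeds(s: State) -> bool:
    return s.action == "proceed"


# Rule (FOL-style): forall s . red_light(s) -> not proceeds(s)
def satisfies_red_light_rule(s: State) -> bool:
    return (not red_light(s)) or (not proceeds(s))


if __name__ == "__main__":
    print(satisfies_red_light_rule(State("red", "proceed")))  # False: violation
    print(satisfies_red_light_rule(State("red", "stop")))     # True
```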
- Driver Assistance System Based on Multimodal Data Hazard Detection [0.0]
This paper proposes a multimodal driver assistance detection system.
It integrates road condition video, driver facial video, and audio data to enhance incident recognition accuracy.
arXiv Detail & Related papers (2025-02-05T09:02:39Z)
- Cross-Camera Distracted Driver Classification through Feature Disentanglement and Contrastive Learning [13.613407983544427]
Driver Behavior Monitoring Network (DBMNet) relies on a lightweight backbone and integrates a disentanglement module to discard camera view information.
DBMNet achieves an improvement of 7% in Top-1 accuracy compared to existing approaches.
arXiv Detail & Related papers (2024-11-20T10:27:12Z)
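The DBMNet entry above combines feature disentanglement with contrastive learning to discard camera-view information. A minimal sketch of that idea follows; the backbone, embedding split, and NT-Xent-style loss are assumptions rather than the paper's architecture.

```python
# Sketch: split an embedding into a behaviour part and a camera-view part,
# then apply a contrastive loss so the behaviour part is view-invariant.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DisentangledEncoder(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.backbone = nn.Sequential(          # stand-in lightweight backbone
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.to_behaviour = nn.Linear(16, dim)  # what the driver is doing
        self.to_view = nn.Linear(16, dim)       # which camera sees it (dropped at test time)

    def forward(self, x):
        h = self.backbone(x)
        return self.to_behaviour(h), self.to_view(h)


def view_invariance_loss(z_a, z_b, temperature=0.1):
    """NT-Xent-style loss: the same behaviour seen from two cameras should
    map to nearby behaviour embeddings (positives on the diagonal)."""
    z_a, z_b = F.normalize(z_a, dim=1), F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature
    targets = torch.arange(z_a.size(0))
    return F.cross_entropy(logits, targets)


if __name__ == "__main__":
    enc = DisentangledEncoder()
    cam_a, cam_b = torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64)
    (beh_a, _), (beh_b, _) = enc(cam_a), enc(cam_b)
    print(view_invariance_loss(beh_a, beh_b).item())
```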
- Efficient Mixture-of-Expert for Video-based Driver State and Physiological Multi-task Estimation in Conditional Autonomous Driving [12.765198683804094]
Road safety remains a critical challenge worldwide, with approximately 1.35 million fatalities annually attributed to traffic accidents.
We propose a novel multi-task DMS, termed VDMoE, which leverages RGB video input to monitor driver states non-invasively.
arXiv Detail & Related papers (2024-10-28T14:49:18Z)
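The VDMoE entry above describes a mixture-of-experts model for multi-task driver-state estimation. A sketch of such a head follows; the expert count, gating, and the two example tasks (drowsiness class and heart rate) are assumptions, since the abstract does not enumerate them.

```python
# Sketch of a soft-gated mixture-of-experts head for multi-task estimation
# from a video feature vector. Shapes and tasks are hypothetical.
import torch
import torch.nn as nn


class MoEHead(nn.Module):
    def __init__(self, in_dim=128, n_experts=4, hidden=64):
        super().__init__()
        self.gate = nn.Linear(in_dim, n_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
            for _ in range(n_experts)
        ])
        # Example task heads (assumed): drowsiness class and heart rate.
        self.drowsiness = nn.Linear(hidden, 3)
        self.heart_rate = nn.Linear(hidden, 1)

    def forward(self, video_feat):
        weights = self.gate(video_feat).softmax(dim=-1)                      # (B, E)
        expert_out = torch.stack([e(video_feat) for e in self.experts], 1)  # (B, E, H)
        mixed = (weights.unsqueeze(-1) * expert_out).sum(1)                 # (B, H)
        return self.drowsiness(mixed), self.heart_rate(mixed)


if __name__ == "__main__":
    head = MoEHead()
    drowsy_logits, hr = head(torch.randn(2, 128))
    print(drowsy_logits.shape, hr.shape)  # torch.Size([2, 3]) torch.Size([2, 1])
```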
- Evaluating Driver Readiness in Conditionally Automated Vehicles from Eye-Tracking Data and Head Pose [3.637162892228131]
In SAE Level 3 or partly automated vehicles, the driver needs to be available and ready to intervene when necessary.
This article presents a comprehensive analysis of driver readiness assessment by combining head pose features and eye-tracking data.
A Bidirectional LSTM architecture, combining both feature sets, achieves a mean absolute error of 0.363 on the DMD dataset.
arXiv Detail & Related papers (2024-01-20T17:32:52Z)
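The entry above combines head-pose and eye-tracking features in a bidirectional LSTM and reports a mean absolute error, so readiness is treated as a regression target. A minimal sketch follows; the feature dimensions and layer sizes are assumptions.

```python
# Sketch of a BiLSTM over concatenated head-pose and eye-tracking features,
# regressing a continuous readiness score trained with an MAE (L1) objective.
import torch
import torch.nn as nn


class ReadinessBiLSTM(nn.Module):
    def __init__(self, head_pose_dim=3, gaze_dim=4, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(head_pose_dim + gaze_dim, hidden,
                            batch_first=True, bidirectional=True)
        self.regressor = nn.Linear(2 * hidden, 1)  # readiness score

    def forward(self, head_pose, gaze):      # (B, T, 3) and (B, T, 4)
        x = torch.cat([head_pose, gaze], dim=-1)
        out, _ = self.lstm(x)
        return self.regressor(out[:, -1])    # score from the last time step


if __name__ == "__main__":
    model = ReadinessBiLSTM()
    score = model(torch.randn(2, 30, 3), torch.randn(2, 30, 4))
    loss = nn.functional.l1_loss(score, torch.rand(2, 1))  # MAE objective
    print(score.shape, loss.item())
```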
- Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view.
To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
- Infrastructure-based End-to-End Learning and Prevention of Driver Failure [68.0478623315416]
FailureNet is a recurrent neural network trained end-to-end on trajectories of both nominal and reckless drivers in a scaled miniature city.
It can accurately identify control failures, upstream perception errors, and speeding drivers, distinguishing them from nominal driving.
Compared to speed or frequency-based predictors, FailureNet's recurrent neural network structure provides improved predictive power, yielding upwards of 84% accuracy when deployed on hardware.
arXiv Detail & Related papers (2023-03-21T22:55:51Z)
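FailureNet, described above, is a recurrent network classifying driver trajectories into nominal and failure modes. A small sketch of that pattern follows; the GRU, the (x, y) input features, and the class list are assumptions.

```python
# Sketch of a recurrent classifier over observed trajectories: label a short
# position track as nominal driving or one of several failure modes.
import torch
import torch.nn as nn

CLASSES = ["nominal", "speeding", "control_failure", "perception_error"]  # assumed


class FailureNetSketch(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.GRU(input_size=2, hidden_size=hidden, batch_first=True)
        self.cls = nn.Linear(hidden, len(CLASSES))

    def forward(self, track):                # track: (B, T, 2) positions
        _, h = self.rnn(track)
        return self.cls(h.squeeze(0))        # (B, num_classes)


if __name__ == "__main__":
    model = FailureNetSketch()
    logits = model(torch.randn(4, 20, 2))
    print(logits.argmax(-1))                 # predicted classes per track
```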
- FBLNet: FeedBack Loop Network for Driver Attention Prediction [50.936478241688114]
Non-objective driving experience is difficult to model, so existing methods lack a mechanism that simulates how drivers accumulate experience.
We propose a FeedBack Loop Network (FBLNet) that attempts to model this experience-accumulation procedure.
Our model exhibits a solid advantage over existing methods, achieving an outstanding performance improvement on two driver attention benchmark datasets.
arXiv Detail & Related papers (2022-12-05T08:25:09Z)
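The FBLNet entry above only states that a feedback loop accumulates experience. One way to read that is a learned prior that is refined by feeding the model's own prediction back as input; the sketch below implements that reading, and its update rule, shapes, and parameterisation are all assumptions.

```python
# Sketch of a feedback-loop attention predictor: a learned "experience" prior
# is iteratively refined by feeding the predicted attention map back in.
import torch
import torch.nn as nn


class FeedbackAttention(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.encode = nn.Conv2d(3 + 1, dim, 3, padding=1)   # image + fed-back map
        self.decode = nn.Conv2d(dim, 1, 3, padding=1)       # attention map
        self.experience = nn.Parameter(torch.zeros(1, 1, 32, 32))  # accumulated prior

    def forward(self, img, steps=2):
        att = self.experience.expand(img.size(0), -1, -1, -1)
        for _ in range(steps):                               # feedback iterations
            att = torch.sigmoid(self.decode(torch.relu(
                self.encode(torch.cat([img, att], dim=1)))))
        return att


if __name__ == "__main__":
    model = FeedbackAttention()
    print(model(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 1, 32, 32])
```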
- COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles [54.61668577827041]
We introduce COOPERNAUT, an end-to-end learning model that uses cross-vehicle perception for vision-based cooperative driving.
Our experiments on AutoCastSim suggest that our cooperative perception driving models lead to a 40% improvement in average success rate.
arXiv Detail & Related papers (2022-05-04T17:55:12Z)
- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
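TransFuser, above, fuses image and LiDAR representations with attention. The sketch below shows the core mechanism at token level: concatenate the two modalities' feature tokens and let self-attention mix them. Backbones, token layout, and dimensions are assumptions.

```python
# Sketch of attention-based fusion of image and LiDAR(-BEV) feature tokens.
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, img_tokens, lidar_tokens):
        # Self-attention over the concatenated token sets lets each modality
        # attend to the other.
        tokens = torch.cat([img_tokens, lidar_tokens], dim=1)
        return self.encoder(tokens).mean(dim=1)   # fused feature (B, dim)


if __name__ == "__main__":
    fusion = AttentionFusion()
    img = torch.randn(2, 16, 64)      # e.g. a 4x4 image feature grid, flattened
    lidar = torch.randn(2, 16, 64)    # e.g. a 4x4 BEV feature grid, flattened
    print(fusion(img, lidar).shape)   # torch.Size([2, 64])
```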
- DeepTake: Prediction of Driver Takeover Behavior using Multimodal Data [17.156611944404883]
We present DeepTake, a novel deep neural network-based framework that predicts multiple aspects of takeover behavior.
Using features from vehicle data, driver biometrics, and subjective measurements, DeepTake predicts the driver's intention, time, and quality of takeover.
Results show that DeepTake reliably predicts the takeover intention, time, and quality, with an accuracy of 96%, 93%, and 83%, respectively.
arXiv Detail & Related papers (2020-12-31T04:24:46Z)
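The DeepTake entry above predicts three quantities (intention, time, quality) from vehicle data, driver biometrics, and subjective measurements. A sketch of a shared trunk with three task heads follows; the feature dimensions and class counts are assumptions.

```python
# Sketch: one shared trunk, three heads for takeover intention, time, quality.
import torch
import torch.nn as nn


class DeepTakeSketch(nn.Module):
    def __init__(self, vehicle_dim=10, bio_dim=6, subj_dim=4, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(vehicle_dim + bio_dim + subj_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.intention = nn.Linear(hidden, 2)   # takeover / no takeover
        self.time = nn.Linear(hidden, 1)        # takeover time (regression)
        self.quality = nn.Linear(hidden, 3)     # e.g. low / medium / high (assumed)

    def forward(self, vehicle, bio, subjective):
        h = self.trunk(torch.cat([vehicle, bio, subjective], dim=-1))
        return self.intention(h), self.time(h), self.quality(h)


if __name__ == "__main__":
    model = DeepTakeSketch()
    out = model(torch.randn(2, 10), torch.randn(2, 6), torch.randn(2, 4))
    print([o.shape for o in out])
```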