DSDFormer: An Innovative Transformer-Mamba Framework for Robust High-Precision Driver Distraction Identification
- URL: http://arxiv.org/abs/2409.05587v2
- Date: Thu, 12 Sep 2024 15:24:44 GMT
- Title: DSDFormer: An Innovative Transformer-Mamba Framework for Robust High-Precision Driver Distraction Identification
- Authors: Junzhou Chen, Zirui Zhang, Jing Yu, Heqiang Huang, Ronghui Zhang, Xuemiao Xu, Bin Sheng, Hong Yan
- Abstract summary: Driver distraction remains a leading cause of traffic accidents, posing a critical threat to road safety globally.
We propose DSDFormer, a framework that integrates the strengths of Transformer and Mamba architectures.
We also introduce Temporal Reasoning Confident Learning (TRCL), an unsupervised approach that refines noisy labels by leveraging temporal correlations in video.
- Score: 23.05821759499963
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Driver distraction remains a leading cause of traffic accidents, posing a critical threat to road safety globally. As intelligent transportation systems evolve, accurate and real-time identification of driver distraction has become essential. However, existing methods struggle to capture both global contextual and fine-grained local features while contending with noisy labels in training datasets. To address these challenges, we propose DSDFormer, a novel framework that integrates the strengths of Transformer and Mamba architectures through a Dual State Domain Attention (DSDA) mechanism, enabling a balance between long-range dependencies and detailed feature extraction for robust driver behavior recognition. Additionally, we introduce Temporal Reasoning Confident Learning (TRCL), an unsupervised approach that refines noisy labels by leveraging spatiotemporal correlations in video sequences. Our model achieves state-of-the-art performance on the AUC-V1, AUC-V2, and 100-Driver datasets and demonstrates real-time processing efficiency on the NVIDIA Jetson AGX Orin platform. Extensive experimental results confirm that DSDFormer and TRCL significantly improve both the accuracy and robustness of driver distraction detection, offering a scalable solution to enhance road safety.
Related papers
- Efficient Mixture-of-Expert for Video-based Driver State and Physiological Multi-task Estimation in Conditional Autonomous Driving [12.765198683804094]
Road safety remains a critical challenge worldwide, with approximately 1.35 million fatalities annually attributed to traffic accidents.
We propose a novel multi-task DMS, termed VDMoE, which leverages RGB video input to monitor driver states non-invasively.
arXiv Detail & Related papers (2024-10-28T14:49:18Z)
- Unified End-to-End V2X Cooperative Autonomous Driving [21.631099800753795]
UniE2EV2X is a V2X-integrated end-to-end autonomous driving system that consolidates key driving modules within a unified network.
The framework employs a deformable attention-based data fusion strategy, effectively facilitating cooperation between vehicles and infrastructure.
We implement the UniE2EV2X framework on the challenging DeepAccident, a simulation dataset designed for V2X cooperative driving.
arXiv Detail & Related papers (2024-05-07T03:01:40Z)
- Reinforcement Learning with Latent State Inference for Autonomous On-ramp Merging under Observation Delay [6.0111084468944]
We introduce the Lane-keeping, Lane-changing with Latent-state Inference and Safety Controller (L3IS) agent.
L3IS is designed to perform the on-ramp merging task safely without comprehensive knowledge about surrounding vehicles' intents or driving styles.
We present an augmentation of this agent called AL3IS that accounts for observation delays, allowing the agent to make more robust decisions in real-world environments.
arXiv Detail & Related papers (2024-03-18T15:02:46Z)
- V2X-Lead: LiDAR-based End-to-End Autonomous Driving with Vehicle-to-Everything Communication Integration [4.166623313248682]
This paper presents a LiDAR-based end-to-end autonomous driving method with Vehicle-to-Everything (V2X) communication integration.
The proposed method aims to handle imperfect partial observations by fusing the onboard LiDAR sensor and V2X communication data.
arXiv Detail & Related papers (2023-09-26T20:26:03Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- Confidence Attention and Generalization Enhanced Distillation for Continuous Video Domain Adaptation [62.458968086881555]
Continuous Video Domain Adaptation (CVDA) is a scenario where a source model is required to adapt to a series of individually available changing target domains.
We propose a Confidence-Attentive network with geneRalization enhanced self-knowledge disTillation (CART) to address the challenge in CVDA.
arXiv Detail & Related papers (2023-03-18T16:40:10Z)
- FBLNet: FeedBack Loop Network for Driver Attention Prediction [75.83518507463226]
Nonobjective driving experience is difficult to model.
In this paper, we propose a FeedBack Loop Network (FBLNet) which attempts to model the driving experience accumulation procedure.
Under the guidance of the incremental knowledge, our model fuses the CNN feature and Transformer feature that are extracted from the input image to predict driver attention.
arXiv Detail & Related papers (2022-12-05T08:25:09Z)
- Integrated Decision and Control for High-Level Automated Vehicles by Mixed Policy Gradient and Its Experiment Verification [10.393343763237452]
This paper presents a self-evolving decision-making system based on the Integrated Decision and Control (IDC).
An RL algorithm called constrained mixed policy gradient (CMPG) is proposed to consistently upgrade the driving policy of the IDC.
Experimental results show that, boosted by data, the system achieves better driving ability than model-based methods.
arXiv Detail & Related papers (2022-10-19T14:58:41Z)
- Transferable Deep Reinforcement Learning Framework for Autonomous Vehicles with Joint Radar-Data Communications [69.24726496448713]
We propose an intelligent optimization framework based on the Markov Decision Process (MDP) to help the AV make optimal decisions.
We then develop an effective learning algorithm leveraging recent advances of deep reinforcement learning techniques to find the optimal policy for the AV.
We show that the proposed transferable deep reinforcement learning framework reduces the AV's obstacle miss-detection probability by up to 67% compared with other conventional deep reinforcement learning approaches.
arXiv Detail & Related papers (2021-05-28T08:45:37Z)
- Efficient and Robust LiDAR-Based End-to-End Navigation [132.52661670308606]
We present an efficient and robust LiDAR-based end-to-end navigation framework.
We propose Fast-LiDARNet that is based on sparse convolution kernel optimization and hardware-aware model design.
We then propose Hybrid Evidential Fusion that directly estimates the uncertainty of the prediction from only a single forward pass.
arXiv Detail & Related papers (2021-05-20T17:52:37Z)
- DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis [54.198237164152786]
Vision is the richest and most cost-effective technology for Driver Monitoring Systems (DMS).
The lack of sufficiently large and comprehensive datasets is currently a bottleneck for the progress of DMS development.
In this paper, we introduce the Driver Monitoring dataset (DMD), an extensive dataset which includes real and simulated driving scenarios.
arXiv Detail & Related papers (2020-08-27T12:33:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.