Efficient Mixture-of-Expert for Video-based Driver State and Physiological Multi-task Estimation in Conditional Autonomous Driving
- URL: http://arxiv.org/abs/2410.21086v1
- Date: Mon, 28 Oct 2024 14:49:18 GMT
- Title: Efficient Mixture-of-Expert for Video-based Driver State and Physiological Multi-task Estimation in Conditional Autonomous Driving
- Authors: Jiyao Wang, Xiao Yang, Zhenyu Wang, Ximeng Wei, Ange Wang, Dengbo He, Kaishun Wu
- Abstract summary: Road safety remains a critical challenge worldwide, with approximately 1.35 million fatalities annually attributed to traffic accidents.
We propose a novel multi-task DMS, termed VDMoE, which leverages RGB video input to monitor driver states non-invasively.
- Score: 12.765198683804094
- Abstract: Road safety remains a critical challenge worldwide, with approximately 1.35 million fatalities annually attributed to traffic accidents, often due to human errors. As we advance towards higher levels of vehicle automation, challenges still exist, as driving with automation can cognitively over-demand drivers if they engage in non-driving-related tasks (NDRTs), or lead to drowsiness if driving is the sole task. This creates an urgent need for an effective Driver Monitoring System (DMS) that can evaluate cognitive load and drowsiness in SAE Level-2/3 autonomous driving contexts. In this study, we propose a novel multi-task DMS, termed VDMoE, which leverages RGB video input to monitor driver states non-invasively. By utilizing key facial features to minimize computational load and integrating remote Photoplethysmography (rPPG) for physiological insights, our approach enhances detection accuracy while maintaining efficiency. Additionally, we optimize the Mixture-of-Experts (MoE) framework to accommodate multi-modal inputs and improve performance across different tasks. A novel prior-inclusive regularization method is introduced to align model outputs with statistical priors, thus accelerating convergence and mitigating overfitting risks. We validate our method on a newly created dataset (MCDD), which comprises RGB video and physiological indicators from 42 participants, and on two public datasets. Our findings demonstrate the effectiveness of VDMoE in monitoring driver states, contributing to safer autonomous driving systems. The code and data will be released.
Related papers
- G-MEMP: Gaze-Enhanced Multimodal Ego-Motion Prediction in Driving [71.9040410238973]
We focus on inferring the ego trajectory of a driver's vehicle using their gaze data.
Next, we develop G-MEMP, a novel multimodal ego-trajectory prediction network that combines GPS and video input with gaze data.
The results show that G-MEMP significantly outperforms state-of-the-art methods in both benchmarks.
arXiv Detail & Related papers (2023-12-13T23:06:30Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- Towards Safe Autonomy in Hybrid Traffic: Detecting Unpredictable Abnormal Behaviors of Human Drivers via Information Sharing [21.979007506007733]
We show that our proposed algorithm achieves strong detection performance in both highway and urban traffic.
The best result is a detection rate of 97.3%, an average detection delay of 1.2 s, and zero false alarms.
arXiv Detail & Related papers (2023-08-23T18:24:28Z)
- A Novel Driver Distraction Behavior Detection Method Based on Self-supervised Learning with Masked Image Modeling [5.1680226874942985]
Driver distraction causes a significant number of traffic accidents every year, resulting in economic losses and casualties.
Driver distraction detection primarily relies on traditional convolutional neural networks (CNN) and supervised learning methods.
This paper proposes a new self-supervised learning method based on masked image modeling for driver distraction behavior detection.
arXiv Detail & Related papers (2023-06-01T10:53:32Z)
- Generative AI-empowered Simulation for Autonomous Driving in Vehicular Mixed Reality Metaverses [130.15554653948897]
In the vehicular mixed reality (MR) Metaverse, the distance between physical and virtual entities can be overcome.
Large-scale traffic and driving simulation via realistic data collection and fusion from the physical world is difficult and costly.
We propose an autonomous driving architecture, where generative AI is leveraged to synthesize unlimited conditioned traffic and driving data in simulations.
arXiv Detail & Related papers (2023-02-16T16:54:10Z)
- FBLNet: FeedBack Loop Network for Driver Attention Prediction [75.83518507463226]
Nonobjective driving experience is difficult to model.
In this paper, we propose a FeedBack Loop Network (FBLNet) which attempts to model the driving experience accumulation procedure.
Under the guidance of the incremental knowledge, our model fuses the CNN and Transformer features extracted from the input image to predict driver attention.
arXiv Detail & Related papers (2022-12-05T08:25:09Z)
- Augmented Driver Behavior Models for High-Fidelity Simulation Study of Crash Detection Algorithms [2.064612766965483]
We present a simulation platform for a hybrid transportation system that includes both human-driven and automated vehicles.
We decompose the human driving task and offer a modular approach to simulating a large-scale traffic scenario.
We analyze a large driving dataset to extract expressive parameters that would best describe different driving characteristics.
arXiv Detail & Related papers (2022-08-10T19:59:16Z)
- Multi-Modal Fusion Transformer for End-to-End Autonomous Driving [59.60483620730437]
We propose TransFuser, a novel Multi-Modal Fusion Transformer, to integrate image and LiDAR representations using attention.
Our approach achieves state-of-the-art driving performance while reducing collisions by 76% compared to geometry-based fusion.
arXiv Detail & Related papers (2021-04-19T11:48:13Z)
- DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis [54.198237164152786]
Vision is the richest and most cost-effective technology for Driver Monitoring Systems (DMS).
The lack of sufficiently large and comprehensive datasets is currently a bottleneck for the progress of DMS development.
In this paper, we introduce the Driver Monitoring dataset (DMD), an extensive dataset which includes real and simulated driving scenarios.
arXiv Detail & Related papers (2020-08-27T12:33:54Z)
- Deep Reinforcement Learning for Human-Like Driving Policies in Collision Avoidance Tasks of Self-Driving Cars [1.160208922584163]
We introduce a model-free, deep reinforcement learning approach to generate automated human-like driving policies.
We study a static obstacle avoidance task on a two-lane highway road in simulation.
We demonstrate that our approach leads to human-like driving policies.
arXiv Detail & Related papers (2020-06-07T18:20:33Z)