A Novel Driver Distraction Behavior Detection Method Based on
Self-supervised Learning with Masked Image Modeling
- URL: http://arxiv.org/abs/2306.00543v4
- Date: Thu, 13 Jul 2023 14:47:42 GMT
- Title: A Novel Driver Distraction Behavior Detection Method Based on
Self-supervised Learning with Masked Image Modeling
- Authors: Yingzhi Zhang, Taiguo Li, Chao Li and Xinghong Zhou
- Abstract summary: Driver distraction causes a significant number of traffic accidents every year, resulting in economic losses and casualties.
Driver distraction detection primarily relies on traditional convolutional neural networks (CNN) and supervised learning methods.
This paper proposes a new self-supervised learning method based on masked image modeling for driver distraction behavior detection.
- Score: 5.1680226874942985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Driver distraction causes a significant number of traffic accidents every
year, resulting in economic losses and casualties. Currently, the level of
automation in commercial vehicles is far from completely unmanned, and drivers
still play an important role in operating and controlling the vehicle.
Therefore, driver distraction behavior detection is crucial for road safety. At
present, driver distraction detection primarily relies on traditional
convolutional neural networks (CNN) and supervised learning methods. However,
there are still challenges such as the high cost of labeled datasets, limited
ability to capture high-level semantic information, and weak generalization
performance. In order to solve these problems, this paper proposes a new
self-supervised learning method based on masked image modeling for driver
distraction behavior detection. Firstly, a self-supervised learning framework
for masked image modeling (MIM) is introduced to reduce the heavy labor and
material costs of dataset labeling. Secondly, the Swin
Transformer is employed as an encoder. Performance is enhanced by reconfiguring
the Swin Transformer block and adjusting the distribution of the number of
window multi-head self-attention (W-MSA) and shifted window multi-head
self-attention (SW-MSA) detection heads across all stages, which leads to a
more lightweight model. Finally, various data augmentation strategies are used along
with the best random masking strategy to strengthen the model's recognition and
generalization ability. Test results on a large-scale driver distraction
behavior dataset show that the self-supervised learning method proposed in this
paper achieves an accuracy of 99.60%, approximating the excellent performance
of advanced supervised learning methods. Our code is publicly available at
github.com/Rocky1salady-killer/SL-DDBD.
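The random masking step described in the abstract can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function names (`random_patch_mask`, `apply_mask`, `masked_l1_loss`), the patch size, the mask ratio, and the fill value are all hypothetical, and the abstract does not state which loss or masking ratio the paper uses. The sketch assumes a 2D grayscale image stored as nested lists and uses an L1 reconstruction loss over masked pixels only, which is one common choice in masked image modeling.

```python
import random


def random_patch_mask(height, width, patch_size, mask_ratio, seed=None):
    """Return a 2D list of booleans (True = masked), one entry per patch."""
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    rows, cols = height // patch_size, width // patch_size
    num_patches = rows * cols
    num_masked = round(num_patches * mask_ratio)
    rng = random.Random(seed)
    # Choose the masked patch indices uniformly at random, without replacement.
    masked = set(rng.sample(range(num_patches), num_masked))
    return [[(r * cols + c) in masked for c in range(cols)] for r in range(rows)]


def apply_mask(image, mask, patch_size, fill=0.0):
    """Copy a 2D image, replacing every pixel of a masked patch with `fill`."""
    return [
        [fill if mask[y // patch_size][x // patch_size] else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(image)
    ]


def masked_l1_loss(prediction, target, mask, patch_size):
    """Mean absolute error computed over masked pixels only."""
    total, count = 0.0, 0
    for y, row in enumerate(target):
        for x, value in enumerate(row):
            if mask[y // patch_size][x // patch_size]:
                total += abs(prediction[y][x] - value)
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    # Toy 8x8 image, 2x2 patches, half of the 16 patches masked (assumed values).
    image = [[float(x + y) for x in range(8)] for y in range(8)]
    mask = random_patch_mask(8, 8, patch_size=2, mask_ratio=0.5, seed=0)
    corrupted = apply_mask(image, mask, patch_size=2)
    print(sum(cell for row in mask for cell in row), "patches masked")
    print("loss of the corrupted input:", masked_l1_loss(corrupted, image, mask, 2))
```

In an actual pre-training loop, the encoder (a Swin Transformer in the paper) would receive the corrupted image and a decoder would produce the prediction passed to the loss. Here the corrupted image stands in for the prediction only to show how the pieces fit together.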
Related papers
- AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving [68.73885845181242]
We propose an Automatic Data Engine (AIDE) that automatically identifies issues, efficiently curates data, improves the model through auto-labeling, and verifies the model through generation of diverse scenarios.
We further establish a benchmark for open-world detection on AV datasets to comprehensively evaluate various learning paradigms, demonstrating our method's superior performance at a reduced cost.
arXiv Detail & Related papers (2024-03-26T04:27:56Z)
- OpenNet: Incremental Learning for Autonomous Driving Object Detection with Balanced Loss [3.761247766448379]
Experimental results on the CODA dataset show that the proposed method outperforms existing methods.
arXiv Detail & Related papers (2023-11-25T06:02:50Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- FBLNet: FeedBack Loop Network for Driver Attention Prediction [75.83518507463226]
Nonobjective driving experience is difficult to model.
In this paper, we propose a FeedBack Loop Network (FBLNet) which attempts to model the driving experience accumulation procedure.
Under the guidance of the incremental knowledge, our model fuses the CNN feature and Transformer feature that are extracted from the input image to predict driver attention.
arXiv Detail & Related papers (2022-12-05T08:25:09Z)
- Masked Autoencoding for Scalable and Generalizable Decision Making [93.84855114717062]
MaskDP is a simple and scalable self-supervised pretraining method for reinforcement learning and behavioral cloning.
We find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single and multiple goal reaching.
arXiv Detail & Related papers (2022-11-23T07:04:41Z)
- Monocular Vision-based Prediction of Cut-in Maneuvers with LSTM Networks [0.0]
This study proposes a method to predict potentially dangerous cut-in maneuvers happening in the ego lane.
We follow a computer vision-based approach that only employs a single in-vehicle RGB camera.
Our algorithm consists of a CNN-based vehicle detection and tracking step and an LSTM-based maneuver classification step.
arXiv Detail & Related papers (2022-03-21T02:30:36Z)
- An Automated Machine Learning (AutoML) Method for Driving Distraction Detection Based on Lane-Keeping Performance [2.3951613028271397]
This study proposes a domain-specific automated machine learning (AutoML) method that self-learns the optimal models for detecting distraction.
The proposed AutoGBM method is found to be reliable and promising for predicting phone-related driving distractions.
The proposed AutoGBM not only performs better with fewer features but also provides data-driven insights into system design.
arXiv Detail & Related papers (2021-03-10T12:37:18Z)
- Fine-Grained Vehicle Perception via 3D Part-Guided Visual Data Augmentation [77.60050239225086]
We propose an effective training data generation process by fitting a 3D car model with dynamic parts to vehicles in real images.
Our approach is fully automatic without any human interaction.
We present a multi-task network for VUS parsing and a multi-stream network for VHI parsing.
arXiv Detail & Related papers (2020-12-15T03:03:38Z)
- Keep Your AI-es on the Road: Tackling Distracted Driver Detection with Convolutional Neural Networks and Targeted Data Augmentation [0.0]
Distracted driving is one of the leading causes of motor accidents and deaths in the world.
In our study, we aim to build a robust multi-class classifier to detect and identify different forms of driver inattention.
arXiv Detail & Related papers (2020-06-19T04:56:08Z)
- Auto-Rectify Network for Unsupervised Indoor Depth Estimation [119.82412041164372]
We establish that the complex ego-motions exhibited in handheld settings are a critical obstacle for learning depth.
We propose a data pre-processing method that rectifies training images by removing their relative rotations for effective learning.
Our results outperform the previous unsupervised SOTA method by a large margin on the challenging NYUv2 dataset.
arXiv Detail & Related papers (2020-06-04T08:59:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.