PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring
- URL: http://arxiv.org/abs/2507.19172v1
- Date: Fri, 25 Jul 2025 11:23:44 GMT
- Title: PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring
- Authors: Jiyao Wang, Xiao Yang, Qingyong Hu, Jiankai Tang, Can Liu, Dengbo He, Yuntao Wang, Yingcong Chen, Kaishun Wu
- Abstract summary: PhysDrive is the first large-scale multimodal dataset for contactless in-vehicle physiological sensing. It covers a wide spectrum of naturalistic driving conditions, including driver motions, dynamic natural light, vehicle types, and road conditions.
- Score: 30.242706543653497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Robust and unobtrusive in-vehicle physiological monitoring is crucial for ensuring driving safety and user experience. While remote physiological measurement (RPM) offers a promising non-invasive solution, its translation to real-world driving scenarios is critically constrained by the scarcity of comprehensive datasets. Existing resources are often limited in scale, modality diversity, breadth of biometric annotations, and range of captured conditions, thereby omitting challenges inherent to real-world driving. Here, we present PhysDrive, the first large-scale multimodal dataset for contactless in-vehicle physiological sensing, with dedicated consideration of diverse modality settings and driving factors. PhysDrive collects data from 48 drivers, including synchronized RGB video, near-infrared video, and raw mmWave radar data, accompanied by six synchronized ground-truth signals (ECG, BVP, respiration, HR, RR, and SpO2). It covers a wide spectrum of naturalistic driving conditions, including driver motions, dynamic natural light, vehicle types, and road conditions. We extensively evaluate both signal-processing and deep-learning methods on PhysDrive, establishing a comprehensive benchmark across all modalities, and release full open-source code compatible with mainstream public toolboxes. We envision PhysDrive serving as a foundational resource that accelerates research on multimodal driver monitoring and smart-cockpit systems.
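As context for the benchmark described in the abstract, the sketch below illustrates the kind of classical signal-processing baseline such evaluations typically include: estimating heart rate from a pulse (BVP) trace via band-pass filtering and a dominant-frequency search. This is a minimal, generic illustration, not the authors' released code; the function name, filter order, cutoff frequencies, and sampling rate are all illustrative assumptions.

```python
# Minimal sketch (not from the paper): classical HR estimation from a
# BVP/rPPG trace via band-pass filtering + FFT peak picking, the kind of
# signal-processing baseline benchmarks like PhysDrive evaluate.
# All names and parameter values here are illustrative assumptions.
import numpy as np
from scipy.signal import butter, filtfilt

def estimate_hr_bpm(bvp: np.ndarray, fs: float) -> float:
    """Estimate heart rate (beats per minute) from a BVP trace sampled at fs Hz."""
    # Band-limit to a plausible human HR range (0.7-3.0 Hz, i.e. 42-180 bpm).
    b, a = butter(2, [0.7, 3.0], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, bvp - np.mean(bvp))
    # Locate the dominant spectral peak inside the passband.
    spectrum = np.abs(np.fft.rfft(filtered))
    freqs = np.fft.rfftfreq(len(filtered), d=1.0 / fs)
    band = (freqs >= 0.7) & (freqs <= 3.0)
    peak_freq = freqs[band][np.argmax(spectrum[band])]
    return peak_freq * 60.0  # Hz -> beats per minute

# Usage example: a synthetic 1.2 Hz (72 bpm) pulse sampled at 30 fps,
# roughly matching a camera-based acquisition rate.
fs = 30.0
t = np.arange(0, 30, 1.0 / fs)
synthetic_bvp = np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size)
print(f"Estimated HR: {estimate_hr_bpm(synthetic_bvp, fs):.1f} bpm")
```

Deep-learning methods replace this hand-crafted pipeline with learned spatiotemporal features, but a frequency-domain baseline like this remains the standard point of comparison in rPPG benchmarks.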
Related papers
- Visual Dominance and Emerging Multimodal Approaches in Distracted Driving Detection: A Review of Machine Learning Techniques [3.378738346115004]
Distracted driving continues to be a significant cause of road traffic injuries and fatalities worldwide. Recent developments in machine learning (ML) and deep learning (DL) have primarily focused on visual data to detect distraction. This systematic review assesses 74 studies that utilize ML/DL techniques for distracted driving detection across visual, sensor-based, multimodal, and emerging modalities.
arXiv Detail & Related papers (2025-05-04T02:51:00Z) - VTD: Visual and Tactile Database for Driver State and Behavior Perception [1.6277623188953556]
We introduce a novel visual-tactile perception method to address subjective uncertainties in driver state and interaction behaviors. A comprehensive dataset has been developed that encompasses multi-modal data under fatigue and distraction conditions.
arXiv Detail & Related papers (2024-12-06T09:31:40Z) - DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation [54.02069690134526]
We propose DrivingSphere, a realistic and closed-loop simulation framework.
Its core idea is to build a 4D world representation and generate realistic, controllable driving scenarios.
By providing a dynamic and realistic simulation environment, DrivingSphere enables comprehensive testing and validation of autonomous driving algorithms.
arXiv Detail & Related papers (2024-11-18T03:00:33Z) - Efficient Mixture-of-Expert for Video-based Driver State and Physiological Multi-task Estimation in Conditional Autonomous Driving [12.765198683804094]
Road safety remains a critical challenge worldwide, with approximately 1.35 million fatalities annually attributed to traffic accidents. We propose a novel multi-task DMS, termed VDMoE, which leverages RGB video input to monitor driver states non-invasively.
arXiv Detail & Related papers (2024-10-28T14:49:18Z) - Probing Multimodal LLMs as World Models for Driving [72.18727651074563]
We look at the application of Multimodal Large Language Models (MLLMs) in autonomous driving.
Despite advances in models like GPT-4o, their performance in complex driving environments remains largely unexplored.
arXiv Detail & Related papers (2024-05-09T17:52:42Z) - Leveraging Driver Field-of-View for Multimodal Ego-Trajectory Prediction [69.29802752614677]
RouteFormer is a novel ego-trajectory prediction network combining GPS data, environmental context, and the driver's field-of-view. To tackle data scarcity and enhance diversity, we introduce GEM, a dataset of urban driving scenarios enriched with synchronized driver field-of-view and gaze data.
arXiv Detail & Related papers (2023-12-13T23:06:30Z) - DriveGPT4: Interpretable End-to-end Autonomous Driving via Large Language Model [84.29836263441136]
This study introduces DriveGPT4, a novel interpretable end-to-end autonomous driving system based on multimodal large language models (MLLMs).
DriveGPT4 facilitates the interpretation of vehicle actions, offers pertinent reasoning, and effectively addresses a diverse range of questions posed by users.
arXiv Detail & Related papers (2023-10-02T17:59:52Z) - OpenDriver: An Open-Road Driver State Detection Dataset [13.756530418314227]
This paper develops OpenDriver, a large-scale multimodal driving dataset for driver state detection. OpenDriver encompasses a total of 3,278 driving trips, with a signal-collection duration spanning approximately 4,600 hours.
arXiv Detail & Related papers (2023-04-09T10:08:38Z) - Generative AI-empowered Simulation for Autonomous Driving in Vehicular Mixed Reality Metaverses [130.15554653948897]
In the vehicular mixed reality (MR) Metaverse, the distance between physical and virtual entities can be overcome.
Large-scale traffic and driving simulation via realistic data collection and fusion from the physical world is difficult and costly.
We propose an autonomous driving architecture, where generative AI is leveraged to synthesize unlimited conditioned traffic and driving data in simulations.
arXiv Detail & Related papers (2023-02-16T16:54:10Z) - SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation [152.60469768559878]
SHIFT is the largest multi-task synthetic dataset for autonomous driving.
It presents discrete and continuous shifts in cloudiness, rain and fog intensity, time of day, and vehicle and pedestrian density.
Our dataset and benchmark toolkit are publicly available at www.vis.xyz/shift.
arXiv Detail & Related papers (2022-06-16T17:59:52Z) - DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis [54.198237164152786]
Vision is the richest and most cost-effective technology for Driver Monitoring Systems (DMS).
The lack of sufficiently large and comprehensive datasets is currently a bottleneck for the progress of DMS development.
In this paper, we introduce the Driver Monitoring dataset (DMD), an extensive dataset which includes real and simulated driving scenarios.
arXiv Detail & Related papers (2020-08-27T12:33:54Z) - Probabilistic End-to-End Vehicle Navigation in Complex Dynamic Environments with Multimodal Sensor Fusion [16.018962965273495]
All-day and all-weather navigation is a critical capability for autonomous driving.
We propose a probabilistic driving model with multi-perception capability, utilizing information from the camera, lidar, and radar.
The results suggest that our proposed model outperforms baselines and achieves excellent generalization performance in unseen environments.
arXiv Detail & Related papers (2020-05-05T03:48:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.