How Suboptimal is Training rPPG Models with Videos and Targets from Different Body Sites?
- URL: http://arxiv.org/abs/2403.10582v1
- Date: Fri, 15 Mar 2024 15:20:21 GMT
- Title: How Suboptimal is Training rPPG Models with Videos and Targets from Different Body Sites?
- Authors: Björn Braun, Daniel McDuff, Christian Holz
- Abstract summary: Most current models are trained on facial videos using contact PPG measurements from the fingertip as targets/labels.
We show that neural models learn to predict the morphology of the ground truth PPG signal better when trained on the forehead.
- Score: 40.527994999118725
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Remote camera measurement of the blood volume pulse via photoplethysmography (rPPG) is a compelling technology for scalable, low-cost, and accessible assessment of cardiovascular information. Neural networks currently provide the state-of-the-art for this task and supervised training or fine-tuning is an important step in creating these models. However, most current models are trained on facial videos using contact PPG measurements from the fingertip as targets/labels. One of the reasons for this is that few public datasets to date have incorporated contact PPG measurements from the face. Yet there is copious evidence that the PPG signals at different sites on the body have very different morphological features. Is training a facial video rPPG model using contact measurements from another site on the body suboptimal? Using a recently released unique dataset with synchronized contact PPG and video measurements from both the hand and face, we can provide precise and quantitative answers to this question. We obtain up to 40% lower mean squared errors between the waveforms of the predicted and the ground truth PPG signals using state-of-the-art neural models when using PPG signals from the forehead compared to using PPG signals from the fingertip. We also show qualitatively that the neural models learn to predict the morphology of the ground truth PPG signal better when trained on the forehead PPG signals. However, while models trained from the forehead PPG produce a more faithful waveform, models trained from a finger PPG do still learn the dominant frequency (i.e., the heart rate) well.
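The two quantities the abstract contrasts, waveform mean squared error and the dominant frequency (heart rate), can be illustrated with a minimal sketch. This is not the authors' evaluation code: the z-score normalization, the 0.7 to 3.0 Hz search band, the 30 Hz sampling rate, and the synthetic signals are all assumptions chosen for illustration, and the function names `waveform_mse` and `dominant_hr_bpm` are hypothetical.

```python
import numpy as np


def zscore(x):
    """Scale a signal to zero mean and unit variance (assumed preprocessing)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def waveform_mse(pred, target):
    """Mean squared error between two z-scored waveforms of equal length."""
    return float(np.mean((zscore(pred) - zscore(target)) ** 2))


def dominant_hr_bpm(signal, fs, lo_hz=0.7, hi_hz=3.0):
    """Heart rate in bpm from the strongest spectral peak in [lo_hz, hi_hz]."""
    x = zscore(signal)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= lo_hz) & (freqs <= hi_hz)  # assumed plausible HR band
    return float(freqs[band][np.argmax(power[band])] * 60.0)


# Synthetic stand-ins: a target with a second harmonic (richer morphology)
# and a prediction that matches only the fundamental frequency.
fs = 30.0                      # assumed camera frame rate in Hz
t = np.arange(600) / fs        # 20 seconds of samples
f0 = 1.2                       # 1.2 Hz corresponds to 72 bpm
target = np.sin(2 * np.pi * f0 * t) + 0.4 * np.sin(2 * np.pi * 2 * f0 * t + 0.8)
pred = np.sin(2 * np.pi * f0 * t)

mse = waveform_mse(pred, target)
hr_pred = dominant_hr_bpm(pred, fs)
hr_target = dominant_hr_bpm(target, fs)
print(f"waveform MSE: {mse:.3f}, HR pred: {hr_pred:.1f} bpm, HR target: {hr_target:.1f} bpm")
```

In this toy case the prediction recovers the same dominant frequency as the target while its waveform error stays above zero, which mirrors the distinction the abstract draws between heart rate accuracy and morphological fidelity.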
Related papers
- Summit Vitals: Multi-Camera and Multi-Signal Biosensing at High Altitudes [22.23531900474421]
Video photoplethysmography is an emerging method for non-invasive and convenient measurement of physiological signals.
This dataset is designed to validate video vitals estimation algorithms and fusing videos from different positions.
Our findings suggest that simultaneous training on multiple indicators, such as PPG and blood oxygen, can reduce MAE in SpO2 estimation by 17.8%.
arXiv Detail & Related papers (2024-09-28T03:36:16Z) - Full-Body Cardiovascular Sensing with Remote Photoplethysmography [4.123458880886283]
Remote photoplethysmography (rPPG) allows for noncontact monitoring of blood volume changes from a camera by detecting minor fluctuations in reflected light.
We explored the feasibility of rPPG from non-face body regions such as the arms, legs, and hands.
arXiv Detail & Related papers (2023-03-16T20:37:07Z) - Contrast-Phys: Unsupervised Video-based Remote Physiological Measurement via Spatiotemporal Contrast [17.691683039742323]
Video-based remote physiological measurement uses face videos to measure the blood volume change signal, which is also called remote photoplethysmography (rPPG).
We use a 3DCNN model to generate multiple rPPG signals from each video in different spatiotemporal locations and train the model with a contrastive loss where rPPG signals from the same video are pulled together while those from different videos are pushed away.
arXiv Detail & Related papers (2022-08-08T19:30:57Z) - Identifying Rhythmic Patterns for Face Forgery Detection and Categorization [46.21354355137544]
We propose a framework for face forgery detection and categorization consisting of: 1) a Spatial-Temporal Filtering Network (STFNet) for PPG signals, and 2) a Spatial-Temporal Interaction Network (STINet) for constraint and interaction of PPG signals.
With insight into the generation of forgery methods, we further propose intra-source and inter-source blending to boost the performance of the framework.
arXiv Detail & Related papers (2022-07-04T04:57:06Z) - PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z) - Keypoint Message Passing for Video-based Person Re-Identification [106.41022426556776]
Video-based person re-identification (re-ID) is an important technique in visual surveillance systems which aims to match video snippets of people captured by different cameras.
Existing methods are mostly based on convolutional neural networks (CNNs), whose building blocks either process local neighbor pixels at a time, or, when 3D convolutions are used to model temporal information, suffer from the misalignment problem caused by person movement.
In this paper, we propose to overcome the limitations of normal convolutions with a human-oriented graph method. Specifically, features located at person joint keypoints are extracted and connected as a spatial-temporal graph.
arXiv Detail & Related papers (2021-11-16T08:01:16Z) - Motion Artifact Reduction In Photoplethysmography For Reliable Signal Selection [5.264561559435017]
Photoplethysmography (PPG) is a non-invasive and economical technique to extract vital signs of the human body.
It is sensitive to motion which can corrupt the signal's quality.
It is valuable to collect realistic PPG signals while performing Activities of Daily Living (ADL) to develop practical signal denoising and analysis methods.
arXiv Detail & Related papers (2021-09-06T21:53:56Z) - Assessment of deep learning based blood pressure prediction from PPG and rPPG signals [2.624902795082451]
This work aims to analyze the PPG- and rPPG-based BP prediction error with respect to the underlying data distribution.
We train established neural network (NN) architectures and derive an appropriate parameterization of input segments drawn from continuous PPG signals.
Second, we apply this parameterization to a larger PPG dataset and train NNs to predict BP.
Third, we use transfer learning to train the NNs for rPPG-based BP prediction. The resulting performances are similar to the PPG-only case.
arXiv Detail & Related papers (2021-04-15T15:56:58Z) - TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face Presentation Attack Detection [53.98866801690342]
3D mask face presentation attack detection (PAD) plays a vital role in securing face recognition systems from 3D mask attacks.
We propose a pure rPPG transformer (TransRPPG) framework for learning live intrinsicness representation efficiently.
Our TransRPPG is lightweight and efficient (with only 547K parameters and 763M FLOPs), which is promising for mobile-level applications.
arXiv Detail & Related papers (2021-04-15T12:33:13Z) - Assessing Graph-based Deep Learning Models for Predicting Flash Point [52.931492216239995]
Graph-based deep learning (GBDL) models were implemented in predicting flash point for the first time.
Average R2 and Mean Absolute Error (MAE) scores of MPNN are, respectively, 2.3% lower and 2.0 K higher than previous comparable studies.
arXiv Detail & Related papers (2020-02-26T06:10:12Z)