A Plug-and-Play Temporal Normalization Module for Robust Remote Photoplethysmography
- URL: http://arxiv.org/abs/2411.15283v1
- Date: Fri, 22 Nov 2024 16:32:49 GMT
- Title: A Plug-and-Play Temporal Normalization Module for Robust Remote Photoplethysmography
- Authors: Kegang Wang, Jiankai Tang, Yantao Wei, Mingxuan Liu, Xin Liu, Yuntao Wang
- Abstract summary: Remote photoplethysmography (rPPG) extracts PPG signals from subtle color changes in facial videos.
Most rPPG methods rely on intensity differences between consecutive frames, missing long-term signal variations affected by motion or lighting artifacts.
This paper introduces Temporal Normalization (TN), a flexible plug-and-play module compatible with any end-to-end rPPG network architecture.
- Score: 8.164518942155246
- Abstract: Remote photoplethysmography (rPPG) extracts PPG signals from subtle color changes in facial videos, showing strong potential for health applications. However, most rPPG methods rely on intensity differences between consecutive frames, missing long-term signal variations affected by motion or lighting artifacts, which reduces accuracy. This paper introduces Temporal Normalization (TN), a flexible plug-and-play module compatible with any end-to-end rPPG network architecture. By capturing long-term temporally normalized features following detrending, TN effectively mitigates motion and lighting artifacts, significantly boosting the rPPG prediction performance. When integrated into four state-of-the-art rPPG methods, TN delivered performance improvements ranging from 34.3% to 94.2% in heart rate measurement tasks across four widely-used datasets. Notably, TN showed even greater performance gains in smaller models. We further discuss and provide insights into the mechanisms behind TN's effectiveness.
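The abstract contrasts consecutive-frame differences with long-term temporally normalized features computed after detrending, but it does not give the TN formula. The sketch below is only one plausible reading of that idea, not the authors' implementation: it assumes a per-pixel linear detrend over the whole clip followed by division by the per-pixel temporal standard deviation. The function names `frame_difference`, `temporal_normalize`, and `synthetic_clip`, the 30 fps rate, and the synthetic signal are all hypothetical and chosen for illustration.

```python
import numpy as np


def frame_difference(frames, eps=1e-6):
    """Short-term feature: normalized difference of consecutive frames."""
    frames = np.asarray(frames, dtype=np.float64)
    nxt, cur = frames[1:], frames[:-1]
    return (nxt - cur) / (nxt + cur + eps)


def temporal_normalize(frames, eps=1e-6):
    """Hypothetical long-term normalization along the time axis (axis 0).

    Assumed steps (not taken from the paper):
      1. remove a per-pixel linear trend fitted by least squares;
      2. divide by the per-pixel temporal standard deviation.
    """
    frames = np.asarray(frames, dtype=np.float64)
    n = frames.shape[0]
    flat = frames.reshape(n, -1)                    # (time, pixels)
    t = np.arange(n, dtype=np.float64)
    t_c = (t - t.mean())[:, None]                   # centered time index
    mean = flat.mean(axis=0, keepdims=True)
    # Closed-form least-squares slope for each pixel.
    slope = (t_c * (flat - mean)).sum(axis=0, keepdims=True) / (t_c ** 2).sum()
    detrended = flat - mean - slope * t_c
    normalized = detrended / (detrended.std(axis=0, keepdims=True) + eps)
    return normalized.reshape(frames.shape)


def synthetic_clip(n_frames=300, fps=30.0):
    """Toy 2x2 'video': weak 1.2 Hz pulse on top of a strong brightness ramp."""
    t = np.arange(n_frames) / fps
    pulse = 0.5 * np.sin(2 * np.pi * 1.2 * t)       # about 72 beats per minute
    drift = 20.0 * t                                # slow lighting change
    trace = 100.0 + drift + pulse
    frames = trace[:, None, None] * np.ones((1, 2, 2))
    return frames, pulse


if __name__ == "__main__":
    frames, pulse = synthetic_clip()
    tn = temporal_normalize(frames)
    corr = np.corrcoef(tn[:, 0, 0], pulse)[0, 1]
    print(f"correlation between normalized trace and pulse: {corr:.3f}")
```

On this toy clip the normalized trace correlates strongly with the injected pulse even though the brightness ramp is far larger than the pulse amplitude, which is the kind of long-term artifact the abstract says consecutive-frame differences miss.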
Related papers
- RhythmFormer: Extracting rPPG Signals Based on Hierarchical Temporal Periodic Transformer [17.751885452773983]
We propose a fully end-to-end transformer-based method for extracting rPPG signals by explicitly leveraging the quasi-periodic nature of rPPG.
A fusion stem is proposed to guide self-attention to rPPG features effectively, and it can be easily transferred to existing methods to enhance their performance significantly.
arXiv Detail & Related papers (2024-02-20T07:56:02Z)
- Tiny-PPG: A Lightweight Deep Neural Network for Real-Time Detection of Motion Artifacts in Photoplethysmogram Signals on Edge Devices [6.352499671581954]
Photoplethysmogram signals are easily contaminated by motion artifacts in real-world settings.
This study proposed a lightweight deep neural network, called Tiny-PPG, for accurate and real-time PPG artifact segmentation on IoT devices.
Tiny-PPG was successfully deployed on an STM32 embedded system for real-time PPG artifact detection.
arXiv Detail & Related papers (2023-05-05T06:17:57Z)
- PhysFormer++: Facial Video-based Physiological Measurement with SlowFast Temporal Difference Transformer [76.40106756572644]
Recent deep learning approaches focus on mining subtle clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose two end-to-end video transformer based architectures, PhysFormer and PhysFormer++, to adaptively aggregate both local and global features for rPPG representation enhancement.
Comprehensive experiments are performed on four benchmark datasets to show our superior performance on both intra-dataset and cross-dataset testing.
arXiv Detail & Related papers (2023-02-07T15:56:03Z)
- Learning Motion-Robust Remote Photoplethysmography through Arbitrary Resolution Videos [31.512551653273373]
In real-world long-term health monitoring scenarios, the distance of participants and their head movements usually vary over time, resulting in inaccurate rPPG measurement.
Different from previous rPPG models designed for a constant distance between camera and participants, in this paper, we propose two plug-and-play blocks (i.e., physiological signal feature extraction block (PFE) and temporal face alignment block (TFA)) to alleviate the degradation caused by changing distance and head motion.
arXiv Detail & Related papers (2022-11-30T11:50:08Z)
- Multi-Head Cross-Attentional PPG and Motion Signal Fusion for Heart Rate Estimation [2.839269856680851]
We present a new deep learning model, PULSE, which exploits temporal convolutions and multi-head cross-attention to improve sensor fusion's effectiveness.
We evaluate the performance of PULSE on three publicly available datasets, reducing the mean absolute error by 7.56%.
arXiv Detail & Related papers (2022-10-14T08:07:53Z)
- FAMLP: A Frequency-Aware MLP-Like Architecture For Domain Generalization [73.41395947275473]
We propose a novel frequency-aware architecture, in which the domain-specific features are filtered out in the transformed frequency domain.
Experiments on three benchmarks demonstrate significant performance, outperforming the state-of-the-art methods by a margin of 3%, 4% and 9%, respectively.
arXiv Detail & Related papers (2022-03-24T07:26:29Z)
- PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z)
- Multiscale Spatio-Temporal Graph Neural Networks for 3D Skeleton-Based Motion Prediction [92.16318571149553]
We propose a multiscale spatio-temporal graph neural network (MST-GNN) to predict future 3D skeleton-based human poses.
The MST-GNN outperforms state-of-the-art methods in both short and long-term motion prediction.
arXiv Detail & Related papers (2021-08-25T14:05:37Z)
- TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face Presentation Attack Detection [53.98866801690342]
3D mask face presentation attack detection (PAD) plays a vital role in securing face recognition systems from 3D mask attacks.
We propose a pure rPPG transformer (TransRPPG) framework for learning intrinsic liveness representation efficiently.
Our TransRPPG is lightweight and efficient (with only 547K parameters and 763M FLOPs), which is promising for mobile-level applications.
arXiv Detail & Related papers (2021-04-15T12:33:13Z)
- Non-contact PPG Signal and Heart Rate Estimation with Multi-hierarchical Convolutional Network [12.119293125608976]
Heart rate (HR) is an important physiological parameter of the human body.
This study presents an efficient multi-hierarchical convolutional network that can estimate HR from face video clips.
arXiv Detail & Related papers (2021-04-06T03:04:27Z)
- Temporal Pyramid Network for Action Recognition [129.12076009042622]
We propose a generic Temporal Pyramid Network (TPN) at the feature-level, which can be flexibly integrated into 2D or 3D backbone networks.
TPN shows consistent improvements over other challenging baselines on several action recognition datasets.
arXiv Detail & Related papers (2020-04-07T17:17:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.