Non-contact PPG Signal and Heart Rate Estimation with Multi-hierarchical
Convolutional Network
- URL: http://arxiv.org/abs/2104.02260v2
- Date: Fri, 21 Apr 2023 15:03:09 GMT
- Title: Non-contact PPG Signal and Heart Rate Estimation with Multi-hierarchical
Convolutional Network
- Authors: Bin Li, Panpan Zhang, Jinye Peng, Hong Fu
- Abstract summary: Heart rate (HR) is an important physiological parameter of the human body.
This study presents an efficient multi-hierarchical spatio-temporal convolutional network that can estimate HR from face video clips.
- Score: 12.119293125608976
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Heartbeat rhythm and heart rate (HR) are important physiological parameters
of the human body. This study presents an efficient multi-hierarchical
spatio-temporal convolutional network that can quickly estimate remote
physiological (rPPG) signal and HR from face video clips. First, the facial
color distribution characteristics are extracted using a low-level face feature
generation (LFFG) module. Then, the three-dimensional (3D) spatio-temporal
stack convolution module (STSC) and multi-hierarchical feature fusion module
(MHFF) are used to strengthen the spatio-temporal correlation of multi-channel
features. In the MHFF, sparse optical flow is used to capture the tiny motion
information of faces between frames and generate a self-adaptive region of
interest (ROI) skin mask. Finally, the signal prediction module (SP) is used to
extract the estimated rPPG signal. The heart rate estimation results show that
the proposed network outperforms state-of-the-art methods on three
datasets, 1) UBFC-RPPG, 2) COHFACE, 3) our dataset, with mean absolute
errors (MAE) of 2.15, 5.57, and 1.75 beats per minute (bpm), respectively.
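Since the abstract names the four modules but gives no layer-level details, the following is a minimal sketch, not the published architecture: it chains assumed LFFG, STSC, MHFF, and SP implementations with illustrative channel counts, kernel sizes, and activations, feeds the ROI skin mask in as a ready-made tensor (the paper derives it from sparse optical flow), and adds an assumed FFT-based readout of HR from the estimated rPPG trace.

```python
import numpy as np
import torch
import torch.nn as nn

# Minimal sketch of the four named modules (LFFG -> STSC -> MHFF -> SP).
# All channel counts, kernel sizes, activations, and the way the ROI skin
# mask enters the fusion are illustrative assumptions, not the authors'
# exact configuration.

class LFFG(nn.Module):
    """Low-level face feature generation: per-frame colour features."""
    def __init__(self, out_ch=16):
        super().__init__()
        # Spatial-only kernel (1, 5, 5): each frame is filtered independently.
        self.conv = nn.Conv3d(3, out_ch, kernel_size=(1, 5, 5), padding=(0, 2, 2))

    def forward(self, x):                # x: (B, 3, T, H, W) face video clip
        return torch.relu(self.conv(x))

class STSC(nn.Module):
    """3D spatio-temporal stacked convolutions."""
    def __init__(self, ch=16):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(ch, ch, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(ch, ch, kernel_size=3, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.block(x)

class MHFF(nn.Module):
    """Multi-hierarchical feature fusion; the paper derives the ROI skin mask
    from sparse optical flow, here it is simply passed in as a tensor."""
    def __init__(self, ch=16):
        super().__init__()
        self.fuse = nn.Conv3d(2 * ch, ch, kernel_size=1)

    def forward(self, low, high, roi_mask):
        # roi_mask: (B, 1, T, H, W), broadcast over the channel dimension.
        x = torch.cat([low * roi_mask, high], dim=1)
        return torch.relu(self.fuse(x))

class SP(nn.Module):
    """Signal prediction: collapse space, keep time, output the rPPG trace."""
    def __init__(self, ch=16):
        super().__init__()
        self.proj = nn.Conv1d(ch, 1, kernel_size=1)

    def forward(self, x):                # x: (B, C, T, H, W)
        x = x.mean(dim=(3, 4))           # spatial average -> (B, C, T)
        return self.proj(x).squeeze(1)   # (B, T) estimated rPPG signal

def hr_from_rppg(rppg, fps=25.0):
    """Assumed post-processing (not detailed in the abstract): report the
    dominant frequency of the rPPG trace in the 0.7-4 Hz band as HR in bpm."""
    sig = rppg.detach().numpy().ravel()
    spectrum = np.abs(np.fft.rfft(sig - sig.mean()))
    freqs = np.fft.rfftfreq(sig.size, d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 4.0)
    return 60.0 * freqs[band][spectrum[band].argmax()]

# Wire the modules together on a dummy 64-frame, 64x64 clip.
clip = torch.randn(1, 3, 64, 64, 64)     # (B, RGB, T, H, W)
mask = torch.ones(1, 1, 64, 64, 64)      # placeholder ROI skin mask
low = LFFG()(clip)
high = STSC()(low)
rppg = SP()(MHFF()(low, high, mask))
print(rppg.shape, hr_from_rppg(rppg))    # torch.Size([1, 64]) and an HR in bpm
```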
Related papers
- FactorizePhys: Matrix Factorization for Multidimensional Attention in Remote Physiological Sensing [10.81951503398909]
The Factorized Self-Attention Module (FSAM) computes multidimensional attention from voxel embeddings using nonnegative matrix factorization.
Our approach adeptly factorizes voxel embeddings to achieve comprehensive spatial, temporal, and channel attention, enhancing the performance of generic signal extraction.
FactorizePhys is an end-to-end 3D-CNN architecture for estimating blood volume pulse signals from raw video frames.
arXiv Detail & Related papers (2024-11-03T12:22:58Z) - RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory Waveform Estimation from PPG Signals [3.306437812367815]
We propose RespDiff, an end-to-end multi-scale RNN diffusion model for respiratory waveform estimation from PPG signals.
The model employs multi-scale encoders to extract features at different resolutions and a bidirectional RNN to process PPG signals and extract the respiratory waveform.
Experiments conducted on the BIDMC dataset demonstrate that RespDiff outperforms notable previous works, achieving a mean absolute error (MAE) of 1.18 bpm for RR estimation.
arXiv Detail & Related papers (2024-10-06T05:54:49Z) - PhysFormer++: Facial Video-based Physiological Measurement with SlowFast
Temporal Difference Transformer [76.40106756572644]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields.
In this paper, we propose two end-to-end video transformer based architectures, PhysFormer and PhysFormer++, to adaptively aggregate both local and global features for rPPG representation enhancement.
Comprehensive experiments are performed on four benchmark datasets to show our superior performance on both intra-dataset and cross-dataset testing.
arXiv Detail & Related papers (2023-02-07T15:56:03Z) - Multi-Head Cross-Attentional PPG and Motion Signal Fusion for Heart Rate
Estimation [2.839269856680851]
We present a new deep learning model, PULSE, which exploits temporal convolutions and multi-head cross-attention to improve the effectiveness of sensor fusion (a minimal cross-attention sketch appears after this list).
We evaluate the performance of PULSE on three publicly available datasets, reducing the mean absolute error by 7.56%.
arXiv Detail & Related papers (2022-10-14T08:07:53Z) - MixSTE: Seq2seq Mixed Spatio-Temporal Encoder for 3D Human Pose
Estimation in Video [75.23812405203778]
Recent solutions have been introduced to estimate 3D human pose from a 2D keypoint sequence by considering body joints among all frames globally to learn spatio-temporal correlation.
We propose MixSTE, which has a temporal transformer block to separately model the temporal motion of each joint and a spatial transformer block to learn inter-joint spatial correlation.
In addition, the network output is extended from the central frame to the entire frames of the input video, improving the coherence between the input and output sequences.
arXiv Detail & Related papers (2022-03-02T04:20:59Z) - PhysFormer: Facial Video-based Physiological Measurement with Temporal
Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z) - Video-based Remote Physiological Measurement via Cross-verified Feature
Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features with non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals, such as average HR values and rPPG signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z) - AutoHR: A Strong End-to-end Baseline for Remote Heart Rate Measurement
with Neural Searching [76.4844593082362]
We investigate the reason why existing end-to-end networks perform poorly in challenging conditions and establish a strong baseline for remote HR measurement with neural architecture search (NAS).
Comprehensive experiments are performed on three benchmark datasets on both intra-dataset and cross-dataset testing.
arXiv Detail & Related papers (2020-04-26T05:43:21Z) - FPCR-Net: Feature Pyramidal Correlation and Residual Reconstruction for
Optical Flow Estimation [72.41370576242116]
We propose a semi-supervised Feature Pyramidal Correlation and Residual Reconstruction Network (FPCR-Net) for optical flow estimation from frame pairs.
It consists of two main modules: pyramid correlation mapping and residual reconstruction.
Experimental results show that the proposed scheme achieves state-of-the-art performance, improving the average end-point error (AEE) by 0.80, 1.15, and 0.10 against competing baseline methods.
arXiv Detail & Related papers (2020-01-17T07:13:51Z)
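For the cross-attentional PPG and motion fusion described in the PULSE entry above, the following is a minimal sketch, not the authors' implementation: it assumes a single-channel PPG stream and a 3-axis accelerometer stream, uses torch.nn.MultiheadAttention as a stand-in for the paper's cross-attention block, and the encoder widths, head count, pooling, and regression head are illustrative choices.

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Hypothetical PULSE-style fusion: PPG features attend to motion features."""
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        # 1-D temporal convolutions encode each modality separately.
        self.ppg_encoder = nn.Conv1d(1, d_model, kernel_size=7, padding=3)
        self.acc_encoder = nn.Conv1d(3, d_model, kernel_size=7, padding=3)
        # Cross-attention: query = PPG features, key/value = accelerometer features.
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, 1)   # regress heart rate in bpm

    def forward(self, ppg, acc):
        # ppg: (batch, 1, T), acc: (batch, 3, T)
        q = self.ppg_encoder(ppg).transpose(1, 2)    # (batch, T, d_model)
        kv = self.acc_encoder(acc).transpose(1, 2)   # (batch, T, d_model)
        fused, _ = self.cross_attn(q, kv, kv)        # (batch, T, d_model)
        return self.head(fused.mean(dim=1))          # (batch, 1)

# Example: an 8-second window sampled at 25 Hz (T = 200), batch of 2.
model = CrossModalFusion()
hr = model(torch.randn(2, 1, 200), torch.randn(2, 3, 200))
print(hr.shape)   # torch.Size([2, 1])
```

Attending from the PPG query to the accelerometer keys lets motion-correlated components modulate the PPG representation, which is the intuition behind using cross-attention for sensor fusion.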
This list is automatically generated from the titles and abstracts of the papers on this site.