Non-Contact NIR PPG Sensing through Large Sequence Signal Regression
- URL: http://arxiv.org/abs/2311.11757v1
- Date: Mon, 20 Nov 2023 13:34:51 GMT
- Title: Non-Contact NIR PPG Sensing through Large Sequence Signal Regression
- Authors: Timothy Hanley, Dara Golden, Robyn Maxwell, Ashkan Parsi, Joseph
Lemley
- Abstract summary: Non-Contact sensing is an emerging technology with applications across many industries, from driver monitoring in vehicles to patient monitoring in healthcare.
Current state-of-the-art implementations focus on RGB video, but this struggles in varying/noisy light conditions and is almost completely infeasible in the dark. Near Infra-Red (NIR) video, however, does not suffer from these constraints.
This paper aims to demonstrate the effectiveness of an alternative Convolution Attention Network (CAN) architecture to regress a photoplethysmography (PPG) signal from a sequence of NIR frames.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Non-Contact sensing is an emerging technology with applications
across many industries, from driver monitoring in vehicles to patient
monitoring in healthcare. Current state-of-the-art implementations focus on RGB
video, but this struggles in varying/noisy light conditions and is almost
completely infeasible in the dark. Near Infra-Red (NIR) video, however, does
not suffer from these constraints. This paper aims to demonstrate the
effectiveness of an alternative Convolution Attention Network (CAN)
architecture to regress a photoplethysmography (PPG) signal from a sequence of
NIR frames. A combination
of two publicly available datasets, which is split into train and test sets, is
used for training the CAN. This combined dataset is augmented to reduce
overfitting to the 'normal' 60 - 80 bpm heart rate range by providing the full
range of heart rates along with corresponding videos for each subject. This
CAN, when applied to video cropped to the subject's head, achieved a Mean
Absolute Error (MAE) of just 0.99 bpm, demonstrating its effectiveness on NIR
video and the architecture's ability to regress an accurate signal output.
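The reported MAE can be illustrated with a short, self-contained sketch (not the authors' code): heart rate is read off a PPG trace as the dominant peak of its power spectrum, and the MAE is then taken between predicted and reference rates in bpm. The sampling rate, duration, and search band below are illustrative assumptions.

```python
import numpy as np

def estimate_bpm(ppg, fs):
    """Estimate heart rate (bpm) from a PPG trace as the dominant
    frequency of its power spectrum."""
    ppg = ppg - ppg.mean()
    freqs = np.fft.rfftfreq(len(ppg), d=1.0 / fs)
    power = np.abs(np.fft.rfft(ppg)) ** 2
    # Restrict the search to a plausible heart-rate band (40-180 bpm).
    band = (freqs >= 40 / 60) & (freqs <= 180 / 60)
    peak = freqs[band][np.argmax(power[band])]
    return peak * 60.0

def mae_bpm(predicted, reference):
    """Mean absolute error between predicted and reference heart rates."""
    return float(np.mean(np.abs(np.asarray(predicted) - np.asarray(reference))))

# Synthetic 30 s PPG trace at 30 fps with a 72 bpm pulse.
fs, bpm = 30.0, 72.0
t = np.arange(0, 30, 1 / fs)
ppg = np.sin(2 * np.pi * (bpm / 60) * t)
print(estimate_bpm(ppg, fs))  # ≈ 72 bpm
```

In practice the regressed signal is noisier than this sinusoid, so the spectral peak is usually taken over sliding windows before averaging.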
Related papers
- SiNC+: Adaptive Camera-Based Vitals with Unsupervised Learning of Periodic Signals [6.458510829614774]
We present the first non-contrastive unsupervised learning framework for signal regression.
We find that encouraging sparse power spectra within normal physiological bandlimits and variance over batches of power spectra is sufficient for learning periodic signals.
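The bandlimit prior described above can be sketched as a simple spectral penalty (an illustrative reading, not the SiNC+ loss itself): power falling outside a plausible pulse band is penalised relative to total power, so minimising it pushes the network toward periodic, physiologically plausible outputs. The band edges and test signals are assumptions.

```python
import numpy as np

def bandlimit_penalty(signal, fs, low_hz=0.66, high_hz=3.0):
    """Fraction of spectral power outside the physiological pulse band
    (0.66-3.0 Hz, i.e. roughly 40-180 bpm)."""
    signal = signal - signal.mean()
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2
    in_band = (freqs >= low_hz) & (freqs <= high_hz)
    return float(power[~in_band].sum() / (power.sum() + 1e-12))

fs = 30.0
t = np.arange(0, 10, 1 / fs)
pulse = np.sin(2 * np.pi * 1.2 * t)      # 72 bpm pulse: near-zero penalty
rng = np.random.default_rng(0)
noise = rng.standard_normal(t.size)       # broadband noise: large penalty
print(bandlimit_penalty(pulse, fs), bandlimit_penalty(noise, fs))
```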
arXiv Detail & Related papers (2024-04-20T19:17:40Z)
- Temporal Shift -- Multi-Objective Loss Function for Improved Anomaly Fall Detection [3.813649699234981]
We propose a new multi-objective loss function called Temporal Shift, which aims to predict both future and reconstructed frames within a window of sequential frames.
With significant improvement across different models, this approach has the potential to be widely adopted and improve anomaly detection capabilities in other settings besides fall detection.
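The two objectives above can be combined into a single loss along these lines (a minimal sketch; the exact weighting and window layout in the paper may differ): one term scores reconstruction of the input window, the other scores prediction of the future frame.

```python
import numpy as np

def temporal_shift_loss(recon, pred_future, window, future, alpha=0.5):
    """Weighted sum of a reconstruction term over the input window and a
    prediction term over the future frame; alpha is an assumed weighting."""
    recon_err = np.mean((recon - window) ** 2)
    pred_err = np.mean((pred_future - future) ** 2)
    return float(alpha * recon_err + (1.0 - alpha) * pred_err)

# Toy example: a window of 8 "frames" of 4x4 pixels plus its next frame.
window = np.zeros((8, 4, 4))
future = np.ones((4, 4))
perfect = temporal_shift_loss(window, future, window, future)
noisy = temporal_shift_loss(window + 0.1, future, window, future)
print(perfect, noisy)  # 0.0 for perfect outputs, ≈ 0.005 for the noisy ones
```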
arXiv Detail & Related papers (2023-11-06T04:29:12Z) - Unsupervised Denoising for Signal-Dependent and Row-Correlated Imaging Noise [54.0185721303932]
We present the first fully unsupervised deep learning-based denoiser capable of handling imaging noise that is row-correlated.
Our approach uses a Variational Autoencoder with a specially designed autoregressive decoder.
Our method does not require a pre-trained noise model and can be trained from scratch using unpaired noisy data.
arXiv Detail & Related papers (2023-10-11T20:48:20Z) - Contrast-Phys+: Unsupervised and Weakly-supervised Video-based Remote
Physiological Measurement via Spatiotemporal Contrast [22.742875409103164]
We propose Contrast-Phys+, a method that can be trained in both unsupervised and weakly-supervised settings.
We employ a 3DCNN model to generate multiple spatiotemporal rPPG signals and incorporate prior knowledge of rPPG into a contrastive loss function.
Contrast-Phys+ outperforms the state-of-the-art supervised methods, even when using partially available or misaligned GT signals.
arXiv Detail & Related papers (2023-09-13T12:50:21Z) - Progressive Fourier Neural Representation for Sequential Video
Compilation [75.43041679717376]
Motivated by continual learning, this work investigates how to accumulate and transfer neural implicit representations for multiple complex video data over sequential encoding sessions.
We propose a novel method, Progressive Fourier Neural Representation (PFNR), that aims to find an adaptive and compact sub-module in Fourier space to encode videos in each training session.
We validate our PFNR method on the UVG8/17 and DAVIS50 video sequence benchmarks and achieve impressive performance gains over strong continual learning baselines.
arXiv Detail & Related papers (2023-06-20T06:02:19Z) - Non-Contrastive Unsupervised Learning of Physiological Signals from
Video [4.8327232174895745]
We present the first non-contrastive unsupervised learning framework for signal regression to break free from labelled video data.
With minimal assumptions of periodicity and finite bandwidth, our approach is capable of discovering blood volume pulse directly from unlabelled videos.
arXiv Detail & Related papers (2023-03-14T14:34:51Z) - Scalable Neural Video Representations with Learnable Positional Features [73.51591757726493]
We show how to train neural representations with learnable positional features (NVP) that effectively amortize a video as latent codes.
We demonstrate the superiority of NVP on the popular UVG benchmark; compared with prior arts, NVP not only trains 2 times faster (less than 5 minutes) but also exceeds their encoding quality: 34.07 → 34.57 (measured with the PSNR metric).
arXiv Detail & Related papers (2022-10-13T08:15:08Z) - Spatial-Temporal Frequency Forgery Clue for Video Forgery Detection in
VIS and NIR Scenario [87.72258480670627]
Existing face forgery detection methods based on frequency domain find that the GAN forged images have obvious grid-like visual artifacts in the frequency spectrum compared to the real images.
This paper proposes a Discrete Cosine Transform-based Forgery Clue Augmentation Network (FCAN-DCT) to achieve a more comprehensive spatial-temporal feature representation.
arXiv Detail & Related papers (2022-07-05T09:27:53Z) - MFGNet: Dynamic Modality-Aware Filter Generation for RGB-T Tracking [72.65494220685525]
We propose a new dynamic modality-aware filter generation module (named MFGNet) to boost the message communication between visible and thermal data.
We generate dynamic modality-aware filters with two independent networks. The visible and thermal filters will be used to conduct a dynamic convolutional operation on their corresponding input feature maps respectively.
To address issues caused by heavy occlusion, fast motion, and out-of-view targets, we propose to conduct a joint local and global search by exploiting a new direction-aware target-driven attention mechanism.
arXiv Detail & Related papers (2021-07-22T03:10:51Z) - Learning for Video Compression with Recurrent Auto-Encoder and Recurrent
Probability Model [164.7489982837475]
This paper proposes a Recurrent Learned Video Compression (RLVC) approach with a Recurrent Auto-Encoder (RAE) and a Recurrent Probability Model (RPM).
The RAE employs recurrent cells in both the encoder and decoder to exploit the temporal correlation among video frames.
Our approach achieves the state-of-the-art learned video compression performance in terms of both PSNR and MS-SSIM.
arXiv Detail & Related papers (2020-06-24T08:46:33Z) - Infrared and 3D skeleton feature fusion for RGB-D action recognition [0.30458514384586394]
We propose a modular network combining skeleton and infrared data.
A 2D convolutional network (CNN) is used as a pose module to extract features from skeleton data.
A 3D CNN is used as an infrared module to extract visual cues from videos.
arXiv Detail & Related papers (2020-02-28T17:42:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.