Related papers: The Benefit of Distraction: Denoising Remote Vitals Measurements using Inverse Attention

URL: http://arxiv.org/abs/2010.07770v1
Date: Wed, 14 Oct 2020 13:51:33 GMT
Title: The Benefit of Distraction: Denoising Remote Vitals Measurements using Inverse Attention
Authors: Ewa Nowara, Daniel McDuff, Ashok Veeraraghavan
Abstract summary: We present an approach that exploits the idea that statistics of noise may be shared between the regions that contain the signal of interest and those that do not. Our technique uses the inverse of an attention mask to generate a noise estimate that is then used to denoise temporal observations. We show that this approach produces state-of-the-art results, increasing the signal-to-noise ratio by up to 5.8 dB.
Score: 25.285955440420594
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Attention is a powerful concept in computer vision. End-to-end networks that learn to focus selectively on regions of an image or video often perform strongly. However, other image regions, while not necessarily containing the signal of interest, may contain useful context. We present an approach that exploits the idea that statistics of noise may be shared between the regions that contain the signal of interest and those that do not. Our technique uses the inverse of an attention mask to generate a noise estimate that is then used to denoise temporal observations. We apply this to the task of camera-based physiological measurement. A convolutional attention network is used to learn which regions of a video contain the physiological signal and generate a preliminary estimate. A noise estimate is obtained by using the pixel intensities in the inverse regions of the learned attention mask; this in turn is used to refine the estimate of the physiological signal. We perform experiments on two large benchmark datasets and show that this approach produces state-of-the-art results, increasing the signal-to-noise ratio by up to 5.8 dB, reducing heart rate and breathing rate estimation error by as much as 30%, recovering subtle pulse waveform dynamics, and generalizing from RGB to NIR videos without retraining.
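Illustrative sketch (not the authors' implementation): the inverse-attention idea in the abstract can be shown on synthetic data with a few lines of NumPy. Everything here is an assumption made for illustration: the function name inverse_attention_denoise, the fixed binary attention mask (the paper learns its mask with a convolutional attention network), and the least-squares projection used to remove the noise estimate, which is a simple stand-in for the learned refinement step described in the abstract.

```python
import numpy as np


def inverse_attention_denoise(video, attention):
    """Toy inverse-attention denoising (hypothetical helper, not from the paper).

    video:     array of shape (T, H, W) with pixel intensities over time
    attention: array of shape (H, W) with weights in [0, 1]
    Returns (refined, preliminary, noise) as zero-mean 1-D arrays of length T.
    """
    inverse = 1.0 - attention  # inverse attention mask
    if attention.sum() <= 0 or inverse.sum() <= 0:
        raise ValueError("both the mask and its inverse need non-zero weight")

    # Attention-weighted spatial average: preliminary signal estimate.
    preliminary = (video * attention).sum(axis=(1, 2)) / attention.sum()
    # Inverse-attention-weighted spatial average: noise estimate.
    noise = (video * inverse).sum(axis=(1, 2)) / inverse.sum()

    preliminary = preliminary - preliminary.mean()
    noise = noise - noise.mean()

    # Stand-in refinement (assumption): subtract the least-squares
    # projection of the preliminary estimate onto the noise estimate.
    beta = np.dot(preliminary, noise) / np.dot(noise, noise)
    refined = preliminary - beta * noise
    return refined, preliminary, noise


# Synthetic example: a 1.2 Hz "pulse" visible only in a 4x4 patch,
# plus a 0.3 Hz illumination change shared by every pixel.
t = np.arange(300) / 30.0                      # 10 s at 30 frames per second
pulse = np.sin(2 * np.pi * 1.2 * t)
illumination = 2.0 * np.sin(2 * np.pi * 0.3 * t)

video = 100.0 + np.tile(illumination[:, None, None], (1, 8, 8))
video[:, :4, :4] += pulse[:, None, None]

attention = np.zeros((8, 8))
attention[:4, :4] = 1.0                        # assumed mask; learned in the paper

refined, preliminary, noise = inverse_attention_denoise(video, attention)
```

In this toy setting the shared illumination change appears in both the attended and the non-attended pixels, so the estimate taken from the inverse mask can be used to cancel it from the preliminary signal. Real videos contain noise that is only partly shared across regions, which is why the paper uses a learned refinement rather than a single projection.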

Related papers

Time-Series U-Net with Recurrence for Noise-Robust Imaging Photoplethysmography [14.749406169315554]
The imaging photoplethysmography system consists of three modules: face and landmark detection, time-series extraction, and pulse signal/pulse rate estimation. The pulse signal estimation module, which we call TURNIP, allows the system to faithfully reconstruct the underlying pulse signal waveform. Our algorithm provides reliable heart rate estimates without the need for specialized sensors or contact with the skin.
arXiv Detail & Related papers (2025-03-21T17:52:33Z)
Recovering Pulse Waves from Video Using Deep Unrolling and Deep Equilibrium Models [45.94962431110573]
Camera-based monitoring of vital signs, also known as imaging photoplethysmography (iPPG), has seen applications in driver-monitoring, affective computing, and more. We introduce methods that combine signal processing and deep learning in an inverse problem.
arXiv Detail & Related papers (2025-03-21T16:11:21Z)
Learning to Enhance Aperture Phasor Field for Non-Line-of-Sight Imaging [22.365437882740657]
This paper aims to facilitate more practical NLOS imaging by reducing the number of samplings and scan areas. We leverage a denoising autoencoder scheme to acquire rich and noise-robust representations in the measurement space. We introduce a phasor-based pipeline designed to limit the spectrum of our network to the frequency range of interest.
arXiv Detail & Related papers (2024-07-26T07:57:07Z)
Wavelet-based Bi-dimensional Aggregation Network for SAR Image Change Detection [53.842568573251214]
Experimental results on three SAR datasets demonstrate that our WBANet significantly outperforms contemporary state-of-the-art methods. Our WBANet achieves a percentage of correct classification (PCC) of 98.33%, 96.65%, and 96.62% on the respective datasets.
arXiv Detail & Related papers (2024-07-18T04:36:10Z)
Rethinking Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising [94.09442506816724]
Blind-spot networks (BSN) have been prevalent neural architectures in self-supervised image denoising (SSID). We build a Transformer-based Blind-Spot Network (TBSN) which shows strong local fitting and global perspective abilities.
arXiv Detail & Related papers (2024-04-11T15:39:10Z)
Compute-first optical detection for noise-resilient visual perception [0.5325390073522079]
We propose a concept of optical signal processing before detection to address this issue. We demonstrate that spatially redistributing optical signals through a properly designed linear transformer can enhance the detection noise resilience of visual perception tasks. This compute-first detection scheme can pave the way for advancing infrared machine vision technologies widely used for industrial and defense applications.
arXiv Detail & Related papers (2024-03-14T17:51:38Z)
Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition [56.91538445510214]
We present a learning algorithm for human activity recognition in videos. Our approach is designed for UAV videos, which are mainly acquired from obliquely placed dynamic cameras. We conduct extensive experiments on the UAV Human dataset and the NEC Drone dataset.
arXiv Detail & Related papers (2022-09-15T22:16:52Z)
DRNet: Decomposition and Reconstruction Network for Remote Physiological Measurement [39.73408626273354]
Existing methods are generally divided into two groups. The first focuses on mining the subtle blood volume pulse (BVP) signals from face videos, but seldom explicitly models the noises that dominate face video content. The second focuses on modeling noisy data directly, resulting in suboptimal performance due to the lack of regularity of these severe random noises.
arXiv Detail & Related papers (2022-06-12T07:40:10Z)
Zero-shot Blind Image Denoising via Implicit Neural Representations [77.79032012459243]
We propose an alternative denoising strategy that leverages the architectural inductive bias of implicit neural representations (INRs). We show that our method outperforms existing zero-shot denoising methods under an extensive set of low-noise or real-noise scenarios.
arXiv Detail & Related papers (2022-04-05T12:46:36Z)
Deep Impulse Responses: Estimating and Parameterizing Filters with Deep Networks [76.830358429947]
Impulse response estimation in high noise and in-the-wild settings is a challenging problem. We propose a novel framework for parameterizing and estimating impulse responses based on recent advances in neural representation learning.
arXiv Detail & Related papers (2022-02-07T18:57:23Z)
Exploring Inter-frequency Guidance of Image for Lightweight Gaussian Denoising [1.52292571922932]
We propose a novel network architecture, denoted IGNet, to refine the frequency bands from low to high in a progressive manner. With this design, more inter-frequency prior and information are utilized, so the model size can be reduced while still preserving competitive results.
arXiv Detail & Related papers (2021-12-22T10:35:53Z)
Amplitude-Phase Recombination: Rethinking Robustness of Convolutional Neural Networks in Frequency Domain [31.182376196295365]
CNNs tend to converge at a local optimum that is closely related to the high-frequency components of the training images. The paper offers a new perspective on data augmentation, designed by re-combining the phase spectrum of the current image and the amplitude spectrum of the distracter image.
arXiv Detail & Related papers (2021-08-19T04:04:41Z)
Spatial-Phase Shallow Learning: Rethinking Face Forgery Detection in Frequency Domain [88.7339322596758]
We present a novel Spatial-Phase Shallow Learning (SPSL) method, which combines spatial image and phase spectrum to capture the up-sampling artifacts of face forgery. SPSL achieves state-of-the-art performance on cross-dataset evaluation as well as multi-class classification, and obtains comparable results on single-dataset evaluation.
arXiv Detail & Related papers (2021-03-02T16:45:08Z)
ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one. Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.