RhythmFormer: Extracting rPPG Signals Based on Hierarchical Temporal
Periodic Transformer
- URL: http://arxiv.org/abs/2402.12788v1
- Date: Tue, 20 Feb 2024 07:56:02 GMT
- Title: RhythmFormer: Extracting rPPG Signals Based on Hierarchical Temporal
Periodic Transformer
- Authors: Bochao Zou, Zizheng Guo, Jiansheng Chen, Huimin Ma
- Abstract summary: We propose a fully end-to-end transformer-based method for extracting r signals by explicitly leveraging the quasi-periodic nature of r periodicity.
A fusion stem is proposed to guide self-attention to r features effectively, and it can be easily transferred to existing methods to enhance their performance significantly.
- Score: 17.751885452773983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Remote photoplethysmography (rPPG) is a non-contact method for detecting
physiological signals based on facial videos, holding high potential in various
applications such as healthcare, affective computing, anti-spoofing, etc. Due
to the periodicity nature of rPPG, the long-range dependency capturing capacity
of the Transformer was assumed to be advantageous for such signals. However,
existing approaches have not conclusively demonstrated the superior performance
of Transformer over traditional convolutional neural network methods, this gap
may stem from a lack of thorough exploration of rPPG periodicity. In this
paper, we propose RhythmFormer, a fully end-to-end transformer-based method for
extracting rPPG signals by explicitly leveraging the quasi-periodic nature of
rPPG. The core module, Hierarchical Temporal Periodic Transformer,
hierarchically extracts periodic features from multiple temporal scales. It
utilizes dynamic sparse attention based on periodicity in the temporal domain,
allowing for fine-grained modeling of rPPG features. Furthermore, a fusion stem
is proposed to guide self-attention to rPPG features effectively, and it can be
easily transferred to existing methods to enhance their performance
significantly. RhythmFormer achieves state-of-the-art performance with fewer
parameters and reduced computational complexity in comprehensive experiments
compared to previous approaches. The codes are available at
https://github.com/zizheng-guo/RhythmFormer.
Related papers
- DAPE V2: Process Attention Score as Feature Map for Length Extrapolation [63.87956583202729]
We conceptualize attention as a feature map and apply the convolution operator to mimic the processing methods in computer vision.
The novel insight, which can be adapted to various attention-related models, reveals that the current Transformer architecture has the potential for further evolution.
arXiv Detail & Related papers (2024-10-07T07:21:49Z) - Reconstructing Richtmyer-Meshkov instabilities from noisy radiographs using low dimensional features and attention-based neural networks [3.6270672925388263]
A trained attention-based transformer network can robustly recover the complex topologies given by the Richtmyer-Meshkoff instability.
This approach is demonstrated on ICF-like double shell hydrodynamic simulations.
arXiv Detail & Related papers (2024-08-02T03:02:39Z) - GaitFormer: Revisiting Intrinsic Periodicity for Gait Recognition [6.517046095186713]
Gait recognition aims to distinguish different walking patterns by analyzing video-level human silhouettes, rather than relying on appearance information.
Previous research has primarily focused on extracting local or global-temporal representations, while overlooking the intrinsic periodic features of gait sequences.
We propose a plug-and-play strategy, called Temporal Periodic Alignment (TPA), which leverages the periodic nature and fine-grained temporal dependencies of gait patterns.
arXiv Detail & Related papers (2023-07-25T05:05:07Z) - Sequential Attention Source Identification Based on Feature
Representation [88.05527934953311]
This paper proposes a sequence-to-sequence based localization framework called Temporal-sequence based Graph Attention Source Identification (TGASI) based on an inductive learning idea.
It's worth mentioning that the inductive learning idea ensures that TGASI can detect the sources in new scenarios without knowing other prior knowledge.
arXiv Detail & Related papers (2023-06-28T03:00:28Z) - rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for
Remote Physiological Measurement [36.54109704201048]
Remote photoplethysmography (r-MAE) is an important technique for perceiving human vital signs.
In this paper, we develop a self-supervised framework for extracting inherent self-similar prior in physiological signals.
We also evaluate the proposed method on two public datasets, namely PURE and UBFC-r.
arXiv Detail & Related papers (2023-06-04T08:53:28Z) - Adaptive Spike-Like Representation of EEG Signals for Sleep Stages
Scoring [6.644008481573341]
We propose an adaptive scheme to encode, filter and accumulate the input signals and the weight features by the half-Gaussian probabilities of signal intensities.
Experiments on the largest public dataset against state-of-the-art methods validate the effectiveness of our proposed method and reveal promising future directions.
arXiv Detail & Related papers (2022-04-02T11:21:49Z) - PhysFormer: Facial Video-based Physiological Measurement with Temporal
Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle r clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z) - Signal Processing and Machine Learning Techniques for Terahertz Sensing:
An Overview [89.09270073549182]
Terahertz (THz) signal generation and radiation methods are shaping the future of wireless systems.
THz-specific signal processing techniques should complement this re-surged interest in THz sensing for efficient utilization of the THz band.
We present an overview of these techniques, with an emphasis on signal pre-processing.
We also address the effectiveness of deep learning techniques by exploring their promising sensing capabilities at the THz band.
arXiv Detail & Related papers (2021-04-09T01:38:34Z) - Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z) - ADRN: Attention-based Deep Residual Network for Hyperspectral Image
Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.