RhythmFormer: Extracting Patterned rPPG Signals based on Periodic Sparse Attention
- URL: http://arxiv.org/abs/2402.12788v3
- Date: Thu, 20 Feb 2025 12:02:11 GMT
- Title: RhythmFormer: Extracting Patterned rPPG Signals based on Periodic Sparse Attention
- Authors: Bochao Zou, Zizheng Guo, Jiansheng Chen, Junbao Zhuo, Weiran Huang, Huimin Ma
- Abstract summary: Remote photoplethysmography (rPPG) is a non-contact method for detecting physiological signals based on facial videos.
This paper proposes a periodic sparse attention mechanism based on temporal attention sparsity induced by periodicity.
It achieves state-of-the-art performance in both intra-dataset and cross-dataset evaluations.
- Score: 18.412642801957197
- Abstract: Remote photoplethysmography (rPPG) is a non-contact method for detecting physiological signals based on facial videos, holding high potential in various applications. Due to the periodic nature of rPPG signals, the long-range dependency capturing capacity of the transformer was assumed to be advantageous for such signals. However, existing methods have not conclusively demonstrated the superior performance of transformers over traditional convolutional neural networks. This may be attributed to the quadratic scaling of transformers with sequence length, resulting in coarse-grained feature extraction, which in turn affects robustness and generalization. To address this, this paper proposes a periodic sparse attention mechanism based on temporal attention sparsity induced by periodicity. A pre-attention stage is introduced before the conventional attention mechanism. This stage learns periodic patterns to filter out a large number of irrelevant attention computations, thus enabling fine-grained feature extraction. Moreover, to address the issue of fine-grained features being more susceptible to noise interference, a fusion stem is proposed to effectively guide self-attention towards rPPG features. It can be easily integrated into existing methods to enhance their performance. Extensive experiments show that the proposed method achieves state-of-the-art performance in both intra-dataset and cross-dataset evaluations. The code is available at https://github.com/zizheng-guo/RhythmFormer.
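The idea in the abstract of a pre-attention stage that discards most attention computations before the fine-grained attention runs can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a simple stand-in rule: the sequence is split into fixed-length temporal regions, a coarse region-to-region affinity is computed from mean-pooled queries and keys, and each query region attends only to its `top_k` highest-affinity key regions. The names `sparse_attention`, `region_len`, and `top_k` are hypothetical and chosen for illustration only.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sparse_attention(q, k, v, region_len, top_k):
    """Region-level top-k sparse attention (illustrative assumption,
    not the RhythmFormer code).

    q, k, v: float arrays of shape (T, d); T must be divisible by region_len.
    Returns the attention output of shape (T, d) and the indices of the
    key regions kept for each query region, shape (T // region_len, top_k).
    """
    T, d = q.shape
    n = T // region_len

    # Pre-attention stage: coarse affinity between mean-pooled regions.
    q_regions = q.reshape(n, region_len, d).mean(axis=1)
    k_regions = k.reshape(n, region_len, d).mean(axis=1)
    affinity = q_regions @ k_regions.T                  # (n, n)

    # Keep only the top_k most related key regions per query region.
    keep = np.argsort(-affinity, axis=1)[:, :top_k]     # (n, top_k)

    k_blocks = k.reshape(n, region_len, d)
    v_blocks = v.reshape(n, region_len, d)
    out = np.empty_like(q)
    for i in range(n):
        rows = slice(i * region_len, (i + 1) * region_len)
        ks = k_blocks[keep[i]].reshape(-1, d)           # gathered keys
        vs = v_blocks[keep[i]].reshape(-1, d)           # gathered values
        # Fine-grained attention restricted to the kept regions.
        weights = softmax(q[rows] @ ks.T / np.sqrt(d))
        out[rows] = weights @ vs
    return out, keep


rng = np.random.default_rng(0)
q = rng.standard_normal((16, 8))
k = rng.standard_normal((16, 8))
v = rng.standard_normal((16, 8))
out, keep = sparse_attention(q, k, v, region_len=4, top_k=2)
```

With `top_k` equal to the number of regions, the sketch reduces to ordinary dense attention. Smaller values cut the per-query cost from `T` keys to `top_k * region_len` keys, which is the kind of saving that makes finer temporal granularity affordable.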
Related papers
- DAPE V2: Process Attention Score as Feature Map for Length Extrapolation [63.87956583202729]
We conceptualize attention as a feature map and apply the convolution operator to mimic the processing methods in computer vision.
The novel insight, which can be adapted to various attention-related models, reveals that the current Transformer architecture has the potential for further evolution.
arXiv Detail & Related papers (2024-10-07T07:21:49Z) - Reconstructing Richtmyer-Meshkov instabilities from noisy radiographs using low dimensional features and attention-based neural networks [3.6270672925388263]
A trained attention-based transformer network can robustly recover the complex topologies given by the Richtmyer-Meshkov instability.
This approach is demonstrated on ICF-like double shell hydrodynamic simulations.
arXiv Detail & Related papers (2024-08-02T03:02:39Z) - GaitFormer: Revisiting Intrinsic Periodicity for Gait Recognition [6.517046095186713]
Gait recognition aims to distinguish different walking patterns by analyzing video-level human silhouettes, rather than relying on appearance information.
Previous research has primarily focused on extracting local or global-temporal representations, while overlooking the intrinsic periodic features of gait sequences.
We propose a plug-and-play strategy, called Temporal Periodic Alignment (TPA), which leverages the periodic nature and fine-grained temporal dependencies of gait patterns.
arXiv Detail & Related papers (2023-07-25T05:05:07Z) - Sequential Attention Source Identification Based on Feature Representation [88.05527934953311]
This paper proposes a sequence-to-sequence based localization framework called Temporal-sequence based Graph Attention Source Identification (TGASI) based on an inductive learning idea.
It's worth mentioning that the inductive learning idea ensures that TGASI can detect the sources in new scenarios without knowing other prior knowledge.
arXiv Detail & Related papers (2023-06-28T03:00:28Z) - rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for Remote Physiological Measurement [36.54109704201048]
Remote photoplethysmography (rPPG) is an important technique for perceiving human vital signs.
In this paper, we develop a self-supervised framework for extracting inherent self-similar prior in physiological signals.
We also evaluate the proposed method on two public datasets, namely PURE and UBFC-rPPG.
arXiv Detail & Related papers (2023-06-04T08:53:28Z) - Adaptive Spike-Like Representation of EEG Signals for Sleep Stages Scoring [6.644008481573341]
We propose an adaptive scheme to encode, filter and accumulate the input signals and the weight features by the half-Gaussian probabilities of signal intensities.
Experiments on the largest public dataset against state-of-the-art methods validate the effectiveness of our proposed method and reveal promising future directions.
arXiv Detail & Related papers (2022-04-02T11:21:49Z) - PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z) - Signal Processing and Machine Learning Techniques for Terahertz Sensing: An Overview [89.09270073549182]
Terahertz (THz) signal generation and radiation methods are shaping the future of wireless systems.
THz-specific signal processing techniques should complement this re-surged interest in THz sensing for efficient utilization of the THz band.
We present an overview of these techniques, with an emphasis on signal pre-processing.
We also address the effectiveness of deep learning techniques by exploring their promising sensing capabilities at the THz band.
arXiv Detail & Related papers (2021-04-09T01:38:34Z) - Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z) - ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one.
Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.