Related papers: RhythmFormer: Extracting rPPG Signals Based on Hierarchical Temporal Periodic Transformer

URL: http://arxiv.org/abs/2402.12788v1
Date: Tue, 20 Feb 2024 07:56:02 GMT
Title: RhythmFormer: Extracting rPPG Signals Based on Hierarchical Temporal Periodic Transformer
Authors: Bochao Zou, Zizheng Guo, Jiansheng Chen, Huimin Ma
Abstract summary: We propose a fully end-to-end transformer-based method for extracting rPPG signals by explicitly leveraging the quasi-periodic nature of rPPG. A fusion stem is proposed to guide self-attention to rPPG features effectively, and it can be easily transferred to existing methods to enhance their performance significantly.
Score: 17.751885452773983
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Remote photoplethysmography (rPPG) is a non-contact method for detecting physiological signals based on facial videos, holding high potential in various applications such as healthcare, affective computing, and anti-spoofing. Due to the periodic nature of rPPG, the long-range dependency capturing capacity of the Transformer was assumed to be advantageous for such signals. However, existing approaches have not conclusively demonstrated the superior performance of the Transformer over traditional convolutional neural network methods; this gap may stem from a lack of thorough exploration of rPPG periodicity. In this paper, we propose RhythmFormer, a fully end-to-end transformer-based method for extracting rPPG signals by explicitly leveraging the quasi-periodic nature of rPPG. The core module, Hierarchical Temporal Periodic Transformer, hierarchically extracts periodic features from multiple temporal scales. It utilizes dynamic sparse attention based on periodicity in the temporal domain, allowing for fine-grained modeling of rPPG features. Furthermore, a fusion stem is proposed to guide self-attention to rPPG features effectively, and it can be easily transferred to existing methods to enhance their performance significantly. RhythmFormer achieves state-of-the-art performance with fewer parameters and reduced computational complexity in comprehensive experiments compared to previous approaches. The code is available at https://github.com/zizheng-guo/RhythmFormer.
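The abstract mentions sparse attention that exploits periodicity in the temporal domain. The sketch below is only a rough illustration of that general idea, not the authors' module: it is a generic top-k sparse attention in NumPy, where each temporal token attends only to its `top_k` most similar tokens. The function name `topk_sparse_attention`, the toy sine/cosine tokens, and all parameter values are assumptions made for this example.

```python
import numpy as np

def topk_sparse_attention(q, k, v, top_k):
    """Generic top-k sparse attention (illustrative; not RhythmFormer's module).

    q, k, v: arrays of shape (T, d), one temporal token per row.
    Each query keeps only its top_k highest-scoring keys.
    """
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)                    # (T, T) scaled similarities
    # Column indices of the top_k largest scores in every row.
    keep = np.argpartition(scores, -top_k, axis=1)[:, -top_k:]
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, keep, 0.0, axis=1)       # 0 for kept keys, -inf otherwise
    masked = scores + mask
    masked -= masked.max(axis=1, keepdims=True)      # stabilise the softmax
    weights = np.exp(masked)                         # dropped keys become exactly 0
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, weights

# Toy quasi-periodic input: each token encodes the phase of a 10-step cycle.
T, period, top_k = 40, 10, 4
t = np.arange(T)
tokens = np.stack([np.sin(2 * np.pi * t / period),
                   np.cos(2 * np.pi * t / period)], axis=1)
out, weights = topk_sparse_attention(tokens, tokens, tokens, top_k)
```

In this toy setting the similarity between two tokens depends only on their phase difference, so the keys each query keeps are the time steps at the same phase of the cycle. That is one simple way a sparse attention pattern can follow the periodicity of a signal.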

Related papers

A Pre-Training and Adaptive Fine-Tuning Framework for Graph Anomaly Detection [67.77204352386897]
Graph anomaly detection (GAD) has garnered increasing attention in recent years, yet it remains challenging due to the scarcity of abnormal nodes and the high cost of label annotations. We propose PAF, a framework specifically designed for GAD that combines low- and high-pass filters in the pre-training phase to capture the full spectrum of frequency information in node features.
arXiv Detail & Related papers (2025-04-19T09:57:35Z)
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation [63.87956583202729]
We conceptualize attention as a feature map and apply the convolution operator to mimic the processing methods in computer vision. The novel insight, which can be adapted to various attention-related models, reveals that the current Transformer architecture has the potential for further evolution.
arXiv Detail & Related papers (2024-10-07T07:21:49Z)
PRformer: Pyramidal Recurrent Transformer for Multivariate Time Series Forecasting [82.03373838627606]
Self-attention mechanism in Transformer architecture requires positional embeddings to encode temporal order in time series prediction. We argue that this reliance on positional embeddings restricts the Transformer's ability to effectively represent temporal sequences. We present a model integrating PRE with a standard Transformer encoder, demonstrating state-of-the-art performance on various real-world datasets.
arXiv Detail & Related papers (2024-08-20T01:56:07Z)
Reconstructing Richtmyer-Meshkov instabilities from noisy radiographs using low dimensional features and attention-based neural networks [3.6270672925388263]
A trained attention-based transformer network can robustly recover the complex topologies given by the Richtmyer-Meshkov instability. This approach is demonstrated on ICF-like double shell hydrodynamic simulations.
arXiv Detail & Related papers (2024-08-02T03:02:39Z)
Data-Driven Abstractions via Binary-Tree Gaussian Processes for Formal Verification [0.22499166814992438]
Abstraction-based solutions based on Gaussian process (GP) regression have become popular for their ability to learn a representation of the latent system from data with a quantified error. We show that the binary-tree Gaussian process (BTGP) allows us to construct an interval Markov chain model of the unknown system. We provide a delocalized error quantification via a unified formula even when the true dynamics do not live in the function space of the BTGP.
arXiv Detail & Related papers (2024-07-15T11:49:44Z)
RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos [10.132660483466239]
This paper introduces RhythmMamba, an end-to-end method that employs multi-temporal Mamba to constrain both periodic patterns and short-term trends. Extensive experiments show that RhythmMamba achieves state-of-the-art performance with reduced parameters and lower computational complexity.
arXiv Detail & Related papers (2024-04-09T17:34:19Z)
A Poisson-Gamma Dynamic Factor Model with Time-Varying Transition Dynamics [51.147876395589925]
A non-stationary PGDS is proposed to allow the underlying transition matrices to evolve over time. A fully-conjugate and efficient Gibbs sampler is developed to perform posterior simulation. Experiments show that, in comparison with related models, the proposed non-stationary PGDS achieves improved predictive performance.
arXiv Detail & Related papers (2024-02-26T04:39:01Z)
Refined Temporal Pyramidal Compression-and-Amplification Transformer for 3D Human Pose Estimation [26.61672772233569]
Accurately estimating the 3D pose of humans in video sequences requires both accuracy and a well-structured architecture. We introduce the Refined Temporal Pyramidal Compression-and-Amplification (RTPCA) transformer. We demonstrate the effectiveness of RTPCA by achieving state-of-the-art results on Human3.6M, HumanEva-I, and MPI-INF-3DHP benchmarks.
arXiv Detail & Related papers (2023-09-04T05:25:10Z)
GaitFormer: Revisiting Intrinsic Periodicity for Gait Recognition [6.517046095186713]
Gait recognition aims to distinguish different walking patterns by analyzing video-level human silhouettes, rather than relying on appearance information. Previous research has primarily focused on extracting local or global-temporal representations, while overlooking the intrinsic periodic features of gait sequences. We propose a plug-and-play strategy, called Temporal Periodic Alignment (TPA), which leverages the periodic nature and fine-grained temporal dependencies of gait patterns.
arXiv Detail & Related papers (2023-07-25T05:05:07Z)
Sequential Attention Source Identification Based on Feature Representation [88.05527934953311]
This paper proposes a sequence-to-sequence based localization framework called Temporal-sequence based Graph Attention Source Identification (TGASI) based on an inductive learning idea. It's worth mentioning that the inductive learning idea ensures that TGASI can detect the sources in new scenarios without knowing other prior knowledge.
arXiv Detail & Related papers (2023-06-28T03:00:28Z)
rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for Remote Physiological Measurement [36.54109704201048]
Remote photoplethysmography (rPPG) is an important technique for perceiving human vital signs. In this paper, we develop a self-supervised framework for extracting inherent self-similar prior in physiological signals. We also evaluate the proposed method on two public datasets, namely PURE and UBFC-rPPG.
arXiv Detail & Related papers (2023-06-04T08:53:28Z)
Diagnostic Spatio-temporal Transformer with Faithful Encoding [54.02712048973161]
This paper addresses the task of anomaly diagnosis when the underlying data generation process has a complex spatio-temporal (ST) dependency. We formalize the problem as supervised dependency discovery, where the ST dependency is learned as a side product of time-series classification. We show that temporal positional encoding used in existing ST transformer works has a serious limitation in capturing higher frequencies (short time scales). We also propose a new ST dependency discovery framework, which can provide readily consumable diagnostic information in both spatial and temporal directions.
arXiv Detail & Related papers (2023-05-26T05:31:23Z)
Transform Once: Efficient Operator Learning in Frequency Domain [69.74509540521397]
We study deep neural networks designed to harness the structure in frequency domain for efficient learning of long-range correlations in space or time. This work introduces a blueprint for frequency domain learning through a single transform: transform once (T1).
arXiv Detail & Related papers (2022-11-26T01:56:05Z)
WPPG Net: A Non-contact Video Based Heart Rate Extraction Network Framework with Compatible Training Capability [21.33542693986985]
Our facial skin presents subtle color changes known as the remote photoplethysmography (rPPG) signal, from which we can extract the heart rate of the subject. Recently, many deep learning methods and related datasets for rPPG signal extraction have been proposed. However, because of the time blood takes to flow through the body and other factors, label waves such as BVP signals have uncertain delays relative to the real rPPG signals in some datasets. In this paper, by analyzing the common characteristics of rhythm and periodicity in rPPG signals and label waves, we propose a whole set of training methodology which wraps these networks so that they could remain efficient when trained at
arXiv Detail & Related papers (2022-07-04T19:52:30Z)
Adaptive Spike-Like Representation of EEG Signals for Sleep Stages Scoring [6.644008481573341]
We propose an adaptive scheme to encode, filter and accumulate the input signals and the weight features by the half-Gaussian probabilities of signal intensities. Experiments on the largest public dataset against state-of-the-art methods validate the effectiveness of our proposed method and reveal promising future directions.
arXiv Detail & Related papers (2022-04-02T11:21:49Z)
PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields. In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z)
Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them. NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z)
Signal Processing and Machine Learning Techniques for Terahertz Sensing: An Overview [89.09270073549182]
Terahertz (THz) signal generation and radiation methods are shaping the future of wireless systems. THz-specific signal processing techniques should complement this re-surged interest in THz sensing for efficient utilization of the THz band. We present an overview of these techniques, with an emphasis on signal pre-processing. We also address the effectiveness of deep learning techniques by exploring their promising sensing capabilities at the THz band.
arXiv Detail & Related papers (2021-04-09T01:38:34Z)
ADRN: Attention-based Deep Residual Network for Hyperspectral Image Denoising [52.01041506447195]
We propose an attention-based deep residual network to learn a mapping from noisy HSI to the clean one. Experimental results demonstrate that our proposed ADRN scheme outperforms the state-of-the-art methods both in quantitative and visual evaluations.
arXiv Detail & Related papers (2020-03-04T08:36:27Z)

This list is automatically generated from the titles and abstracts of the papers on this site.