PhysMamba: State Space Duality Model for Remote Physiological Measurement
- URL: http://arxiv.org/abs/2408.01077v2
- Date: Sat, 17 Aug 2024 12:52:03 GMT
- Title: PhysMamba: State Space Duality Model for Remote Physiological Measurement
- Authors: Zhixin Yan, Yan Zhong, Wenjun Zhang, Lin Shu, Hongbin Xu, Wenxiong Kang
- Abstract summary: Remote Photoplethysmography (rPPG) is used in applications like emotion monitoring, medical assistance, and anti-face spoofing.
Unlike controlled laboratory settings, real-world environments often contain motion artifacts and noise.
We propose PhysMamba, a dual-pathway time-frequency interaction model via State Space Duality.
This method allows the network to learn richer, more representative features, enhancing robustness in noisy conditions.
- Score: 20.441281420017656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Remote Photoplethysmography (rPPG) is a non-contact technique for extracting physiological signals from facial videos, used in applications like emotion monitoring, medical assistance, and anti-face spoofing. Unlike controlled laboratory settings, real-world environments often contain motion artifacts and noise, affecting the performance of existing rPPG methods. To address this, we propose PhysMamba, a dual-Pathway time-frequency interaction model via State Space Duality. This method allows the network to learn richer, more representative features, enhancing robustness in noisy conditions. To facilitate information exchange and feature complementation between the two pathways, we design an improved algorithm: Cross-Attention State Space Duality (CASSD). We conduct comparative experiments on the PURE, UBFC-rPPG, and MMPD datasets. Experimental results show that PhysMamba achieves state-of-the-art performance, particularly in complex environments, demonstrating its potential in practical remote physiological signal measurement applications.
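As background for the term "State Space Duality": in its general usage, it refers to the fact that a selective state space recurrence with a scalar per-step decay can be computed either as a linear-time scan or as an equivalent attention-like, quadratic-time weighted sum. The sketch below checks that equivalence numerically on toy data. It is a minimal illustration under stated assumptions, not the PhysMamba architecture or its CASSD module; the function names, shapes, and random inputs are hypothetical and chosen only for this example.

```python
import random


def ssm_recurrent(a, B, C, x):
    """Linear-time form: h_t = a_t * h_{t-1} + B_t * x_t, y_t = <C_t, h_t>."""
    n = len(B[0])
    h = [0.0] * n  # hidden state, starts at zero
    ys = []
    for a_t, B_t, C_t, x_t in zip(a, B, C, x):
        h = [a_t * h_i + b_i * x_t for h_i, b_i in zip(h, B_t)]
        ys.append(sum(c_i * h_i for c_i, h_i in zip(C_t, h)))
    return ys


def ssm_quadratic(a, B, C, x):
    """Attention-like form:
    y_t = sum over s <= t of (a_{s+1} * ... * a_t) * <C_t, B_s> * x_s.
    """
    ys = []
    for t in range(len(x)):
        y_t = 0.0
        for s in range(t + 1):
            decay = 1.0  # product of decays between step s and step t
            for k in range(s + 1, t + 1):
                decay *= a[k]
            score = sum(c * b for c, b in zip(C[t], B[s]))
            y_t += decay * score * x[s]
        ys.append(y_t)
    return ys


def make_toy_inputs(T=6, n=3, seed=0):
    """Hypothetical toy inputs: T time steps, state size n."""
    rng = random.Random(seed)
    a = [rng.uniform(0.5, 1.0) for _ in range(T)]
    B = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(T)]
    C = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(T)]
    x = [rng.uniform(-1.0, 1.0) for _ in range(T)]
    return a, B, C, x


if __name__ == "__main__":
    a, B, C, x = make_toy_inputs()
    y_rec = ssm_recurrent(a, B, C, x)
    y_quad = ssm_quadratic(a, B, C, x)
    # The two forms should agree up to floating-point rounding.
    print(max(abs(p - q) for p, q in zip(y_rec, y_quad)))
```

The recurrent form costs time linear in the sequence length, while the quadratic form exposes the pairwise (query-key-like) structure between time steps; which form a given model uses in practice is an implementation choice.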
Related papers
- Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition [18.65975882665568]
Depression recognition based on physiological signals such as functional near-infrared spectroscopy (fNIRS) and electroencephalogram (EEG) has made considerable progress.
In this paper, we introduce a multimodal physiological signals representation learning framework based on multiscale contrasting for depression recognition (MRLM).
To enhance the learning of semantic representation associated with stimulation tasks, a semantic contrast module is proposed.
arXiv Detail & Related papers (2024-06-22T09:28:02Z)
- Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE).
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
- MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints [50.61346764110482]
We integrate a musculoskeletal system with a learnable parametric hand model, MANO, to create MS-MANO.
This model emulates the dynamics of muscles and tendons to drive the skeletal system, imposing physiologically realistic constraints on the resulting torque trajectories.
We also propose a simulation-in-the-loop pose refinement framework, BioPR, that refines the initial estimated pose through a multi-layer perceptron network.
arXiv Detail & Related papers (2024-04-16T02:18:18Z)
- Real-Time Model-Based Quantitative Ultrasound and Radar [65.268245109828]
We propose a neural network based on the physical model of wave propagation, which defines the relationship between the received signals and physical properties.
Our network can reconstruct multiple physical properties in less than one second for complex and realistic scenarios.
arXiv Detail & Related papers (2024-02-16T09:09:16Z)
- Dual-path TokenLearner for Remote Photoplethysmography-based Physiological Measurement with Facial Videos [24.785755814666086]
This paper utilizes the concept of learnable tokens to integrate both spatial and temporal informative contexts from the global perspective of the video.
A Temporal TokenLearner (TTL) is designed to infer the quasi-periodic pattern of heartbeats, which eliminates temporal disturbances such as head movements.
arXiv Detail & Related papers (2023-08-15T13:45:45Z)
- PhysFormer++: Facial Video-based Physiological Measurement with SlowFast Temporal Difference Transformer [76.40106756572644]
Recent deep learning approaches focus on mining subtle clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose two end-to-end video transformer-based architectures, PhysFormer and PhysFormer++, to adaptively aggregate both local and global features for rPPG representation enhancement.
Comprehensive experiments are performed on four benchmark datasets to show our superior performance on both intra-dataset and cross-dataset testing.
arXiv Detail & Related papers (2023-02-07T15:56:03Z)
- DRNet: Decomposition and Reconstruction Network for Remote Physiological Measurement [39.73408626273354]
Existing methods are generally divided into two groups.
The first focuses on mining the subtle blood volume pulse (BVP) signals from face videos, but seldom explicitly models the noises that dominate face video content.
The second focuses on modeling noisy data directly, resulting in suboptimal performance due to the lack of regularity of these severe random noises.
arXiv Detail & Related papers (2022-06-12T07:40:10Z)
- PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z)
- Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features from non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and rPPG signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.