PhysMamba: State Space Duality Model for Remote Physiological Measurement
- URL: http://arxiv.org/abs/2408.01077v2
- Date: Sat, 17 Aug 2024 12:52:03 GMT
- Title: PhysMamba: State Space Duality Model for Remote Physiological Measurement
- Authors: Zhixin Yan, Yan Zhong, Hongbin Xu, Wenjun Zhang, Lin Shu, Wenxiong Kang
- Abstract summary: Remote Photoplethysmography (rPPG) is used in applications like emotion monitoring, medical assistance, and anti-face spoofing.
Unlike controlled laboratory settings, real-world environments often contain motion artifacts and noise.
We propose PhysMamba, a dual-pathway time-frequency interaction model via State Space Duality.
This method allows the network to learn richer, more representative features, enhancing robustness in noisy conditions.
- Score: 20.441281420017656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Remote Photoplethysmography (rPPG) is a non-contact technique for extracting physiological signals from facial videos, used in applications like emotion monitoring, medical assistance, and anti-face spoofing. Unlike controlled laboratory settings, real-world environments often contain motion artifacts and noise, affecting the performance of existing rPPG methods. To address this, we propose PhysMamba, a dual-Pathway time-frequency interaction model via State Space Duality. This method allows the network to learn richer, more representative features, enhancing robustness in noisy conditions. To facilitate information exchange and feature complementation between the two pathways, we design an improved algorithm: Cross-Attention State Space Duality (CASSD). We conduct comparative experiments on the PURE, UBFC-rPPG, and MMPD datasets. Experimental results show that PhysMamba achieves state-of-the-art performance, particularly in complex environments, demonstrating its potential in practical remote physiological signal measurement applications.
Related papers
- SkelMamba: A State Space Model for Efficient Skeleton Action Recognition of Neurological Disorders [14.304356695180005]
We introduce a novel state-space model (SSM)-based framework for skeleton-based human action recognition.
Our model captures local joint interactions and global motion patterns across multiple body parts.
This gait-aware decomposition enhances the ability to identify subtle motion patterns critical in medical diagnosis.
arXiv Detail & Related papers (2024-11-29T08:43:52Z)
- Multimodal Physiological Signals Representation Learning via Multiscale Contrasting for Depression Recognition [18.65975882665568]
Depression recognition based on physiological signals such as functional near-infrared spectroscopy (fNIRS) and electroencephalogram (EEG) has made considerable progress.
In this paper, we introduce a multimodal physiological signals representation learning framework via multiscale contrasting for depression recognition (MRLM).
To enhance the learning of semantic representation associated with stimulation tasks, a semantic contrast module is proposed.
arXiv Detail & Related papers (2024-06-22T09:28:02Z)
- Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE).
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
- MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints [50.61346764110482]
We integrate a musculoskeletal system with a learnable parametric hand model, MANO, to create MS-MANO.
This model emulates the dynamics of muscles and tendons to drive the skeletal system, imposing physiologically realistic constraints on the resulting torque trajectories.
We also propose a simulation-in-the-loop pose refinement framework, BioPR, that refines the initial estimated pose through a multi-layer perceptron network.
arXiv Detail & Related papers (2024-04-16T02:18:18Z)
- Real-Time Model-Based Quantitative Ultrasound and Radar [65.268245109828]
We propose a neural network based on the physical model of wave propagation, which defines the relationship between the received signals and physical properties.
Our network can reconstruct multiple physical properties in less than one second for complex and realistic scenarios.
arXiv Detail & Related papers (2024-02-16T09:09:16Z)
- AI-Aristotle: A Physics-Informed framework for Systems Biology Gray-Box Identification [1.8434042562191815]
We present a new framework for parameter estimation and missing physics identification (gray-box) in Systems Biology.
The proposed framework, named AI-Aristotle, combines eXtreme Theory of Functional Connections (X-TFC) domain-decomposition and Physics-Informed Neural Networks (PINNs).
We test the accuracy, speed, flexibility and robustness of AI-Aristotle based on two benchmark problems in Systems Biology.
arXiv Detail & Related papers (2023-09-29T14:45:51Z)
- Physics-informed State-space Neural Networks for Transport Phenomena [0.0]
This work introduces Physics-informed State-space neural network Models (PSMs).
PSMs are a novel solution to achieving real-time optimization, flexibility, and fault tolerance in autonomous systems.
PSMs could serve as a foundation for Digital Twins, constantly updated digital representations of physical systems.
arXiv Detail & Related papers (2023-09-21T16:14:36Z)
- Dual-path TokenLearner for Remote Photoplethysmography-based Physiological Measurement with Facial Videos [24.785755814666086]
This paper utilizes the concept of learnable tokens to integrate both spatial and temporal informative contexts from the global perspective of the video.
A Temporal TokenLearner (TTL) is designed to infer the quasi-periodic pattern of heartbeats, which eliminates temporal disturbances such as head movements.
arXiv Detail & Related papers (2023-08-15T13:45:45Z)
- PhysFormer++: Facial Video-based Physiological Measurement with SlowFast Temporal Difference Transformer [76.40106756572644]
Recent deep learning approaches focus on mining subtle clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose two end-to-end video transformer-based architectures, PhysFormer and PhysFormer++, to adaptively aggregate both local and global features for rPPG representation enhancement.
Comprehensive experiments are performed on four benchmark datasets to show our superior performance on both intra-dataset and cross-dataset testing.
arXiv Detail & Related papers (2023-02-07T15:56:03Z)
- DRNet: Decomposition and Reconstruction Network for Remote Physiological Measurement [39.73408626273354]
Existing methods are generally divided into two groups.
The first focuses on mining the subtle blood volume pulse (BVP) signals from face videos, but seldom explicitly models the noises that dominate face video content.
The second focuses on modeling noisy data directly, resulting in suboptimal performance due to the lack of regularity of these severe random noises.
arXiv Detail & Related papers (2022-06-12T07:40:10Z)
- PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z)
- Non-contact Pain Recognition from Video Sequences with Remote Physiological Measurements Prediction [53.03469655641418]
We present a novel multi-task learning framework which encodes both appearance changes and physiological cues in a non-contact manner for pain recognition.
We establish the state-of-the-art performance of non-contact pain recognition on publicly available pain databases.
arXiv Detail & Related papers (2021-05-18T20:47:45Z)
- Data-driven generation of plausible tissue geometries for realistic photoacoustic image synthesis [53.65837038435433]
Photoacoustic tomography (PAT) has the potential to recover morphological and functional tissue properties.
We propose a novel approach to PAT data simulation, which we refer to as "learning to simulate".
We leverage the concept of Generative Adversarial Networks (GANs) trained on semantically annotated medical imaging data to generate plausible tissue geometries.
arXiv Detail & Related papers (2021-03-29T11:30:18Z)
- Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features from non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and rPPG signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.