CardiacMamba: A Multimodal RGB-RF Fusion Framework with State Space Models for Remote Physiological Measurement
- URL: http://arxiv.org/abs/2502.13624v1
- Date: Wed, 19 Feb 2025 11:00:34 GMT
- Title: CardiacMamba: A Multimodal RGB-RF Fusion Framework with State Space Models for Remote Physiological Measurement
- Authors: Zheng Wu, Yiping Xie, Bo Zhao, Jiguang He, Fei Luo, Ning Deng, Zitong Yu,
- Abstract summary: Heart rate (HR) estimation via remote photoplethys (rPl) offers a non-invasive solution for health monitoring.
Traditional single-modality approaches (RGB or Radio Frequency (RF)) face challenges in balancing robustness and accuracy due to lighting variations, motion artifacts, and skin tone bias.
We propose CardiacMamba, a multimodal RGB-RF fusion framework that leverages the complementary strengths of both modalities.
- Score: 24.511384674989223
- License:
- Abstract: Heart rate (HR) estimation via remote photoplethysmography (rPPG) offers a non-invasive solution for health monitoring. However, traditional single-modality approaches (RGB or Radio Frequency (RF)) face challenges in balancing robustness and accuracy due to lighting variations, motion artifacts, and skin tone bias. In this paper, we propose CardiacMamba, a multimodal RGB-RF fusion framework that leverages the complementary strengths of both modalities. It introduces the Temporal Difference Mamba Module (TDMM) to capture dynamic changes in RF signals using timing differences between frames, enhancing the extraction of local and global features. Additionally, CardiacMamba employs a Bidirectional SSM for cross-modal alignment and a Channel-wise Fast Fourier Transform (CFFT) to effectively capture and refine the frequency domain characteristics of RGB and RF signals, ultimately improving heart rate estimation accuracy and periodicity detection. Extensive experiments on the EquiPleth dataset demonstrate state-of-the-art performance, achieving marked improvements in accuracy and robustness. CardiacMamba significantly mitigates skin tone bias, reducing performance disparities across demographic groups, and maintains resilience under missing-modality scenarios. By addressing critical challenges in fairness, adaptability, and precision, the framework advances rPPG technology toward reliable real-world deployment in healthcare. The codes are available at: https://github.com/WuZheng42/CardiacMamba.
Related papers
- FE-UNet: Frequency Domain Enhanced U-Net with Segment Anything Capability for Versatile Image Segmentation [50.9040167152168]
We experimentally quantify the contrast sensitivity function of CNNs and compare it with that of the human visual system.
We propose the Wavelet-Guided Spectral Pooling Module (WSPM) to enhance and balance image features across the frequency domain.
To further emulate the human visual system, we introduce the Frequency Domain Enhanced Receptive Field Block (FE-RFB)
We develop FE-UNet, a model that utilizes SAM2 as its backbone and incorporates Hiera-Large as a pre-trained block.
arXiv Detail & Related papers (2025-02-06T07:24:34Z) - Fast-RF-Shimming: Accelerate RF Shimming in 7T MRI using Deep Learning [16.39978444212565]
Higher fields introduce challenges such as radiofrequency transmit (RF) field inhomogeneities, which result in uneven flip angles and image intensity artifacts.
Traditional RF shimming methods, including Magnitude Least Squares (MLS) optimization, mitigate RF field inhomogeneity but are time-intensive and often require the presence of the patient.
Recent machine learning methods, such as RF Shim Prediction by Iteratively Projected Ridge Regression, offer alternative approaches but face challenges such as extensive training requirements.
This paper introduces a holistic learning-based framework called Fast RF Shimming, which achieves a 5000-fold speed
arXiv Detail & Related papers (2025-01-21T14:09:58Z) - FgC2F-UDiff: Frequency-guided and Coarse-to-fine Unified Diffusion Model for Multi-modality Missing MRI Synthesis [6.475175425060296]
We propose a novel unified synthesis model, the Frequency-guided and Coarse-to-fine Unified Diffusion Model (FgC2F-UDiff)
arXiv Detail & Related papers (2025-01-07T04:42:45Z) - MHSA: A Multi-scale Hypergraph Network for Mild Cognitive Impairment Detection via Synchronous and Attentive Fusion [4.526574526136158]
A Multi-scale Hypergraph Network for MCI Detection via Synchronous and Attentive Fusion is presented.
Our approach employs the Phase-Locking Value (PLV) to calculate the phase synchronization relationship in the spectrum domain of regions of interest.
We structure the PLV coefficients dynamically adjust strategy, and the dynamic hypergraph is modelled based on a comprehensive temporal-spectrum fusion matrix.
arXiv Detail & Related papers (2024-12-11T02:59:57Z) - Accelerated Multi-Contrast MRI Reconstruction via Frequency and Spatial Mutual Learning [50.74383395813782]
We propose a novel Frequency and Spatial Mutual Learning Network (FSMNet) to explore global dependencies across different modalities.
The proposed FSMNet achieves state-of-the-art performance for the Multi-Contrast MR Reconstruction task with different acceleration factors.
arXiv Detail & Related papers (2024-09-21T12:02:47Z) - Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [49.902047563260496]
We develop the first attempt to integrate the Vision State Space Model (Mamba) for remote sensing image (RSI) super-resolution.
To achieve better SR reconstruction, building upon Mamba, we devise a Frequency-assisted Mamba framework, dubbed FMSR.
Our FMSR features a multi-level fusion architecture equipped with the Frequency Selection Module (FSM), Vision State Space Module (VSSM), and Hybrid Gate Module (HGM)
arXiv Detail & Related papers (2024-05-08T11:09:24Z) - TAI-GAN: A Temporally and Anatomically Informed Generative Adversarial
Network for early-to-late frame conversion in dynamic cardiac PET inter-frame
motion correction [15.380659401728735]
We propose a novel method called Temporally and Anatomically Informed Generative Adrial Network (TAI-GAN) to convert early frames into those with tracer distribution similar to the last reference frame.
Our proposed method was evaluated on a clinical 82-Rb PET dataset, and the results show that our TAI-GAN can produce converted early frames with high image quality, comparable to the real reference frames.
arXiv Detail & Related papers (2024-02-14T20:39:07Z) - Frequency Domain Modality-invariant Feature Learning for
Visible-infrared Person Re-Identification [79.9402521412239]
We propose a novel Frequency Domain modality-invariant feature learning framework (FDMNet) to reduce modality discrepancy from the frequency domain perspective.
Our framework introduces two novel modules, namely the Instance-Adaptive Amplitude Filter (IAF) and the Phrase-Preserving Normalization (PPNorm)
arXiv Detail & Related papers (2024-01-03T17:11:27Z) - Diffusion Probabilistic Model Made Slim [128.2227518929644]
We introduce a customized design for slim diffusion probabilistic models (DPM) for light-weight image synthesis.
We achieve 8-18x computational complexity reduction as compared to the latent diffusion models on a series of conditional and unconditional image generation tasks.
arXiv Detail & Related papers (2022-11-27T16:27:28Z) - Video-based Remote Physiological Measurement via Cross-verified Feature
Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features with non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and r signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z) - Efficient and Phase-aware Video Super-resolution for Cardiac MRI [23.5319835123499]
We propose a novel end-to-end trainable network to solve CMR video super-resolution problem.
We incorporate the cardiac knowledge into our model to assist in utilizing the temporal information.
arXiv Detail & Related papers (2020-05-21T13:29:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.