Resolve Domain Conflicts for Generalizable Remote Physiological Measurement
- URL: http://arxiv.org/abs/2404.07855v1
- Date: Thu, 11 Apr 2024 15:51:52 GMT
- Title: Resolve Domain Conflicts for Generalizable Remote Physiological Measurement
- Authors: Weiyu Sun, Xinyu Zhang, Hao Lu, Ying Chen, Yun Ge, Xiaolin Huang, Jie Yuan, Yingcong Chen
- Abstract summary: We introduce the DOmain-HArmonious framework (DOHA) for remote photoplethysmography (rPPG).
We propose a harmonious phase strategy to eliminate uncertain phase delays and preserve the temporal variation of physiological signals.
Next, we design a harmonious hyperplane optimization that reduces irrelevant attribute shifts.
Our experiments demonstrate that DOHA significantly improves the performance of existing methods under multiple protocols.
- Score: 39.0083078989343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Remote photoplethysmography (rPPG) technology has become increasingly popular due to its non-invasive monitoring of various physiological indicators, making it widely applicable in multimedia interaction, healthcare, and emotion analysis. Existing rPPG methods utilize multiple datasets for training to enhance the generalizability of models. However, they often overlook the underlying conflict issues across different datasets, such as (1) label conflict resulting from different phase delays between physiological signal labels and face videos at the instance level, and (2) attribute conflict stemming from distribution shifts caused by head movements, illumination changes, skin types, etc. To address this, we introduce the DOmain-HArmonious framework (DOHA). Specifically, we first propose a harmonious phase strategy to eliminate uncertain phase delays and preserve the temporal variation of physiological signals. Next, we design a harmonious hyperplane optimization that reduces irrelevant attribute shifts and encourages the model's optimization towards a global solution that fits more valid scenarios. Our experiments demonstrate that DOHA significantly improves the performance of existing methods under multiple protocols. Our code is available at https://github.com/SWY666/rPPG-DOHA.
Related papers
- InceptoFormer: A Multi-Signal Neural Framework for Parkinson's Disease Severity Evaluation from Gait [6.155129200870887]
InceptoFormer is a multi-signal neural framework designed for Parkinson's Disease (PD) severity evaluation via gait dynamics analysis.
Our architecture introduces a 1D adaptation of the Inception model, which we refer to as Inception1D, along with a Transformer-based framework to stage PD severity according to the Hoehn and Yahr (H&Y) scale.
InceptoFormer achieves an accuracy of 96.6%, outperforming existing state-of-the-art methods in PD severity assessment.
arXiv Detail & Related papers (2025-08-06T15:27:11Z) - UniSegDiff: Boosting Unified Lesion Segmentation via a Staged Diffusion Model [53.34835793648352]
We propose UniSegDiff, a novel diffusion model framework for lesion segmentation.
UniSegDiff addresses lesion segmentation in a unified manner across multiple modalities and organs.
Comprehensive experimental results demonstrate that UniSegDiff significantly outperforms previous state-of-the-art (SOTA) approaches.
arXiv Detail & Related papers (2025-07-24T12:33:10Z) - Learning from Heterogeneity: Generalizing Dynamic Facial Expression Recognition via Distributionally Robust Optimization [23.328511708942045]
The Heterogeneity-aware Distributional Framework (HDF) is designed to enhance time-frequency modeling and mitigate imbalance caused by hard samples.
The Time-Frequency Distributional Attention Module (DAM) captures both temporal consistency and frequency robustness.
An adaptive optimization module, the Distribution-aware Scaling Module (DSM), is introduced to dynamically balance classification and contrastive losses.
arXiv Detail & Related papers (2025-07-21T16:21:47Z) - Multimodal LLM-Guided Semantic Correction in Text-to-Image Diffusion [52.315729095824906]
MLLM Semantic-Corrected Ping-Pong-Ahead Diffusion (PPAD) is a novel framework that introduces a Multimodal Large Language Model (MLLM) as a semantic observer during inference.
It performs real-time analysis on intermediate generations, identifies latent semantic inconsistencies, and translates feedback into controllable signals that actively guide the remaining denoising steps.
Extensive experiments demonstrate PPAD's significant improvements.
arXiv Detail & Related papers (2025-05-26T14:42:35Z) - PhysLLM: Harnessing Large Language Models for Cross-Modal Remote Physiological Sensing [49.243031514520794]
Large Language Models (LLMs) excel at capturing long-range signals due to their text-centric design.
PhysLLM achieves state-of-the-art accuracy and robustness, demonstrating superior generalization across lighting variations and motion scenarios.
arXiv Detail & Related papers (2025-05-06T15:18:38Z) - Test-Time Domain Generalization via Universe Learning: A Multi-Graph Matching Approach for Medical Image Segmentation [17.49123106322442]
Test-time adaptation (TTA) adjusts a learned model using unlabeled test data.
We incorporate morphological information and propose a framework based on multi-graph matching.
Our method outperforms other state-of-the-art approaches on two medical image segmentation benchmarks.
arXiv Detail & Related papers (2025-03-17T10:11:11Z) - Raising the Bar in Graph OOD Generalization: Invariant Learning Beyond Explicit Environment Modeling [58.15601237755505]
Real-world graph data often exhibit diverse and shifting environments that traditional models fail to generalize across.
We propose a novel method termed Multi-Prototype Hyperspherical Invariant Learning (MPHIL).
MPHIL achieves state-of-the-art performance, significantly outperforming existing methods across graph data from various domains and with different distribution shifts.
arXiv Detail & Related papers (2025-02-15T07:40:14Z) - Spatial Adaptation Layer: Interpretable Domain Adaptation For Biosignal Sensor Array Applications [0.7499722271664147]
Biosignal acquisition is key for healthcare applications and wearable devices.
Existing solutions often require large and expensive datasets and/or lack robustness and interpretability.
We propose the Spatial Adaptation Layer (SAL), which can be prepended to any biosignal array model.
We also introduce learnable baseline normalization (LBN) to reduce baseline fluctuations.
arXiv Detail & Related papers (2024-09-12T14:06:12Z) - Multi-Source and Test-Time Domain Adaptation on Multivariate Signals using Spatio-Temporal Monge Alignment [59.75420353684495]
Machine learning applications on signals such as computer vision or biomedical data often face challenges due to the variability that exists across hardware devices or session recordings.
In this work, we propose Spatio-Temporal Monge Alignment (STMA) to mitigate these variabilities.
We show that STMA leads to significant and consistent performance gains between datasets acquired with very different settings.
arXiv Detail & Related papers (2024-07-19T13:33:38Z) - Decomposing the Neurons: Activation Sparsity via Mixture of Experts for Continual Test Time Adaptation [37.79819260918366]
Continual Test-Time Adaptation (CTTA) aims to adapt the pre-trained model to ever-evolving target domains.
We explore the integration of a Mixture-of-Activation-Sparsity-Experts (MoASE) as an adapter for the CTTA task.
arXiv Detail & Related papers (2024-05-26T08:51:39Z) - Multi-channel Time Series Decomposition Network For Generalizable Sensor-Based Activity Recognition [2.024925013349319]
This paper proposes a new method, the Multi-channel Time Series Decomposition Network (MTSDNet).
It decomposes the original signal into a combination of multiple components and trigonometric functions via a trainable parameterized temporal decomposition.
The method shows advantages in prediction accuracy and stability over competing strategies.
arXiv Detail & Related papers (2024-03-28T12:54:06Z) - Generating Progressive Images from Pathological Transitions via Diffusion Model [12.006910992162661]
We propose an adaptive depth-controlled diffusion network to generate pathological progressive images for effective data augmentation.
Experiments suggest significant improvements in generation diversity, and the effectiveness of the generated progressive samples is highlighted in downstream classification.
arXiv Detail & Related papers (2023-11-21T03:25:51Z) - Domain Adaptive Synapse Detection with Weak Point Annotations [63.97144211520869]
We present AdaSyn, a framework for domain adaptive synapse detection with weak point annotations.
In the WASPSYN challenge at ISBI 2023, our method ranked 1st.
arXiv Detail & Related papers (2023-08-31T05:05:53Z) - Morphological feature visualization of Alzheimer's disease via Multidirectional Perception GAN [40.50404819220093]
A novel Multidirectional Perception Generative Adversarial Network (MP-GAN) is proposed to visualize the morphological features indicating the severity of Alzheimer's disease (AD).
MP-GAN achieves superior performance compared with the existing methods.
arXiv Detail & Related papers (2021-11-25T03:24:52Z) - PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited spatio-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z) - Heterogeneous Face Frontalization via Domain Agnostic Learning [74.86585699909459]
We propose a domain agnostic learning-based generative adversarial network (DAL-GAN) which can synthesize frontal views in the visible domain from thermal faces with pose variations.
DAL-GAN consists of a generator with an auxiliary classifier and two discriminators which capture both local and global texture discriminations for better synthesis.
arXiv Detail & Related papers (2021-07-17T20:41:41Z) - Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features with non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and rPPG signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z)