PhysMLE: Generalizable and Priors-Inclusive Multi-task Remote Physiological Measurement
- URL: http://arxiv.org/abs/2405.06201v2
- Date: Tue, 29 Oct 2024 04:15:41 GMT
- Title: PhysMLE: Generalizable and Priors-Inclusive Multi-task Remote Physiological Measurement
- Authors: Jiyao Wang, Hao Lu, Ange Wang, Xiao Yang, Yingcong Chen, Dengbo He, Kaishun Wu
- Abstract summary: This paper presents an end-to-end Mixture of Low-rank Experts for multi-task remote Physiological measurement (PhysMLE).
PhysMLE is based on multiple low-rank experts with a novel router mechanism, enabling the model to adeptly handle both specifications and correlations within tasks.
For fair and comprehensive evaluations, this paper proposed a large-scale multi-task generalization benchmark, named Multi-Source Synsemantic Domain Generalization protocol.
- Score: 24.424510759648072
- Abstract: Remote photoplethysmography (rPPG) has been widely applied to measure heart rate from face videos. To increase the generalizability of the algorithms, domain generalization (DG) attracted increasing attention in rPPG. However, when rPPG is extended to simultaneously measure more vital signs (e.g., respiration and blood oxygen saturation), achieving generalizability brings new challenges. Although partial features shared among different physiological signals can benefit multi-task learning, the sparse and imbalanced target label space brings the seesaw effect over task-specific feature learning. To resolve this problem, we designed an end-to-end Mixture of Low-rank Experts for multi-task remote Physiological measurement (PhysMLE), which is based on multiple low-rank experts with a novel router mechanism, thereby enabling the model to adeptly handle both specifications and correlations within tasks. Additionally, we introduced prior knowledge from physiology among tasks to overcome the imbalance of label space under real-world multi-task physiological measurement. For fair and comprehensive evaluations, this paper proposed a large-scale multi-task generalization benchmark, named Multi-Source Synsemantic Domain Generalization (MSSDG) protocol. Extensive experiments with MSSDG and intra-dataset have shown the effectiveness and efficiency of PhysMLE. In addition, a new dataset was collected and made publicly available to meet the needs of the MSSDG.
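The abstract describes the core idea only at a high level: several low-rank experts combined by a router. As a rough illustration of that general pattern, and not the authors' implementation, the sketch below adds a router-weighted sum of rank-r updates to a shared dense layer. The class name `LowRankExpertLayer`, the dense softmax router, the random initialization, and all dimensions are assumptions made for this example; the paper's own router and its physiological priors are not reproduced here.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small Gaussian-initialized matrix (illustrative initialization only)."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class LowRankExpertLayer:
    """Hypothetical layer: shared weight plus routed rank-r updates B_k @ A_k."""

    def __init__(self, d_in, d_out, n_experts, rank, seed=0):
        rng = random.Random(seed)
        self.shared = rand_matrix(d_out, d_in, rng)
        # Each expert is a pair (A_k: rank x d_in, B_k: d_out x rank).
        self.down = [rand_matrix(rank, d_in, rng) for _ in range(n_experts)]
        self.up = [rand_matrix(d_out, rank, rng) for _ in range(n_experts)]
        # Assumed router: one linear score per expert, followed by softmax.
        self.router = rand_matrix(n_experts, d_in, rng)

    def gates(self, x):
        """Mixing weights over experts for input x (non-negative, sum to 1)."""
        return softmax(matvec(self.router, x))

    def forward(self, x):
        """Shared projection plus the gate-weighted sum of low-rank expert outputs."""
        out = matvec(self.shared, x)
        for gate, a_k, b_k in zip(self.gates(x), self.down, self.up):
            delta = matvec(b_k, matvec(a_k, x))
            out = [o + gate * d for o, d in zip(out, delta)]
        return out


layer = LowRankExpertLayer(d_in=8, d_out=4, n_experts=3, rank=2)
x = [0.5] * 8
y = layer.forward(x)
g = layer.gates(x)
```

With these toy sizes, each expert holds rank × (d_in + d_out) = 24 parameters instead of the 32 of a full 8 × 4 matrix; the saving grows quickly with layer width, which is the usual motivation for low-rank experts.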
Related papers
- YOLO-MED : Multi-Task Interaction Network for Biomedical Images [18.535117490442953]
YOLO-Med is an efficient end-to-end multi-task network capable of concurrently performing object detection and semantic segmentation.
Our model exhibits promising results in balancing accuracy and speed when evaluated on the Kvasir-seg dataset and a private biomedical image dataset.
arXiv Detail & Related papers (2024-03-01T03:20:42Z) - Active Neural Topological Mapping for Multi-Agent Exploration [24.91397816926568]
The multi-agent cooperative exploration problem requires multiple agents to explore an unseen environment via sensory signals in a limited time.
Topological maps are a promising alternative as they consist only of nodes and edges with abstract but essential information.
Deep reinforcement learning has shown great potential for learning (near) optimal policies through fast end-to-end inference.
We propose Multi-Agent Neural Topological Mapping (MANTM) to improve exploration efficiency and generalization for multi-agent exploration tasks.
arXiv Detail & Related papers (2023-11-01T03:06:14Z) - A Deep Learning Sequential Decoder for Transient High-Density
Electromyography in Hand Gesture Recognition Using Subject-Embedded Transfer
Learning [11.170031300110315]
Hand gesture recognition (HGR) has gained significant attention due to the increasing use of AI-powered human-computer interfaces.
These interfaces have a range of applications, including the control of extended reality, agile prosthetics, and exoskeletons.
arXiv Detail & Related papers (2023-09-23T05:32:33Z) - PhysFormer++: Facial Video-based Physiological Measurement with SlowFast
Temporal Difference Transformer [76.40106756572644]
Recent deep learning approaches focus on mining subtle clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose two end-to-end video transformer based architectures, PhysFormer and PhysFormer++, to adaptively aggregate both local and global features for rPPG representation enhancement.
Comprehensive experiments are performed on four benchmark datasets to show our superior performance on both intra-dataset and cross-dataset testing.
arXiv Detail & Related papers (2023-02-07T15:56:03Z) - Benchmarking Joint Face Spoofing and Forgery Detection with Visual and
Physiological Cues [81.15465149555864]
We establish the first joint face spoofing and forgery detection benchmark using both visual appearance and physiological rPPG cues.
To enhance the rPPG periodicity discrimination, we design a two-branch physiological network using both the facial spatio-temporal rPPG signal map and its continuous wavelet transformed counterpart as inputs.
arXiv Detail & Related papers (2022-08-10T15:41:48Z) - Superficial White Matter Analysis: An Efficient Point-cloud-based Deep
Learning Framework with Supervised Contrastive Learning for Consistent
Tractography Parcellation across Populations and dMRI Acquisitions [68.41088365582831]
White matter parcellation classifies tractography streamlines into clusters or anatomically meaningful tracts.
Most parcellation methods focus on the deep white matter (DWM), whereas fewer methods address the superficial white matter (SWM) due to its complexity.
We propose a novel two-stage deep-learning-based framework, Superficial White Matter Analysis (SupWMA), that performs an efficient parcellation of 198 SWM clusters from whole-brain tractography.
arXiv Detail & Related papers (2022-07-18T23:07:53Z) - GMSS: Graph-Based Multi-Task Self-Supervised Learning for EEG Emotion
Recognition [48.02958969607864]
This paper proposes a graph-based multi-task self-supervised learning model (GMSS) for EEG emotion recognition.
By learning from multiple tasks simultaneously, GMSS can find a representation that captures all of the tasks.
Experiments on SEED, SEED-IV, and MPED datasets show that the proposed model has remarkable advantages in learning more discriminative and general features for EEG emotional signals.
arXiv Detail & Related papers (2022-04-12T03:37:21Z) - PhysFormer: Facial Video-based Physiological Measurement with Temporal
Difference Transformer [55.936527926778695]
Recent deep learning approaches focus on mining subtle rPPG clues using convolutional neural networks with limited-temporal receptive fields.
In this paper, we propose the PhysFormer, an end-to-end video transformer based architecture.
arXiv Detail & Related papers (2021-11-23T18:57:11Z) - Video-based Remote Physiological Measurement via Cross-verified Feature
Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features from non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and rPPG signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.