QualityFM: a Multimodal Physiological Signal Foundation Model with Self-Distillation for Signal Quality Challenges in Critically Ill Patients
- URL: http://arxiv.org/abs/2509.06516v2
- Date: Sun, 14 Sep 2025 12:44:02 GMT
- Title: QualityFM: a Multimodal Physiological Signal Foundation Model with Self-Distillation for Signal Quality Challenges in Critically Ill Patients
- Authors: Zongheng Guo, Tao Chen, Manuela Ferrario,
- Abstract summary: Photoplethysmogram ( PPG) and electrocardiogram (ECG) are commonly recorded intesive care unit (ICU) and operating room (OR)<n>The methods explored so far suffer from limited generalizability, reliance on extensive labeled data, and poor cross-task transferability.<n>We introduce QualityFM, a novel multimodal foundation model for these physiological signals, designed to acquire a general-purpose understanding of signal quality.
- Score: 3.744688073010934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Photoplethysmogram (PPG) and electrocardiogram (ECG) are commonly recorded in intesive care unit (ICU) and operating room (OR). However, the high incidence of poor, incomplete, and inconsistent signal quality, can lead to false alarms or diagnostic inaccuracies. The methods explored so far suffer from limited generalizability, reliance on extensive labeled data, and poor cross-task transferability. To overcome these challenges, we introduce QualityFM, a novel multimodal foundation model for these physiological signals, designed to acquire a general-purpose understanding of signal quality. Our model is pre-trained on an large-scale dataset comprising over 21 million 30-second waveforms and 179,757 hours of data. Our approach involves a dual-track architecture that processes paired physiological signals of differing quality, leveraging a self-distillation strategy where an encoder for high-quality signals is used to guide the training of an encoder for low-quality signals. To efficiently handle long sequential signals and capture essential local quasi-periodic patterns, we integrate a windowed sparse attention mechanism within our Transformer-based model. Furthermore, a composite loss function, which combines direct distillation loss on encoder outputs with indirect reconstruction loss based on power and phase spectra, ensures the preservation of frequency-domain characteristics of the signals. We pre-train three models with varying parameter counts (9.6 M to 319 M) and demonstrate their efficacy and practical value through transfer learning on three distinct clinical tasks: false alarm of ventricular tachycardia detection, the identification of atrial fibrillation and the estimation of arterial blood pressure (ABP) from PPG and ECG signals.
Related papers
- FDP: A Frequency-Decomposition Preprocessing Pipeline for Unsupervised Anomaly Detection in Brain MRI [44.4791295950757]
We develop an unsupervised anomaly detection (UAD) approach for brain MRI.<n>We conduct the first systematic frequency-domain analysis of pathological signatures.<n>We show that Frequency-Decomposition Preprocessing (FDP) framework can leverage frequency-domain reconstruction for simultaneous pathology suppression and anatomical preservation.
arXiv Detail & Related papers (2025-11-17T02:40:14Z) - TF-TransUNet1D: Time-Frequency Guided Transformer U-Net for Robust ECG Denoising in Digital Twin [16.693268731997996]
We propose TF-TransUNet1D, a novel one-dimensional deep neural network that integrates a U-Net-based encoder-decoder architecture with a Transformer encoder.<n>The model is designed to simultaneously capture local morphological features and long-range temporal dependencies.<n>By delivering high-precision denoising, this work bridges a critical gap in pre-processing pipelines for cardiac digital twins.
arXiv Detail & Related papers (2025-08-28T03:51:19Z) - A Novel Data Augmentation Strategy for Robust Deep Learning Classification of Biomedical Time-Series Data: Application to ECG and EEG Analysis [2.355460994057843]
This study proposes a novel and unified deep learning framework that achieves state-of-the-art performance across different signal types.<n>Unlike prior work, we scientifically increase signal complexity to achieve future-reaching capabilities, which resulted in the best predictions.<n>The architecture requires 130 MB of memory and processes each sample in 10 ms, suggesting suitability for deployment on low-end or wearable devices.
arXiv Detail & Related papers (2025-07-16T21:38:10Z) - PhysioWave: A Multi-Scale Wavelet-Transformer for Physiological Signal Representation [18.978031999678507]
A novel wavelet-based approach for physiological signal analysis is presented, aiming to capture multi-scale time-frequency features in various physiological signals.<n>Two large-scale pretrained models specific to EMG and ECG are introduced for the first time, achieving superior performance and setting new baselines in downstream tasks.<n>A unified multi-modal framework is constructed by integrating pretrained EEG model, where each modality is guided through its dedicated branch and fused via learnable weighted fusion.
arXiv Detail & Related papers (2025-06-12T05:11:41Z) - Auto-FEDUS: Autoregressive Generative Modeling of Doppler Ultrasound Signals from Fetal Electrocardiograms [3.1295375129864644]
We introduce a novel autoregressive generative model designed to map fetal electrocardiogram (FECG) signals to corresponding DUS waveforms (Auto-FEDUS)<n>By leveraging a neural temporal network based on dilated causal convolutions, the model effectively captures both short and long-range dependencies within the signals, preserving the integrity of generated data.<n>Cross-subject experiments demonstrate that Auto-FEDUS outperforms conventional generative architectures across both time and frequency domain evaluations.
arXiv Detail & Related papers (2025-04-17T15:25:52Z) - CEReBrO: Compact Encoder for Representations of Brain Oscillations Using Efficient Alternating Attention [53.539020807256904]
We introduce a Compact for Representations of Brain Oscillations using alternating attention (CEReBrO)<n>Our tokenization scheme represents EEG signals at a per-channel patch.<n>We propose an alternating attention mechanism that jointly models intra-channel temporal dynamics and inter-channel spatial correlations, achieving 2x speed improvement with 6x less memory required compared to standard self-attention.
arXiv Detail & Related papers (2025-01-18T21:44:38Z) - Synthetic Time Series Data Generation for Healthcare Applications: A PCG Case Study [43.28613210217385]
We employ and compare three state-of-the-art generative models to generate PCG data.<n>Our results demonstrate that the generated PCG data closely resembles the original datasets.<n>In our future work, we plan to incorporate this method into a data augmentation pipeline to synthesize abnormal PCG signals with heart murmurs.
arXiv Detail & Related papers (2024-12-17T18:07:40Z) - TDANet: A Novel Temporal Denoise Convolutional Neural Network With Attention for Fault Diagnosis [0.5277756703318045]
This paper proposes the Temporal Denoise Convolutional Neural Network With Attention (TDANet) to improve fault diagnosis performance in noise environments.
The TDANet model transforms one-dimensional signals into two-dimensional tensors based on their periodic properties, employing multi-scale 2D convolution kernels to extract signal information both within and across periods.
Evaluation on two datasets, CWRU (single sensor) and Real aircraft sensor fault (multiple sensors), demonstrates that the TDANet model significantly outperforms existing deep learning approaches in terms of diagnostic accuracy under noisy environments.
arXiv Detail & Related papers (2024-03-29T02:54:41Z) - Multiple Time Series Fusion Based on LSTM An Application to CAP A Phase
Classification Using EEG [56.155331323304]
Deep learning based electroencephalogram channels' feature level fusion is carried out in this work.
Channel selection, fusion, and classification procedures were optimized by two optimization algorithms.
arXiv Detail & Related papers (2021-12-18T14:17:49Z) - Generalizing electrocardiogram delineation: training convolutional
neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent.
This article delves has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based in probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces.
Second, two novel segmentation-based loss functions have been developed, which attempt at enforcing the prediction of an exact number of independent structures and at producing closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z) - Video-based Remote Physiological Measurement via Cross-verified Feature
Disentangling [121.50704279659253]
We propose a cross-verified feature disentangling strategy to disentangle the physiological features with non-physiological representations.
We then use the distilled physiological features for robust multi-task physiological measurements.
The disentangled features are finally used for the joint prediction of multiple physiological signals like average HR values and r signals.
arXiv Detail & Related papers (2020-07-16T09:39:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.