Related papers: SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning

URL: http://arxiv.org/abs/2502.19668v1
Date: Thu, 27 Feb 2025 01:29:51 GMT
Title: SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning
Authors: Mingsheng Cai, Jiuming Jiang, Wenhao Huang, Che Liu, Rossella Arcucci
Abstract summary: We propose $\textbf{SuPreME}$, a $\textbf{Su}$pervised $\textbf{Pre}$-training framework for representation learning. By using text-based cardiac queries instead of traditional categorical labels, SuPreME enables zero-shot classification of unseen diseases without additional fine-tuning.
Score: 8.831192046626251
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Cardiovascular diseases are a leading cause of death and disability worldwide. Electrocardiogram (ECG) recordings are critical for diagnosing and monitoring cardiac health, but obtaining large-scale annotated ECG datasets is labor-intensive and time-consuming. Recent ECG Self-Supervised Learning (eSSL) methods mitigate this by learning features without extensive labels but fail to capture fine-grained clinical semantics and require extensive task-specific fine-tuning. To address these challenges, we propose $\textbf{SuPreME}$, a $\textbf{Su}$pervised $\textbf{Pre}$-training framework for $\textbf{M}$ultimodal $\textbf{E}$CG representation learning. SuPreME applies Large Language Models (LLMs) to extract structured clinical entities from free-text ECG reports, filter out noise and irrelevant content, enhance clinical representation learning, and build a high-quality, fine-grained labeled dataset. By using text-based cardiac queries instead of traditional categorical labels, SuPreME enables zero-shot classification of unseen diseases without additional fine-tuning. We evaluate SuPreME on six downstream datasets covering 127 cardiac conditions, achieving superior zero-shot AUC performance over state-of-the-art eSSL and multimodal methods by over 1.96\%. Results demonstrate the effectiveness of SuPreME in leveraging structured, clinically relevant knowledge for high-quality ECG representations. All code and data will be released upon acceptance.

Related papers

TolerantECG: A Foundation Model for Imperfect Electrocardiogram [6.8878798499351]
TolerantECG is a foundation model for ECG signals that is robust to noise and capable of functioning with arbitrary subsets of the standard 12-lead ECG. TolerantECG training combines contrastive and self-supervised learning frameworks to jointly learn ECG signal representations. Benchmarking results demonstrate that TolerantECG consistently ranks as the best or second-best performer across various ECG signal conditions.
arXiv Detail & Related papers (2025-07-14T03:48:35Z)
From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining [22.214252217020174]
We introduce MELP, a novel Multi-scale ECG-Language Pretraining (MELP) model that fully leverages hierarchical supervision from ECG-text pairs. We evaluate MELP on three public ECG datasets across multiple tasks, including zero-shot ECG classification, linear probing, and transfer learning.
arXiv Detail & Related papers (2025-06-11T07:22:17Z)
Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling [50.58126509704037]
Heartcare Suite is a framework for fine-grained electrocardiogram (ECG) understanding. Heartcare-220K is a high-quality, structured, and comprehensive multimodal ECG dataset. Heartcare-Bench is a benchmark to guide the optimization of Medical Multimodal Large Language Models (Med-MLLMs) in ECG scenarios.
arXiv Detail & Related papers (2025-06-06T07:56:41Z)
GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images [43.65650710265957]
We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images and text for grounded and clinician-aligned ECG interpretation. GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations. We propose the Grounded ECG task, a clinically motivated benchmark designed to assess the MLLM's capability in grounded ECG understanding.
arXiv Detail & Related papers (2025-03-08T05:48:53Z)
FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data [52.55123685248105]
Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment. Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality. This paper presents the first real-world FL benchmark for cardiovascular disease detection, named FedCVD.
arXiv Detail & Related papers (2024-10-28T02:24:01Z)
Learning General Representation of 12-Lead Electrocardiogram with a Joint-Embedding Predictive Architecture [0.0]
We introduce ECG-JEPA, a self-supervised learning model for 12-lead ECG analysis. It learns semantic representations of ECG data by predicting in the hidden latent space. ECG-JEPA achieves state-of-the-art performance in various downstream tasks including ECG classification and feature prediction.
arXiv Detail & Related papers (2024-10-11T06:30:48Z)
Multimodal Variational Autoencoder for Low-cost Cardiac Hemodynamics Instability Detection [8.500041312027596]
We propose a novel variational autoencoder ($\text{CardioVAE}_\text{X,G}$) to integrate low-cost chest X-ray (CXR) and electrocardiogram (ECG) modalities with pre-training. Our model also excels in producing fine interpretations of predictions directly associated with clinical features.
arXiv Detail & Related papers (2024-03-20T15:06:49Z)
Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement [10.611952462532908]
Multimodal ECG Representation Learning (MERL) is capable of performing zero-shot ECG classification with text prompts. We propose the Clinical Knowledge Enhanced Prompt Engineering (CKEPE) approach to exploit external expert-verified clinical knowledge databases. MERL achieves an average AUC score of 75.2% in zero-shot classification (without training data), 3.2% higher than linear probed eSSL methods with 10% annotated training data, averaged across all six datasets.
arXiv Detail & Related papers (2024-03-11T12:28:55Z)
ETP: Learning Transferable ECG Representations via ECG-Text Pre-training [10.856365645831728]
ECG-Text Pre-training (ETP) is an innovative framework designed to learn cross-modal representations that link ECG signals with textual reports. ETP employs an ECG encoder along with a pre-trained language model to align ECG signals with their corresponding textual reports.
arXiv Detail & Related papers (2023-09-06T19:19:26Z)
Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites: A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area. We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions. We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z)
ECGBERT: Understanding Hidden Language of ECGs with Self-Supervised Representation Learning [6.0106590095197605]
ECGBERT is a self-supervised representation learning approach that unlocks the underlying language of ECGs. We demonstrate ECGBERT's potential to achieve state-of-the-art results on a wide variety of tasks.
arXiv Detail & Related papers (2023-06-10T04:23:08Z)
PulseNet: Deep Learning ECG-signal classification using random augmentation policy and continous wavelet transform for canines [46.09869227806991]
Evaluating canine electrocardiograms (ECG) requires skilled veterinarians. Current availability of veterinary cardiologists for ECG interpretation and diagnostic support is limited. We implement a deep convolutional neural network (CNN) approach for classifying canine electrocardiogram sequences as either normal or abnormal.
arXiv Detail & Related papers (2023-05-17T09:06:39Z)
Frozen Language Model Helps ECG Zero-Shot Learning [12.974685769614062]
We propose Multimodal ECG-Text Self-supervised pre-training (METS). We use a trainable ECG encoder and a frozen language model to embed paired ECG and automatically machine-generated clinical reports separately. In downstream classification tasks, METS achieves around 10% improvement in performance without using any annotated data.
arXiv Detail & Related papers (2023-03-22T05:01:14Z)
Self-supervised contrastive learning of echocardiogram videos enables label-efficient cardiac disease diagnosis [48.64462717254158]
We developed a self-supervised contrastive learning approach, EchoCLR, catered to echocardiogram videos. When fine-tuned on small portions of labeled data, EchoCLR pretraining significantly improved classification performance for left ventricular hypertrophy (LVH) and aortic stenosis (AS). EchoCLR is unique in its ability to learn representations of medical videos and demonstrates that SSL can enable label-efficient disease classification from small, labeled datasets.
arXiv Detail & Related papers (2022-07-23T19:17:26Z)
Generalizing electrocardiogram delineation: training convolutional neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent. This article has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based on probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces. Second, two novel segmentation-based loss functions have been developed, which attempt to enforce the prediction of an exact number of independent structures and to produce closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z)
Uncovering the structure of clinical EEG signals with self-supervised learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG). By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings. We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework. The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.