Related papers: Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement

URL: http://arxiv.org/abs/2403.06659v3
Date: Tue, 2 Jul 2024 16:51:11 GMT
Title: Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement
Authors: Che Liu, Zhongwei Wan, Cheng Ouyang, Anand Shah, Wenjia Bai, Rossella Arcucci
Abstract summary: Multimodal ECG Representation Learning (MERL) is capable of performing zero-shot ECG classification with text prompts. We propose the Clinical Knowledge Enhanced Prompt Engineering (CKEPE) approach to exploit external expert-verified clinical knowledge databases. MERL achieves an average AUC score of 75.2% in zero-shot classification (without training data), 3.2% higher than linear probed eSSL methods with 10% annotated training data, averaged across all six datasets.
Score: 10.611952462532908
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Electrocardiograms (ECGs) are non-invasive diagnostic tools crucial for detecting cardiac arrhythmic diseases in clinical practice. While ECG Self-supervised Learning (eSSL) methods show promise in representation learning from unannotated ECG data, they often overlook the clinical knowledge that can be found in reports. This oversight and the requirement for annotated samples for downstream tasks limit eSSL's versatility. In this work, we address these issues with the Multimodal ECG Representation Learning (MERL) framework. Through multimodal learning on ECG records and associated reports, MERL is capable of performing zero-shot ECG classification with text prompts, eliminating the need for training data in downstream tasks. At test time, we propose the Clinical Knowledge Enhanced Prompt Engineering (CKEPE) approach, which uses Large Language Models (LLMs) to exploit external expert-verified clinical knowledge databases, generating more descriptive prompts and reducing hallucinations in LLM-generated content to boost zero-shot classification. Based on MERL, we perform the first benchmark across six public ECG datasets, showing the superior performance of MERL compared against eSSL methods. Notably, MERL achieves an average AUC score of 75.2% in zero-shot classification (without training data), 3.2% higher than linear probed eSSL methods with 10% annotated training data, averaged across all six datasets. Code and models are available at https://github.com/cheliu-computation/MERL

Related papers

From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining [22.214252217020174]
We introduce MELP, a novel Multi-scale ECG-Language Pretraining (MELP) model that fully leverages hierarchical supervision from ECG-text pairs. We evaluate MELP on three public ECG datasets across multiple tasks, including zero-shot ECG classification, linear probing, and transfer learning.
arXiv Detail & Related papers (2025-06-11T07:22:17Z)
SuPreME: A Supervised Pre-training Framework for Multimodal ECG Representation Learning [8.831192046626251]
We propose SuPreME, a Supervised Pre-training framework for representation learning. By using text-based cardiac queries instead of traditional categorical labels, SuPreME enables zero-shot classification of unseen diseases without additional fine-tuning.
arXiv Detail & Related papers (2025-02-27T01:29:51Z)
Electrocardiogram-Language Model for Few-Shot Question Answering with Meta Learning [19.513904491604794]
Electrocardiogram (ECG) interpretation requires specialized expertise. This work introduces a novel multimodal meta-learning method for few-shot ECG question answering.
arXiv Detail & Related papers (2024-10-18T13:48:01Z)
MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation [41.324530807795256]
Electrocardiogram (ECG) is the primary non-invasive diagnostic tool for monitoring cardiac conditions. Recent studies have concentrated on classifying cardiac conditions using ECG data but have overlooked ECG report generation. We propose the Multimodal ECG Instruction Tuning (MEIT) framework, the first attempt to tackle ECG report generation with LLMs and multimodal instructions.
arXiv Detail & Related papers (2024-03-07T23:20:56Z)
Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram [2.2842904085777045]
We introduce ST-MEM (Spatio-Temporal Masked Electrocardiogram Modeling), designed to learn spatio-temporal features by reconstructing 12-lead ECG data. ST-MEM outperforms other SSL baseline methods in various experimental settings for arrhythmia.
arXiv Detail & Related papers (2024-02-02T10:04:13Z)
Improving Multiple Sclerosis Lesion Segmentation Across Clinical Sites: A Federated Learning Approach with Noise-Resilient Training [75.40980802817349]
Deep learning models have shown promise for automatically segmenting MS lesions, but the scarcity of accurately annotated data hinders progress in this area. We introduce a Decoupled Hard Label Correction (DHLC) strategy that considers the imbalanced distribution and fuzzy boundaries of MS lesions. We also introduce a Centrally Enhanced Label Correction (CELC) strategy, which leverages the aggregated central model as a correction teacher for all sites.
arXiv Detail & Related papers (2023-08-31T00:36:10Z)
Vision-Language Modelling For Radiological Imaging and Reports In The Low Data Regime [70.04389979779195]
This paper explores training medical vision-language models (VLMs) where the visual and language inputs are embedded into a common space. We explore several candidate methods to improve low-data performance, including adapting generic pre-trained models to novel image and text domains. Using text-to-image retrieval as a benchmark, we evaluate the performance of these methods with variable sized training datasets of paired chest X-rays and radiological reports.
arXiv Detail & Related papers (2023-03-30T18:20:00Z)
Frozen Language Model Helps ECG Zero-Shot Learning [12.974685769614062]
We propose Multimodal ECG-Text Self-supervised pre-training (METS). We use a trainable ECG encoder and a frozen language model to embed paired ECG and automatically machine-generated clinical reports separately. In downstream classification tasks, METS achieves around 10% improvement in performance without using any annotated data.
arXiv Detail & Related papers (2023-03-22T05:01:14Z)
Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning [5.675787521359948]
We propose SSL-based multimodality ECG classification. Our proposed network follows the SSL paradigm and consists of two modules corresponding to the pre-stream task and the downstream task. To evaluate the effectiveness of our approach, ten-fold cross validation on the 12-lead PhysioNet 2020 dataset has been conducted.
arXiv Detail & Related papers (2022-09-30T18:45:34Z)
Self-supervised contrastive learning of echocardiogram videos enables label-efficient cardiac disease diagnosis [48.64462717254158]
We developed a self-supervised contrastive learning approach, EchoCLR, catered to echocardiogram videos. When fine-tuned on small portions of labeled data, EchoCLR pretraining significantly improved classification performance for left ventricular hypertrophy (LVH) and aortic stenosis (AS). EchoCLR is unique in its ability to learn representations of medical videos and demonstrates that SSL can enable label-efficient disease classification from small, labeled datasets.
arXiv Detail & Related papers (2022-07-23T19:17:26Z)
Dissecting Self-Supervised Learning Methods for Surgical Computer Vision [51.370873913181605]
Self-Supervised Learning (SSL) methods have begun to gain traction in the general computer vision community. The effectiveness of SSL methods in more complex and impactful domains, such as medicine and surgery, remains limited and unexplored. We present an extensive analysis of the performance of these methods on the Cholec80 dataset for two fundamental and popular tasks in surgical context understanding, phase recognition and tool presence detection.
arXiv Detail & Related papers (2022-07-01T14:17:11Z)
Uncovering the structure of clinical EEG signals with self-supervised learning [64.4754948595556]
Supervised learning paradigms are often limited by the amount of labeled data that is available. This phenomenon is particularly problematic in clinically-relevant data, such as electroencephalography (EEG). By extracting information from unlabeled data, it might be possible to reach competitive performance with deep neural networks.
arXiv Detail & Related papers (2020-07-31T14:34:47Z)
ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings. We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework. The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.