Multimodal Lead-Specific Modeling of ECG for Low-Cost Pulmonary Hypertension Assessment
- URL: http://arxiv.org/abs/2503.13470v1
- Date: Mon, 03 Mar 2025 16:16:38 GMT
- Title: Multimodal Lead-Specific Modeling of ECG for Low-Cost Pulmonary Hypertension Assessment
- Authors: Mohammod N. I. Suvon, Shuo Zhou, Prasun C. Tripathi, Wenrui Fan, Samer Alabed, Bishesh Khanal, Venet Osmani, Andrew J. Swift, Chen, Chen, Haiping Lu,
- Abstract summary: Pulmonary hypertension (PH) is frequently underdiagnosed in low- and middle-income countries (LMICs) due to the scarcity of advanced diagnostic tools.<n>We propose Lead-Specific Electrocardiogram Multimodal Variational Autoencoder (LS-EMVAE), a model pre-trained on large-population 12L-ECG data.<n>LS-EMVAE makes better predictions on both 12L-ECG and 6L-ECG at inference, making it an equitable solution for areas with limited or no diagnostic tools.
- Score: 71.69065905466567
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pulmonary hypertension (PH) is frequently underdiagnosed in low- and middle-income countries (LMICs) primarily due to the scarcity of advanced diagnostic tools. Several studies in PH have applied machine learning to low-cost diagnostic tools like 12-lead ECG (12L-ECG), but they mainly focus on areas with limited resources, overlooking areas with no diagnostic tools, such as rural primary healthcare in LMICs. Recent studies have shown the effectiveness of 6-lead ECG (6L-ECG), as a cheaper and portable alternative in detecting various cardiac conditions, but its clinical value for PH detection is not well proved. Furthermore, existing methods treat 12L-/6L-ECG as a single modality, capturing only shared features while overlooking lead-specific features essential for identifying complex cardiac hemodynamic changes. In this paper, we propose Lead-Specific Electrocardiogram Multimodal Variational Autoencoder (LS-EMVAE), a model pre-trained on large-population 12L-ECG data and fine-tuned on task-specific data (12L-ECG or 6L-ECG). LS-EMVAE models each 12L-ECG lead as a separate modality and introduces a hierarchical expert composition using Mixture and Product of Experts for adaptive latent feature fusion between lead-specific and shared features. Unlike existing approaches, LS-EMVAE makes better predictions on both 12L-ECG and 6L-ECG at inference, making it an equitable solution for areas with limited or no diagnostic tools. We pre-trained LS-EMVAE on 800,000 publicly available 12L-ECG samples and fine-tuned it for two tasks: 1) PH detection and 2) phenotyping pre-/post-capillary PH, on in-house datasets of 892 and 691 subjects across 12L-ECG and 6L-ECG settings. Extensive experiments show that LS-EMVAE outperforms existing baselines in both ECG settings, while 6L-ECG achieves performance comparable to 12L-ECG, unlocking its potential for global PH screening in areas without diagnostic tools.
Related papers
- CLEF: Clinically-Guided Contrastive Learning for Electrocardiogram Foundation Models [13.613519337591507]
Single-lead ECG recording is integrated into both clinical-grade and consumer wearables.<n>While self-supervised pretraining of foundation models on unlabeled ECGs improves diagnostic performance, existing approaches do not incorporate domain knowledge from clinical metadata.<n>We introduce a novel contrastive learning approach that utilizes an established clinical risk score to adaptively weight negative pairs.
arXiv Detail & Related papers (2025-12-01T20:21:44Z) - Simulator and Experience Enhanced Diffusion Model for Comprehensive ECG Generation [52.19347532840774]
We propose SE-Diff, a novel physiological simulator and experience enhanced diffusion model for ECG generation.<n> SE-Diff integrates a lightweight ordinary differential equation (ODE)-based ECG simulator into the diffusion process via a beat decoder.<n>Extensive experiments on real-world ECG datasets demonstrate that SE-Diff improves both signal fidelity and text-ECG semantic alignment.
arXiv Detail & Related papers (2025-11-13T02:57:10Z) - A Benchmark Study of Deep Learning Methods for Multi-Label Pediatric Electrocardiogram-Based Cardiovascular Disease Classification [0.0]
This paper presents the first benchmark study of deep learning for multi-label pediatric CVD classification on the recently released ZZU-pECG dataset.<n>We evaluate four representative paradigms--ResNet-1D, BiLSTM, Transformer, and Mamba 2--under both 9-lead and 12-lead configurations.<n>All models achieved strong results, with Hamming Loss as low as 0.0069 and F1-scores above 85% in most settings.
arXiv Detail & Related papers (2025-10-04T11:08:46Z) - EchoBench: Benchmarking Sycophancy in Medical Large Vision-Language Models [82.43729208063468]
Recent benchmarks for medical Large Vision-Language Models (LVLMs) emphasize leaderboard accuracy, overlooking reliability and safety.<n>We study sycophancy -- models' tendency to uncritically echo user-provided information.<n>We introduce EchoBench, a benchmark to systematically evaluate sycophancy in medical LVLMs.
arXiv Detail & Related papers (2025-09-24T14:09:55Z) - Explainable AI (XAI) for Arrhythmia detection from electrocardiograms [0.0]
Deep learning has enabled highly accurate arrhythmia detection from electrocardiogram (ECG) signals, but limited interpretability remains a barrier to clinical adoption.<n>This study investigates the application of Explainable AI (XAI) techniques specifically adapted for time-series ECG analysis.
arXiv Detail & Related papers (2025-08-24T10:44:24Z) - Sensing Cardiac Health Across Scenarios and Devices: A Multi-Modal Foundation Model Pretrained on Heterogeneous Data from 1.7 Million Individuals [36.08910150609342]
We present a cardiac sensing foundation model (CSFM) that learns unified representations from vast, heterogeneous health records.<n>Our model is pretrained on an innovative multi-modal integration of data from multiple large-scale datasets.<n> CSFM consistently outperforms traditional one-modal-one-task approaches.
arXiv Detail & Related papers (2025-06-23T20:58:12Z) - Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling [50.58126509704037]
Heartcare Suite is a framework for fine-grained electrocardiogram (ECG) understanding.<n>Heartcare-220K is a high-quality, structured, and comprehensive multimodal ECG dataset.<n>Heartcare-Bench is a benchmark to guide the optimization of Medical Multimodal Large Language Models (Med-MLLMs) in ECG scenarios.
arXiv Detail & Related papers (2025-06-06T07:56:41Z) - xLSTM-ECG: Multi-label ECG Classification via Feature Fusion with xLSTM [14.02717596836022]
We propose xLSTM-ECG, a novel approach for multi-label classification of ECG signals.
To the best of our knowledge, this work represents the first design and application of xLSTM modules specifically adapted for multi-label ECG classification.
arXiv Detail & Related papers (2025-04-14T16:12:46Z) - GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images [44.50428701650495]
We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images and text for grounded and clinician-aligned ECG interpretation.<n> GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations.<n>We propose the Grounded ECG task, a clinically motivated benchmark designed to assess the MLLM's capability in grounded ECG understanding.
arXiv Detail & Related papers (2025-03-08T05:48:53Z) - Conditional Electrocardiogram Generation Using Hierarchical Variational Autoencoders [0.0]
We propose a conditional Nouveau VAE model for ECG signal generation (cNVAE-ECG)
This paper proposes the publicly available conditional Nouveau VAE model for ECG signal generation (cNVAE-ECG)
arXiv Detail & Related papers (2025-03-03T13:30:36Z) - An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains [17.809094003643523]
We introduce an ECG Foundation Model (ECGFounder) to broaden the diagnostic capabilities of ECG analysis.
ECGFounder was trained on over 10 million ECGs with 150 label categories from the Harvard-Emory ECG Database.
It achieves expert-level performance on internal validation sets, with AUROC exceeding 0.95 for eighty diagnoses.
arXiv Detail & Related papers (2024-10-05T12:12:02Z) - Multi-Channel Masked Autoencoder and Comprehensive Evaluations for Reconstructing 12-Lead ECG from Arbitrary Single-Lead ECG [19.74009541199362]
This study proposes a multi-channel masked autoencoder (MCMA) for reconstructing 12-Lead ECG from arbitrary single-lead ECG.
In the signal-level evaluation, the mean square errors of 0.0317 and 0.1034, Pearson correlation coefficients of 0.7885 and 0.7420.
In the feature-level evaluation, the average standard deviation of the mean heart rate across the generated 12-lead ECG is 1.0481, the coefficient of variation is 1.58%, and the range is 3.2874.
arXiv Detail & Related papers (2024-07-16T08:17:45Z) - Foundation Models for ECG: Leveraging Hybrid Self-Supervised Learning for Advanced Cardiac Diagnostics [2.948318253609515]
Using foundation models enhanced by self-supervised learning (SSL) methods presents an innovative approach to electrocardiogram (ECG) analysis.
This study comprehensively evaluates foundation models for ECGs, leveraging SSL methods, including generative and contrastive learning.
We developed a Hybrid Learning (HL) for foundation models that improve the precision and reliability of cardiac diagnostics.
arXiv Detail & Related papers (2024-06-26T02:24:13Z) - Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
Training open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LlaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z) - Zero-Shot ECG Classification with Multimodal Learning and Test-time Clinical Knowledge Enhancement [10.611952462532908]
Multimodal ECG Representation Learning (MERL) is capable of performing zero-shot ECG classification with text prompts.
We propose the Clinical Knowledge Enhanced Prompt Engineering (CKEPE) approach to exploit external expert-verified clinical knowledge databases.
MERL achieves an average AUC score of 75.2% in zero-shot classification (without training data), 3.2% higher than linear probed eSSL methods with 10% annotated training data, averaged across all six datasets.
arXiv Detail & Related papers (2024-03-11T12:28:55Z) - Learning to diagnose cirrhosis from radiological and histological labels
with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of the cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z) - A Dual-scale Lead-seperated Transformer With Lead-orthogonal Attention
And Meta-information For Ecg Classification [26.07181634056045]
This work proposes a dual-scale lead-separated transformer with lead-orthogonal attention and meta-information (DLTM-ECG)
ECG segments are interpreted as independent patches, and together with the reduced dimension signal, they form a dual-scale representation.
Our work has the potential for similar multichannel bioelectrical signal processing and physiological multimodal tasks.
arXiv Detail & Related papers (2022-11-23T08:45:34Z) - Generalizing electrocardiogram delineation: training convolutional
neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent.
This article delves has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based in probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces.
Second, two novel segmentation-based loss functions have been developed, which attempt at enforcing the prediction of an exact number of independent structures and at producing closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z) - Identification of Ischemic Heart Disease by using machine learning
technique based on parameters measuring Heart Rate Variability [50.591267188664666]
In this study, 18 non-invasive features (age, gender, left ventricular ejection fraction and 15 obtained from HRV) of 243 subjects were used to train and validate a series of several ANN.
The best result was obtained using 7 input parameters and 7 hidden nodes with an accuracy of 98.9% and 82% for the training and validation dataset.
arXiv Detail & Related papers (2020-10-29T19:14:41Z) - Interpretable Deep Learning for Automatic Diagnosis of 12-lead
Electrocardiogram [15.464768773761527]
We developed a deep neural network for multi-label classification of cardiac arrhythmias in 12-lead ECG recordings.
The proposed model achieved an average area under the receiver operating characteristic curve (AUC) of 0.970 and an average F1 score of 0.813.
The best-performing leads are lead I, aVR, and V5 among 12 leads.
arXiv Detail & Related papers (2020-10-20T14:51:00Z) - Performance of Dual-Augmented Lagrangian Method and Common Spatial
Patterns applied in classification of Motor-Imagery BCI [68.8204255655161]
Motor-imagery based brain-computer interfaces (MI-BCI) have the potential to become ground-breaking technologies for neurorehabilitation.
Due to the noisy nature of the used EEG signal, reliable BCI systems require specialized procedures for features optimization and extraction.
arXiv Detail & Related papers (2020-10-13T20:50:13Z) - ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed
Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.