ECG-FM: An Open Electrocardiogram Foundation Model
- URL: http://arxiv.org/abs/2408.05178v2
- Date: Fri, 30 May 2025 15:29:06 GMT
- Title: ECG-FM: An Open Electrocardiogram Foundation Model
- Authors: Kaden McKeen, Sameer Masood, Augustin Toma, Barry Rubin, Bo Wang,
- Abstract summary: We present ECG-FM, an open foundation model for ECG analysis, and conduct a study using a dataset of 1.5 million ECGs.<n>ECG-FM is a transformer-based model pretrained using a hybrid contrastive and generative self-supervised learning approach.<n>We affirm that ECG-FM is robust, label-efficient, and functionally discriminative by showcasing data scaling experiments, performing a latent space analysis, and generating saliency maps.
- Score: 3.8270632390229777
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Conventional task-specific electrocardiogram (ECG) analysis models require large annotated datasets to train. Foundation models mitigate this burden by leveraging self-supervised pretraining; however, the scarcity of open-weight ECG foundation models hinders adoption and cross-study comparability. We present ECG-FM, an open foundation model for ECG analysis, and conduct a study using a dataset of 1.5 million ECGs. ECG-FM is a transformer-based model pretrained using a hybrid contrastive and generative self-supervised learning approach. Our downstream tasks include predicting reduced left ventricular ejection fraction (LVEF) and ECG interpretation labels, where we release a benchmark task on the MIMIC-IV-ECG dataset. We affirm that ECG-FM is robust, label-efficient, and functionally discriminative by showcasing data scaling experiments, performing a latent space analysis, and generating saliency maps. ECG-FM markedly outperforms task-specific models in the small-to-medium-scale data regime and demonstrates cross-dataset generalizability, achieving high AUROC on many clinically salient labels such as atrial fibrillation (0.996) and LVEF<=40% (0.929). We release our code, model weights, and benchmark task at https://github.com/bowang-lab/ECG-FM/.
Related papers
- Sensing Cardiac Health Across Scenarios and Devices: A Multi-Modal Foundation Model Pretrained on Heterogeneous Data from 1.7 Million Individuals [36.08910150609342]
We present a cardiac sensing foundation model (CSFM) that learns unified representations from vast, heterogeneous health records.<n>Our model is pretrained on an innovative multi-modal integration of data from multiple large-scale datasets.<n> CSFM consistently outperforms traditional one-modal-one-task approaches.
arXiv Detail & Related papers (2025-06-23T20:58:12Z) - GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images [43.65650710265957]
We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images and text for grounded and clinician-aligned ECG interpretation.
GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations.
We propose the Grounded ECG task, a clinically motivated benchmark designed to assess the MLLM's capability in grounded ECG understanding.
arXiv Detail & Related papers (2025-03-08T05:48:53Z) - OpenECG: Benchmarking ECG Foundation Models with Public 1.2 Million Records [2.3942438969883906]
This study introduces OpenECG, a large-scale benchmark of 1.2 million 12-lead ECG recordings from nine centers to evaluate ECG foundation models (ECG-FMs) trained on public datasets.<n>We investigate three self-supervised learning methods (SimCLR, BYOL, MAE) with ResNet-50 and Vision Transformer architectures, assessing model generalization through leave-one-dataset-out experiments and data scaling analysis.<n>Results show that pre-training on diverse datasets significantly improves generalization, with BYOL and MAE outperforming SimCLR, highlighting the efficacy of feature-consistency and generative learning over contrast
arXiv Detail & Related papers (2025-03-02T03:26:14Z) - DiffuSETS: 12-lead ECG Generation Conditioned on Clinical Text Reports and Patient-Specific Information [13.680337221159506]
Heart disease remains a significant threat to human health.
Scarcity of high-quality ECG data, driven by privacy concerns and limited medical resources, creates a pressing need for effective ECG signal generation.
We propose DiffuSETS, a novel framework capable of generating ECG signals with high semantic alignment and fidelity.
arXiv Detail & Related papers (2025-01-10T12:55:34Z) - Learning General Representation of 12-Lead Electrocardiogram with a Joint-Embedding Predictive Architecture [0.0]
We introduce ECG-JEPA, a self-supervised learning model for 12-lead ECG analysis.
It learns semantic representations of ECG data by predicting in the hidden latent space.
ECG-JEPA achieves state-of-the-art performance in various downstream tasks including ECG classification and feature prediction.
arXiv Detail & Related papers (2024-10-11T06:30:48Z) - Self-supervised inter-intra period-aware ECG representation learning for detecting atrial fibrillation [41.82319894067087]
We propose an inter-intra period-aware ECG representation learning approach.
Considering ECGs of atrial fibrillation patients exhibit the irregularity in RR intervals and the absence of P-waves, we develop specific pre-training tasks for interperiod and intraperiod representations.
Our approach demonstrates remarkable AUC performances on the BTCH dataset, textiti.e., 0.953/0.996 for paroxysmal/persistent atrial fibrillation detection.
arXiv Detail & Related papers (2024-10-08T10:03:52Z) - An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains [17.809094003643523]
We introduce an ECG Foundation Model (ECGFounder) to broaden the diagnostic capabilities of ECG analysis.
ECGFounder was trained on over 10 million ECGs with 150 label categories from the Harvard-Emory ECG Database.
It achieves expert-level performance on internal validation sets, with AUROC exceeding 0.95 for eighty diagnoses.
arXiv Detail & Related papers (2024-10-05T12:12:02Z) - ECG Semantic Integrator (ESI): A Foundation ECG Model Pretrained with LLM-Enhanced Cardiological Text [14.06147507373525]
This study introduces a new multimodal contrastive pretaining framework that aims to improve the quality and robustness of learned representations of 12-lead ECG signals.
Our framework comprises two key components, including Cardio Query Assistant (CQA) and ECG Semantics Integrator(ESI)
arXiv Detail & Related papers (2024-05-26T06:45:39Z) - MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation [41.324530807795256]
Electrocardiogram (ECG) is the primary non-invasive diagnostic tool for monitoring cardiac conditions.
Recent studies have concentrated on classifying cardiac conditions using ECG data but have overlooked ECG report generation.
We propose the Multimodal ECG Instruction Tuning (MEIT) framework, the first attempt to tackle ECG report generation with LLMs and multimodal instructions.
arXiv Detail & Related papers (2024-03-07T23:20:56Z) - Improving Diffusion Models for ECG Imputation with an Augmented Template
Prior [43.6099225257178]
noisy and poor-quality recordings are a major issue for signals collected using mobile health systems.
Recent studies have explored the imputation of missing values in ECG with probabilistic time-series models.
We present a template-guided denoising diffusion probabilistic model (DDPM), PulseDiff, which is conditioned on an informative prior for a range of health conditions.
arXiv Detail & Related papers (2023-10-24T11:34:15Z) - PulseNet: Deep Learning ECG-signal classification using random
augmentation policy and continous wavelet transform for canines [46.09869227806991]
evaluating canine electrocardiograms (ECG) require skilled veterinarians.
Current availability of veterinary cardiologists for ECG interpretation and diagnostic support is limited.
We implement a deep convolutional neural network (CNN) approach for classifying canine electrocardiogram sequences as either normal or abnormal.
arXiv Detail & Related papers (2023-05-17T09:06:39Z) - HeartBEiT: Vision Transformer for Electrocardiogram Data Improves
Diagnostic Performance at Low Sample Sizes [28.88454028731653]
We create the first vision-based transformer model, HeartBEiT, for electrocardiogram waveform analysis.
We show that HeartBEiT has significantly higher performance at lower sample sizes compared to other models.
We also show that HeartBEiT improves explainability of diagnosis by highlighting biologically relevant regions of the EKG vs. standard CNNs.
arXiv Detail & Related papers (2022-12-13T16:39:21Z) - SEVGGNet-LSTM: a fused deep learning model for ECG classification [38.747030782394646]
The input ECG signals are firstly segmented and normalized, and then fed into the combined VGG and LSTM network for feature extraction and classification.
An attention mechanism (SE block) is embedded into the core network for increasing the weight of important features.
arXiv Detail & Related papers (2022-10-31T07:36:48Z) - Leveraging Statistical Shape Priors in GAN-based ECG Synthesis [3.3482093430607267]
We propose a novel approach for ECG signal generation using Generative Adversarial Networks (GANs) and statistical ECG data modeling.
Our approach leverages prior knowledge about ECG dynamics to synthesize realistic signals, addressing the complex dynamics of ECG signals.
Our results demonstrate that our approach, which models temporal and amplitude variations of ECG signals as 2-D shapes, generates more realistic signals compared to state-of-the-art GAN based generation baselines.
arXiv Detail & Related papers (2022-10-22T18:06:11Z) - ME-GAN: Learning Panoptic Electrocardio Representations for Multi-view
ECG Synthesis Conditioned on Heart Diseases [24.52989747071257]
We propose a disease-aware generative adversarial network for multi-view ECG synthesis called ME-GAN.
Since ECG manifestations of heart diseases are often localized in specific waveforms, we propose a new "mixup normalization" to inject disease information precisely into suitable locations.
Comprehensive experiments verify that our ME-GAN performs well on multi-view ECG signal synthesis with trusty morbid manifestations.
arXiv Detail & Related papers (2022-07-21T14:14:02Z) - Generalizing electrocardiogram delineation: training convolutional
neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent.
This article delves has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based in probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces.
Second, two novel segmentation-based loss functions have been developed, which attempt at enforcing the prediction of an exact number of independent structures and at producing closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z) - ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed
Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z) - Opportunities and Challenges of Deep Learning Methods for
Electrocardiogram Data: A Systematic Review [62.490310870300746]
The electrocardiogram (ECG) is one of the most commonly used diagnostic tools in medicine and healthcare.
Deep learning methods have achieved promising results on predictive healthcare tasks using ECG signals.
This paper presents a systematic review of deep learning methods for ECG data from both modeling and application perspectives.
arXiv Detail & Related papers (2019-12-28T02:44:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.