An Electrocardiogram Multi-task Benchmark with Comprehensive Evaluations and Insightful Findings
- URL: http://arxiv.org/abs/2512.08954v1
- Date: Fri, 28 Nov 2025 06:47:21 GMT
- Title: An Electrocardiogram Multi-task Benchmark with Comprehensive Evaluations and Insightful Findings
- Authors: Yuhao Xu, Jiaying Lu, Sirui Ding, Defu Cao, Xiao Hu, Carl Yang
- Abstract summary: Analyzing the ECG typically requires domain expertise, which is a roadblock to applying artificial intelligence for healthcare. We evaluate language/general time-series/ECG foundation models in comparison with time-series deep learning models. In-depth analyses and insights are provided along with comprehensive experimental results.
- Score: 21.836042030973797
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the process of patient diagnosis, non-invasive measurements are widely used due to their low risks and quick results. Electrocardiogram (ECG), as a non-invasive method to collect heart activities, is used to diagnose cardiac conditions. Analyzing the ECG typically requires domain expertise, which is a roadblock to applying artificial intelligence (AI) for healthcare. Through advances in self-supervised learning and foundation models, AI systems can now acquire and leverage domain knowledge without relying solely on human expertise. However, there is a lack of comprehensive analyses over the foundation models' performance on ECG. This study aims to answer the research question: "Are Foundation Models Useful for ECG Analysis?" To address it, we evaluate language/general time-series/ECG foundation models in comparison with time-series deep learning models. The experimental results show that general time-series/ECG foundation models achieve a top performance rate of 80%, indicating their effectiveness in ECG analysis. In-depth analyses and insights are provided along with comprehensive experimental results. This study highlights the limitations and potential of foundation models in advancing physiological waveform analysis. The data and code for this benchmark are publicly available at https://github.com/yuhaoxu99/ECGMultitasks-Benchmark.
Related papers
- Detecting Structural Heart Disease from Electrocardiograms via a Generalized Additive Model of Interpretable Foundation-Model Predictors [8.817617912039616]
Structural heart disease (SHD) is a prevalent condition with many undiagnosed cases. Recent studies show that artificial intelligence (AI)-based analysis of electrocardiograms (ECGs) can detect SHD. Existing methods are fully black-box models, limiting interpretability and clinical adoption.
arXiv Detail & Related papers (2026-03-03T05:39:32Z) - Looking Beyond Accuracy: A Holistic Benchmark of ECG Foundation Models [0.3914676152740142]
This study aims to find an in-depth, comprehensive benchmarking framework for Foundation Models (FMs). We introduce a benchmark methodology that complements performance-based evaluation with representation-level analysis. We also rely on the methodology to carry out an extensive evaluation of several ECG-expert FMs pretrained via state-of-the-art techniques.
arXiv Detail & Related papers (2026-01-29T15:14:00Z) - EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model [46.84040404474695]
EnECG is an ensemble-based framework that integrates multiple specialized foundation models, each excelling in different aspects of ECG interpretation. We show that EnECG can help reduce computational and memory costs while maintaining the strong representational power of foundation models. This framework not only enhances feature extraction and predictive performance but also ensures practical efficiency for real-world clinical applications.
arXiv Detail & Related papers (2025-11-28T07:22:33Z) - Simulator and Experience Enhanced Diffusion Model for Comprehensive ECG Generation [52.19347532840774]
We propose SE-Diff, a novel physiological simulator and experience enhanced diffusion model for ECG generation. SE-Diff integrates a lightweight ordinary differential equation (ODE)-based ECG simulator into the diffusion process via a beat decoder. Extensive experiments on real-world ECG datasets demonstrate that SE-Diff improves both signal fidelity and text-ECG semantic alignment.
arXiv Detail & Related papers (2025-11-13T02:57:10Z) - Benchmarking ECG Foundational Models: A Reality Check Across Clinical Tasks [1.6873748786804317]
Foundation models promise broader adaptability, but their generalization across diverse ECG tasks is not well understood. We benchmarked eight ECG foundation models on 26 clinically relevant tasks using 12 public datasets. While foundation models show promise for adult ECG analysis, substantial gaps remain in cardiac structure, outcome prediction, and patient characterization.
arXiv Detail & Related papers (2025-09-29T17:29:48Z) - EEG-MedRAG: Enhancing EEG-based Clinical Decision-Making via Hierarchical Hypergraph Retrieval-Augmented Generation [45.031633614714]
EEG-MedRAG is a three-layer hypergraph-based retrieval-augmented generation framework. It unifies EEG domain knowledge, individual patient cases, and a large-scale repository into a traversable n-ary relational hypergraph. We introduce the first cross-disease, cross-role EEG clinical QA benchmark, spanning seven disorders and five authentic clinical perspectives.
arXiv Detail & Related papers (2025-08-19T11:12:58Z) - A Comprehensive Benchmark for Electrocardiogram Time-Series [31.656774120734358]
Electrocardiogram is crucial for assessing cardiac health and diagnosing various diseases. ECG data is often incorporated into pre-training datasets for large-scale time-series model training.
arXiv Detail & Related papers (2025-07-15T02:54:24Z) - Self-supervised inter-intra period-aware ECG representation learning for detecting atrial fibrillation [41.82319894067087]
We propose an inter-intra period-aware ECG representation learning approach.
Considering that ECGs of atrial fibrillation patients exhibit irregular RR intervals and absent P-waves, we develop specific pre-training tasks for inter-period and intra-period representations.
Our approach demonstrates remarkable AUC performance on the BTCH dataset, i.e., 0.953/0.996 for paroxysmal/persistent atrial fibrillation detection.
arXiv Detail & Related papers (2024-10-08T10:03:52Z) - An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains [17.809094003643523]
ECG Foundation Model (ECGFounder) is trained on over 10 million ECGs with 150 label categories from the Harvard-Emory ECG Database. ECGFounder achieves expert-level performance on internal validation sets, with AUROC exceeding 0.95 for eighty diagnoses. When fine-tuned, ECGFounder outperforms baseline models in demographic analysis, clinical event detection, and cross-modality cardiac rhythm diagnosis.
arXiv Detail & Related papers (2024-10-05T12:12:02Z) - CREMA: A Contrastive Regularized Masked Autoencoder for Robust ECG Diagnostics across Clinical Domains [2.9143698739149615]
We present CREMA, a foundation model for 12-lead ECGs designed to learn generalizable representations through self-supervised pretraining. CREMA combines generative learning and contrastive regularization via a Contrastive Regularized MAE loss, and employs a Signal Transformer (SiT) architecture to capture both local waveform details and global temporal dependencies.
arXiv Detail & Related papers (2024-06-26T02:24:13Z) - Prospects for AI-Enhanced ECG as a Unified Screening Tool for Cardiac and Non-Cardiac Conditions -- An Explorative Study in Emergency Care [0.9503773054285559]
We investigate the capability of a single model to predict a diverse range of both cardiac and non-cardiac discharge diagnoses based on a sole ECG collected in the emergency department.
We find that 253 ICD codes (81 cardiac and 172 non-cardiac) can be reliably predicted, in the sense of exceeding an AUROC score of 0.8 in a statistically significant manner.
arXiv Detail & Related papers (2023-12-18T09:29:42Z) - ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate that DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, which comprises 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z) - Opportunities and Challenges of Deep Learning Methods for Electrocardiogram Data: A Systematic Review [62.490310870300746]
The electrocardiogram (ECG) is one of the most commonly used diagnostic tools in medicine and healthcare.
Deep learning methods have achieved promising results on predictive healthcare tasks using ECG signals.
This paper presents a systematic review of deep learning methods for ECG data from both modeling and application perspectives.
arXiv Detail & Related papers (2019-12-28T02:44:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.