anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding
- URL: http://arxiv.org/abs/2506.00942v1
- Date: Sun, 01 Jun 2025 10:17:13 GMT
- Title: anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding
- Authors: Haitao Li, Ziyu Li, Yiheng Mao, Ziyi Liu, Zhoujian Sun, Zhengxing Huang,
- Abstract summary: multimodal large language models (MLLMs) have sparked interest in their application to electrocardiogram (ECG) analysis.<n>Existing ECG-focused MLLMs primarily focus on report generation tasks, often limited to single 12-lead, short-duration (10s) ECG inputs.<n>We propose the anyECG-chat model, which supports dynamic-length ECG inputs and multiple ECG inputs.
- Score: 20.290531515033518
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The advent of multimodal large language models (MLLMs) has sparked interest in their application to electrocardiogram (ECG) analysis. However, existing ECG-focused MLLMs primarily focus on report generation tasks, often limited to single 12-lead, short-duration (10s) ECG inputs, thereby underutilizing the potential of MLLMs. To this end, we aim to develop a MLLM for ECG analysis that supports a broader range of tasks and more flexible ECG inputs. However, existing ECG-QA datasets are often monotonous. To address this gap, we first constructed the anyECG dataset, which encompasses a wide variety of tasks, including report generation, abnormal waveform localization, and open-ended question answering. In addition to standard hospital ECGs, we introduced long-duration reduced-lead ECGs for home environments and multiple ECG comparison scenarios commonly encountered in clinical practice. Furthermore, we propose the anyECG-chat model, which supports dynamic-length ECG inputs and multiple ECG inputs. We trained the model using a three-stage curriculum training recipe with the anyECG dataset. A comprehensive evaluation was conducted, demonstrating that anyECG-chat is capable of supporting various practical application scenarios, including not only common report generation tasks but also abnormal waveform localization for long-duration reduced-lead ECGs in home environments and comprehensive comparative analysis of multiple ECGs.
Related papers
- From Token to Rhythm: A Multi-Scale Approach for ECG-Language Pretraining [22.214252217020174]
We introduce MELP, a novel Multi-scale ECG-Language Pretraining (MELP) model that fully leverages hierarchical supervision from ECG-text pairs.<n>We evaluate MELP on three public ECG datasets across multiple tasks, including zero-shot ECG classification, linear probing, and transfer learning.
arXiv Detail & Related papers (2025-06-11T07:22:17Z) - Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling [50.58126509704037]
Heartcare Suite is a framework for fine-grained electrocardiogram (ECG) understanding.<n>Heartcare-220K is a high-quality, structured, and comprehensive multimodal ECG dataset.<n>Heartcare-Bench is a benchmark to guide the optimization of Medical Multimodal Large Language Models (Med-MLLMs) in ECG scenarios.
arXiv Detail & Related papers (2025-06-06T07:56:41Z) - GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images [43.65650710265957]
We introduce GEM, the first MLLM unifying ECG time series, 12-lead ECG images and text for grounded and clinician-aligned ECG interpretation.<n> GEM enables feature-grounded analysis, evidence-driven reasoning, and a clinician-like diagnostic process through three core innovations.<n>We propose the Grounded ECG task, a clinically motivated benchmark designed to assess the MLLM's capability in grounded ECG understanding.
arXiv Detail & Related papers (2025-03-08T05:48:53Z) - Multimodal Lead-Specific Modeling of ECG for Low-Cost Pulmonary Hypertension Assessment [71.69065905466567]
Pulmonary hypertension (PH) is frequently underdiagnosed in low- and middle-income countries (LMICs) due to the scarcity of advanced diagnostic tools.<n>We propose Lead-Specific Electrocardiogram Multimodal Variational Autoencoder (LS-EMVAE), a model pre-trained on large-population 12L-ECG data.<n>LS-EMVAE makes better predictions on both 12L-ECG and 6L-ECG at inference, making it an equitable solution for areas with limited or no diagnostic tools.
arXiv Detail & Related papers (2025-03-03T16:16:38Z) - AnyECG: Foundational Models for Multitask Cardiac Analysis in Real-World Settings [34.078819572852446]
Electrocardiogram (ECG) is highly sensitive in detecting acute heart attacks.<n>This paper introduces AnyECG, a foundational model designed to extract robust representations from any real-world ECG data.
arXiv Detail & Related papers (2024-11-17T17:32:58Z) - Teach Multimodal LLMs to Comprehend Electrocardiographic Images [10.577263066644194]
We introduce ECGInstruct, a comprehensive ECG image instruction tuning dataset of over one million samples.
We also develop PULSE, an MLLM tailored for ECG image comprehension.
Our experiments show that PULSE sets a new state-of-the-art, outperforming general MLLMs with an average accuracy improvement of 15% to 30%.
arXiv Detail & Related papers (2024-10-21T20:26:41Z) - An Electrocardiogram Foundation Model Built on over 10 Million Recordings with External Evaluation across Multiple Domains [17.809094003643523]
ECG Foundation Model (ECGFounder) trained on over 10 million ECGs with 150 label categories from Harvard-Emory ECG Database.<n>ECGFounder achieves expert-level performance on internal validation sets, with AUROC exceeding 0.95 for eighty diagnoses.<n>When fine-tuned, ECGFounder outperforms baseline models in demographic analysis, clinical event detection, and cross-modality cardiac rhythm diagnosis.
arXiv Detail & Related papers (2024-10-05T12:12:02Z) - Electrocardiogram Report Generation and Question Answering via Retrieval-Augmented Self-Supervised Modeling [19.513904491604794]
ECG-ReGen is a retrieval-based approach for ECG-to-text report generation and question answering.
By combining pre-training with dynamic retrieval and Large Language Model (LLM)-based refinement, ECG-ReGen effectively analyzes ECG data and answers related queries.
arXiv Detail & Related papers (2024-09-13T12:50:36Z) - MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation [41.324530807795256]
Electrocardiogram (ECG) is the primary non-invasive diagnostic tool for monitoring cardiac conditions.
Recent studies have concentrated on classifying cardiac conditions using ECG data but have overlooked ECG report generation.
We propose the Multimodal ECG Instruction Tuning (MEIT) framework, the first attempt to tackle ECG report generation with LLMs and multimodal instructions.
arXiv Detail & Related papers (2024-03-07T23:20:56Z) - PulseNet: Deep Learning ECG-signal classification using random
augmentation policy and continous wavelet transform for canines [46.09869227806991]
evaluating canine electrocardiograms (ECG) require skilled veterinarians.
Current availability of veterinary cardiologists for ECG interpretation and diagnostic support is limited.
We implement a deep convolutional neural network (CNN) approach for classifying canine electrocardiogram sequences as either normal or abnormal.
arXiv Detail & Related papers (2023-05-17T09:06:39Z) - Generalizing electrocardiogram delineation: training convolutional
neural networks with synthetic data augmentation [63.51064808536065]
Existing databases for ECG delineation are small, being insufficient in size and in the array of pathological conditions they represent.
This article delves has two main contributions. First, a pseudo-synthetic data generation algorithm was developed, based in probabilistically composing ECG traces given "pools" of fundamental segments, as cropped from the original databases, and a set of rules for their arrangement into coherent synthetic traces.
Second, two novel segmentation-based loss functions have been developed, which attempt at enforcing the prediction of an exact number of independent structures and at producing closer segmentation boundaries by focusing on a reduced number of samples.
arXiv Detail & Related papers (2021-11-25T10:11:41Z) - ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed
Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.