Sensing Cardiac Health Across Scenarios and Devices: A Multi-Modal Foundation Model Pretrained on Heterogeneous Data from 1.7 Million Individuals
- URL: http://arxiv.org/abs/2507.01045v1
- Date: Mon, 23 Jun 2025 20:58:12 GMT
- Title: Sensing Cardiac Health Across Scenarios and Devices: A Multi-Modal Foundation Model Pretrained on Heterogeneous Data from 1.7 Million Individuals
- Authors: Xiao Gu, Wei Tang, Jinpei Han, Veer Sangha, Fenglin Liu, Shreyank N Gowda, Antonio H. Ribeiro, Patrick Schwab, Kim Branson, Lei Clifton, Antonio Luiz P. Ribeiro, Zhangdaihong Liu, David A. Clifton,
- Abstract summary: We present a cardiac sensing foundation model (CSFM) that learns unified representations from vast, heterogeneous health records.<n>Our model is pretrained on an innovative multi-modal integration of data from multiple large-scale datasets.<n> CSFM consistently outperforms traditional one-modal-one-task approaches.
- Score: 36.08910150609342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cardiac biosignals, such as electrocardiograms (ECG) and photoplethysmograms (PPG), are of paramount importance for the diagnosis, prevention, and management of cardiovascular diseases, and have been extensively used in a variety of clinical tasks. Conventional deep learning approaches for analyzing these signals typically rely on homogeneous datasets and static bespoke models, limiting their robustness and generalizability across diverse clinical settings and acquisition protocols. In this study, we present a cardiac sensing foundation model (CSFM) that leverages advanced transformer architectures and a generative, masked pretraining strategy to learn unified representations from vast, heterogeneous health records. Our model is pretrained on an innovative multi-modal integration of data from multiple large-scale datasets (including MIMIC-III-WDB, MIMIC-IV-ECG, and CODE), comprising cardiac signals and the corresponding clinical or machine-generated text reports from approximately 1.7 million individuals. We demonstrate that the embeddings derived from our CSFM not only serve as effective feature extractors across diverse cardiac sensing scenarios, but also enable seamless transfer learning across varying input configurations and sensor modalities. Extensive evaluations across diagnostic tasks, demographic information recognition, vital sign measurement, clinical outcome prediction, and ECG question answering reveal that CSFM consistently outperforms traditional one-modal-one-task approaches. Notably, CSFM exhibits robust performance across multiple ECG lead configurations from standard 12-lead systems to single-lead setups, and in scenarios where only ECG, only PPG, or a combination thereof is available. These findings highlight the potential of CSFM as a versatile and scalable solution, for comprehensive cardiac monitoring.
Related papers
- Cardiac-CLIP: A Vision-Language Foundation Model for 3D Cardiac CT Images [29.39287623923477]
We present Cardiac-CLIP, a multi-modal foundation model designed for 3D cardiac CT images.<n>CLIP is developed through a two-stage pre-training strategy.<n>CLIP is evaluated across multiple tasks, including cardiovascular abnormality classification, information retrieval and clinical analysis.
arXiv Detail & Related papers (2025-07-29T17:20:32Z) - MEETI: A Multimodal ECG Dataset from MIMIC-IV-ECG with Signals, Images, Features and Interpretations [12.00096975933262]
Electrocardiogram (ECG) plays a foundational role in modern cardiovascular care, enabling non-invasive diagnosis of arrhythmias, myocardial ischemia, and conduction disorders.<n>Most existing ECG datasets provide only single-modality data or, at most, dual modalities, making it difficult to build models that can understand and integrate diverse ECG information in real-world settings.<n>We introduce MEETI, the first large-scale ECG dataset that synchronizes raw waveform data, high-resolution plotted images, and detailed textual interpretations generated by large language models.
arXiv Detail & Related papers (2025-07-21T05:32:44Z) - Clinical NLP with Attention-Based Deep Learning for Multi-Disease Prediction [44.0876796031468]
This paper addresses the challenges posed by the unstructured nature and high-dimensional semantic complexity of electronic health record texts.<n>A deep learning method based on attention mechanisms is proposed to achieve unified modeling for information extraction and multi-label disease prediction.
arXiv Detail & Related papers (2025-07-02T07:45:22Z) - Manifold Learning for Personalized and Label-Free Detection of Cardiac Arrhythmias [0.9208007322096533]
We show that nonlinear dimensionality reduction (NLDR) can accommodate medically relevant features in ECG signals.<n>Using the MLII and V1 leads of the MIT-BIH dataset, we demonstrate that NLDR holds much promise for cardiac monitoring.
arXiv Detail & Related papers (2025-06-19T17:39:57Z) - Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling [50.58126509704037]
Heartcare Suite is a framework for fine-grained electrocardiogram (ECG) understanding.<n>Heartcare-220K is a high-quality, structured, and comprehensive multimodal ECG dataset.<n>Heartcare-Bench is a benchmark to guide the optimization of Medical Multimodal Large Language Models (Med-MLLMs) in ECG scenarios.
arXiv Detail & Related papers (2025-06-06T07:56:41Z) - Electrocardiogram (ECG) Based Cardiac Arrhythmia Detection and Classification using Machine Learning Algorithms [0.0]
Machine Learning (ML) and Deep Learning (DL) have opened new prospects in medical sciences for improved diagnosis, prognosis, and treatment of severe health conditions.<n>This paper focuses on the development of an ML model with high predictive accuracy to classify arrhythmic electrocardiogram (ECG) signals.
arXiv Detail & Related papers (2024-12-07T08:29:44Z) - FedCVD: The First Real-World Federated Learning Benchmark on Cardiovascular Disease Data [52.55123685248105]
Cardiovascular diseases (CVDs) are currently the leading cause of death worldwide, highlighting the critical need for early diagnosis and treatment.
Machine learning (ML) methods can help diagnose CVDs early, but their performance relies on access to substantial data with high quality.
This paper presents the first real-world FL benchmark for cardiovascular disease detection, named FedCVD.
arXiv Detail & Related papers (2024-10-28T02:24:01Z) - Foundation Models for ECG: Leveraging Hybrid Self-Supervised Learning for Advanced Cardiac Diagnostics [2.948318253609515]
Using foundation models enhanced by self-supervised learning (SSL) methods presents an innovative approach to electrocardiogram (ECG) analysis.
This study comprehensively evaluates foundation models for ECGs, leveraging SSL methods, including generative and contrastive learning.
We developed a Hybrid Learning (HL) for foundation models that improve the precision and reliability of cardiac diagnostics.
arXiv Detail & Related papers (2024-06-26T02:24:13Z) - TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data [42.96821770394798]
TACCO is a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data.
We conduct experiments on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction.
In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by TACCO.
arXiv Detail & Related papers (2024-06-14T14:18:38Z) - MEIT: Multi-Modal Electrocardiogram Instruction Tuning on Large Language Models for Report Generation [41.324530807795256]
Electrocardiogram (ECG) is the primary non-invasive diagnostic tool for monitoring cardiac conditions.
Recent studies have concentrated on classifying cardiac conditions using ECG data but have overlooked ECG report generation.
We propose the Multimodal ECG Instruction Tuning (MEIT) framework, the first attempt to tackle ECG report generation with LLMs and multimodal instructions.
arXiv Detail & Related papers (2024-03-07T23:20:56Z) - Prospects for AI-Enhanced ECG as a Unified Screening Tool for Cardiac and Non-Cardiac Conditions -- An Explorative Study in Emergency Care [0.9503773054285559]
We investigate the capability of a single model to predict a diverse range of both cardiac and non-cardiac discharge diagnoses based on a sole ECG collected in the emergency department.
We find that 253, 81 cardiac, and 172 non-cardiac, ICD codes can be reliably predicted in the sense of exceeding an AUROC score of 0.8 in a statistically significant manner.
arXiv Detail & Related papers (2023-12-18T09:29:42Z) - Hierarchical Deep Learning with Generative Adversarial Network for
Automatic Cardiac Diagnosis from ECG Signals [2.5008947886814186]
We propose a two-level hierarchical deep learning framework with Generative Adversarial Network (GAN) for automatic diagnosis of ECG signals.
The first-level model is composed of a Memory-Augmented Deep auto-Encoder with GAN, which aims to differentiate abnormal signals from normal ECGs for anomaly detection.
The second-level learning aims at robust multi-class classification for different arrhythmias identification.
arXiv Detail & Related papers (2022-10-19T12:29:05Z) - ECG-DelNet: Delineation of Ambulatory Electrocardiograms with Mixed
Quality Labeling Using Neural Networks [69.25956542388653]
Deep learning (DL) algorithms are gaining weight in academic and industrial settings.
We demonstrate DL can be successfully applied to low interpretative tasks by embedding ECG detection and delineation onto a segmentation framework.
The model was trained using PhysioNet's QT database, comprised of 105 ambulatory ECG recordings.
arXiv Detail & Related papers (2020-05-11T16:29:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.