Related papers: Towards Cardiac MRI Foundation Models: Comprehensive Visual-Tabular Representations for Whole-Heart Assessment and Beyond

URL: http://arxiv.org/abs/2504.13037v2
Date: Fri, 18 Apr 2025 09:26:55 GMT
Title: Towards Cardiac MRI Foundation Models: Comprehensive Visual-Tabular Representations for Whole-Heart Assessment and Beyond
Authors: Yundi Zhang, Paul Hager, Che Liu, Suprosanna Shit, Chen Chen, Daniel Rueckert, Jiazhen Pan
Abstract summary: ViTa integrates 3D+T cine stacks from short-axis and long-axis views, enabling a complete capture of the cardiac cycle. This multi-modal paradigm supports a wide spectrum of downstream tasks, including cardiac phenotype and physiological feature prediction. By learning a shared latent representation that bridges rich imaging features and patient context, ViTa moves beyond traditional task-specific models.
Score: 17.12109841946122
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Cardiac magnetic resonance imaging is the gold standard for non-invasive cardiac assessment, offering rich spatio-temporal views of the cardiac anatomy and physiology. Patient-level health factors, such as demographics, metabolic, and lifestyle, are known to substantially influence cardiovascular health and disease risk, yet remain uncaptured by CMR alone. To holistically understand cardiac health and to enable the best possible interpretation of an individual's disease risk, CMR and patient-level factors must be jointly exploited within an integrated framework. Recent multi-modal approaches have begun to bridge this gap, yet they often rely on limited spatio-temporal data and focus on isolated clinical tasks, thereby hindering the development of a comprehensive representation for cardiac health evaluation. To overcome these limitations, we introduce ViTa, a step toward foundation models that delivers a comprehensive representation of the heart and a precise interpretation of individual disease risk. Leveraging data from 42,000 UK Biobank participants, ViTa integrates 3D+T cine stacks from short-axis and long-axis views, enabling a complete capture of the cardiac cycle. These imaging data are then fused with detailed tabular patient-level factors, enabling context-aware insights. This multi-modal paradigm supports a wide spectrum of downstream tasks, including cardiac phenotype and physiological feature prediction, segmentation, and classification of cardiac and metabolic diseases within a single unified framework. By learning a shared latent representation that bridges rich imaging features and patient context, ViTa moves beyond traditional, task-specific models toward a universal, patient-specific understanding of cardiac health, highlighting its potential to advance clinical utility and scalability in cardiac analysis.

Related papers

Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG [40.407824759778784]
PTACL (Patient and Temporal Alignment Contrastive Learning) is a multimodal contrastive learning framework that enhances ECG representations by integrating spatio-temporal information from CMR. We evaluate PTACL on paired ECG-CMR data from 27,951 subjects in the UK Biobank. Our results highlight the potential of PTACL to enhance non-invasive cardiac diagnostics using ECG.
arXiv Detail & Related papers (2025-06-24T17:19:39Z)
MOSCARD -- Causal Reasoning and De-confounding for Multimodal Opportunistic Screening of Cardiovascular Adverse Events [3.206697649226124]
Major Adverse Cardiovascular Events (MACE) remain the leading cause of mortality globally, as reported in the Global Disease Study 2021. Opportunistic screening leverages data collected from routine health check-ups and multimodal data can play a key role to identify at-risk individuals. We propose a novel predictive modeling framework - MOSCARD, multimodal causal reasoning with co-attention to align two distinct modalities and simultaneously mitigate bias and confounders in opportunistic risk estimation.
arXiv Detail & Related papers (2025-06-23T22:28:37Z)
Heartcare Suite: Multi-dimensional Understanding of ECG with Raw Multi-lead Signal Modeling [50.58126509704037]
Heartcare Suite is a framework for fine-grained electrocardiogram (ECG) understanding. Heartcare-220K is a high-quality, structured, and comprehensive multimodal ECG dataset. Heartcare-Bench is a benchmark to guide the optimization of Medical Multimodal Large Language Models (Med-MLLMs) in ECG scenarios.
arXiv Detail & Related papers (2025-06-06T07:56:41Z)
Adaptable Cardiovascular Disease Risk Prediction from Heterogeneous Data using Large Language Models [70.64969663547703]
AdaCVD is an adaptable CVD risk prediction framework built on large language models extensively fine-tuned on over half a million participants from the UK Biobank. It addresses key clinical challenges across three dimensions: it flexibly incorporates comprehensive yet variable patient information; it seamlessly integrates both structured data and unstructured text; and it rapidly adapts to new patient populations using minimal additional data.
arXiv Detail & Related papers (2025-05-30T14:42:02Z)
Leveraging Cardiovascular Simulations for In-Vivo Prediction of Cardiac Biomarkers [43.17768785084301]
We train an amortized neural posterior estimator on a newly built large dataset of cardiac simulations. We incorporate elements modeling effects to better align simulated data with real-world measurements. The proposed framework can further integrate in-vivo data sources to refine its predictive capabilities on real-world data.
arXiv Detail & Related papers (2024-12-23T13:05:17Z)
Enhancing Cardiovascular Disease Prediction through Multi-Modal Self-Supervised Learning [0.17708284654788597]
We propose a comprehensive framework for enhancing cardiovascular disease prediction with limited annotated datasets. We employ a masked autoencoder to pre-train the electrocardiogram (ECG) encoder, enabling it to extract relevant features from raw electrocardiogram data. We fine-tuned the pre-trained encoders on specific predictive tasks, such as myocardial infarction.
arXiv Detail & Related papers (2024-11-08T16:32:30Z)
CTPD: Cross-Modal Temporal Pattern Discovery for Enhanced Multimodal Electronic Health Records Analysis [46.56667527672019]
We introduce a Cross-Modal Temporal Pattern Discovery (CTPD) framework, designed to efficiently extract meaningful cross-modal temporal patterns from multimodal EHR data. Our approach introduces shared initial temporal pattern representations which are refined using slot attention to generate temporal semantic embeddings.
arXiv Detail & Related papers (2024-11-01T15:54:07Z)
CardiacNet: Learning to Reconstruct Abnormalities for Cardiac Disease Assessment from Echocardiogram Videos [10.06966396329022]
We propose a novel reconstruction-based approach named CardiacNet to learn a better representation of local cardiac structures and motion abnormalities. CardiacNet is accompanied by the Consistency Deformation Codebook (CDC) and the Consistency Deformed-Discriminator (CDD) to learn the commonalities across abnormal and normal samples. In experiments, our CardiacNet can achieve state-of-the-art results in three different cardiac disease assessment tasks.
arXiv Detail & Related papers (2024-10-28T06:11:03Z)
CMRxRecon2024: A Multi-Modality, Multi-View K-Space Dataset Boosting Universal Machine Learning for Accelerated Cardiac MRI [40.11088079783521]
The CMRxRecon2024 dataset is the largest and most protocol-diverse publicly available cardiac k-space dataset. It is acquired from 330 healthy volunteers, covering commonly used modalities, anatomical views, and acquisition trajectories in clinical cardiac MRI.
arXiv Detail & Related papers (2024-06-27T09:50:20Z)
Cardiac Copilot: Automatic Probe Guidance for Echocardiography with World Model [66.35766658717205]
There is a severe shortage of experienced cardiac sonographers, due to the heart's complex structure and significant operational challenges. We present a Cardiac Copilot system capable of providing real-time probe movement guidance. The core innovation lies in proposing a data-driven world model, named Cardiac Dreamer, for representing cardiac spatial structures. We train our model with real-world ultrasound data and corresponding probe motion from 110 routine clinical scans with 151K sample pairs by three certified sonographers.
arXiv Detail & Related papers (2024-06-19T02:42:29Z)
TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data [42.96821770394798]
TACCO is a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data. We conduct experiments on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction. In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by TACCO.
arXiv Detail & Related papers (2024-06-14T14:18:38Z)
Whole Heart 3D+T Representation Learning Through Sparse 2D Cardiac MR Images [13.686473040836113]
We introduce a whole-heart self-supervised learning framework to automatically uncover the correlations between spatial and temporal patches throughout the cardiac stacks. We train our model on 14,000 unlabeled CMR data from UK BioBank and evaluate it on 1,000 annotated data.
arXiv Detail & Related papers (2024-06-01T07:08:45Z)
A Generalizable Deep Learning System for Cardiac MRI [29.429744474335347]
We describe a foundational vision system for cardiac MRI, capable of representing the breadth of human cardiovascular disease and health. Our deep learning model is trained via self-supervised contrastive learning, by which visual concepts in cine-sequence cardiac MRI scans are learned from the raw text of the accompanying radiology reports. We show that our deep learning system is capable of not only understanding the staggering complexity of human cardiovascular disease, but can be directed towards clinical problems of interest yielding impressive, clinical grade diagnostic accuracy with a fraction of the training data typically required for such tasks.
arXiv Detail & Related papers (2023-12-01T05:27:29Z)
Continuous 3D Myocardial Motion Tracking via Echocardiography [30.19879953016694]
Myocardial motion tracking is an essential clinical tool in the prevention and detection of cardiovascular diseases. Current techniques suffer from incomplete and inaccurate motion estimation of the myocardium in both spatial and temporal dimensions. This paper introduces the Neural Cardiac Motion Field (NeuralCMF) to model the 3D structure and the comprehensive 6D forward/backward motion of the heart.
arXiv Detail & Related papers (2023-10-04T13:11:20Z)
Three-dimensional micro-structurally informed in silico myocardium -- towards virtual imaging trials in cardiac diffusion weighted MRI [58.484353709077034]
We propose a novel method to generate a realistic numerical phantom of myocardial microstructure. In-silico tissue models enable evaluating quantitative models of magnetic resonance imaging.
arXiv Detail & Related papers (2022-08-22T22:01:44Z)
MyoPS: A Benchmark of Myocardial Pathology Segmentation Combining Three-Sequence Cardiac Magnetic Resonance Images [84.02849948202116]
This work defines a new task of medical image analysis, i.e., to perform myocardial pathology segmentation (MyoPS) combining three-sequence cardiac magnetic resonance (CMR) images, which was first proposed in the MyoPS challenge, in conjunction with MICCAI 2020. The challenge provided 45 paired and pre-aligned CMR images, allowing algorithms to combine the complementary information from the three CMR sequences for pathology segmentation.
arXiv Detail & Related papers (2022-01-10T06:37:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.