CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?
- URL: http://arxiv.org/abs/2510.00520v1
- Date: Wed, 01 Oct 2025 05:09:48 GMT
- Title: CardioBench: Do Echocardiography Foundation Models Generalize Beyond the Lab?
- Authors: Darya Taratynova, Ahmed Aly, Numan Saeed, Mohammad Yaqub
- Abstract summary: Foundation models (FMs) are reshaping medical imaging, yet their application in echocardiography remains limited. We introduce CardioBench, a comprehensive benchmark for echocardiography FMs. CardioBench unifies eight publicly available datasets into a standardized suite spanning four regression and five classification tasks.
- Score: 4.445909648625997
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Foundation models (FMs) are reshaping medical imaging, yet their application in echocardiography remains limited. While several echocardiography-specific FMs have recently been introduced, no standardized benchmark exists to evaluate them. Echocardiography poses unique challenges, including noisy acquisitions, high frame redundancy, and limited public datasets. Most existing solutions evaluate on private data, restricting comparability. To address this, we introduce CardioBench, a comprehensive benchmark for echocardiography FMs. CardioBench unifies eight publicly available datasets into a standardized suite spanning four regression and five classification tasks, covering functional, structural, diagnostic, and view recognition endpoints. We evaluate several leading FMs, including cardiac-specific, biomedical, and general-purpose encoders, under consistent zero-shot, probing, and alignment protocols. Our results highlight complementary strengths across model families: temporal modeling is critical for functional regression, retrieval provides robustness under distribution shift, and domain-specific text encoders capture physiologically meaningful axes. General-purpose encoders transfer strongly and often close the gap with probing, but struggle with fine-grained distinctions like view classification and subtle pathology recognition. By releasing preprocessing, splits, and public evaluation pipelines, CardioBench establishes a reproducible reference point and offers actionable insights to guide the design of future echocardiography foundation models.
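The zero-shot and probing protocols named in the abstract can be sketched in miniature. This is an illustrative sketch only, not CardioBench's actual pipeline: the embeddings and class prototypes below are synthetic stand-ins for what a foundation model's video and text encoders would produce, and the "probe" is a simple nearest-centroid classifier rather than any method from the paper.

```python
# Illustrative sketch of zero-shot vs. probing evaluation (all data synthetic;
# a real setup would use frozen foundation-model embeddings).
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def zero_shot(embedding, prototypes):
    """Zero-shot classification: pick the class whose prototype
    (e.g. a text-encoder embedding of the class name) is closest."""
    return max(prototypes, key=lambda c: cosine(embedding, prototypes[c]))

def fit_centroid_probe(embeddings, labels):
    """Lightweight 'probe': per-class mean of labeled embeddings,
    fitted on top of a frozen encoder's outputs."""
    sums, counts = {}, {}
    for e, y in zip(embeddings, labels):
        acc = sums.setdefault(y, [0.0] * len(e))
        counts[y] = counts.get(y, 0) + 1
        for i, x in enumerate(e):
            acc[i] += x
    return {y: [x / counts[y] for x in acc] for y, acc in sums.items()}
```

A fitted centroid probe can then be passed to `zero_shot` in place of text prototypes, which makes the two protocols directly comparable on the same embeddings.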
Related papers
- Enabling Ultra-Fast Cardiovascular Imaging Across Heterogeneous Clinical Environments with a Generalist Foundation Model and Multimodal Database [64.65360708629485]
MMCMR-427K is the largest and most comprehensive multimodal cardiovascular magnetic resonance k-space database. CardioMM is a reconstruction foundation model capable of adapting to heterogeneous fast CMR imaging scenarios. CardioMM unifies semantic contextual understanding with physics-informed data consistency to deliver robust reconstructions.
arXiv Detail & Related papers (2025-12-25T12:47:50Z) - Echo-CoPilot: A Multi-View, Multi-Task Agent for Echocardiography Interpretation and Reporting [8.162197738994479]
We introduce Echo-CoPilot, a multi-view, multi-task agent that uses a large language model to orchestrate specialized echocardiography tools. Within a ReAct-style loop, the agent decomposes clinician queries, invokes tools for view recognition, cardiac structure segmentation, measurement and disease prediction, and report synthesis. We evaluate Echo-CoPilot on the public MIMIC-EchoQA benchmark, where it achieves an accuracy of 50.8%, outperforming both general-purpose and biomedical video vision-language models.
arXiv Detail & Related papers (2025-12-06T23:27:54Z) - Adapting HFMCA to Graph Data: Self-Supervised Learning for Generalizable fMRI Representations [57.054499278843856]
Functional magnetic resonance imaging (fMRI) analysis faces significant challenges due to limited dataset sizes and domain variability between studies. Traditional self-supervised learning methods inspired by computer vision often rely on positive and negative sample pairs. We propose adapting a recently developed Hierarchical Functional Maximal Correlation Algorithm (HFMCA) to graph-structured fMRI data.
arXiv Detail & Related papers (2025-10-05T12:35:01Z) - CardiacCLIP: Video-based CLIP Adaptation for LVEF Prediction in a Few-shot Manner [14.429336783145644]
Left ventricular ejection fraction (LVEF) serves as a key indicator of heart function. Existing LVEF estimation methods depend on large-scale annotated video datasets. We propose CardiacCLIP, a video-based framework that enhances LVEF prediction through attention-based frame aggregation and multi-resolution input scaling.
arXiv Detail & Related papers (2025-09-21T12:52:08Z) - EchoFM: Foundation Model for Generalizable Echocardiogram Analysis [22.585990526913246]
We introduce EchoFM, a foundation model specifically designed to represent and analyze echocardiography videos. In EchoFM, we propose a self-supervised learning framework that captures both spatial and temporal variability. We pre-train our model on an extensive dataset comprising over 290,000 echocardiography videos, with up to 20 million image frames.
arXiv Detail & Related papers (2024-10-30T19:32:02Z) - Sequence-aware Pre-training for Echocardiography Probe Movement Guidance [71.79421124144145]
We introduce a novel probe movement guidance algorithm that has the potential to be applied in guiding robotic systems or novices with probe pose adjustment for high-quality standard plane image acquisition. Our approach learns personalized three-dimensional cardiac structural features by predicting the masked-out image features and probe movement actions in a scanning sequence.
arXiv Detail & Related papers (2024-08-27T12:55:54Z) - Certification of Deep Learning Models for Medical Image Segmentation [44.177565298565966]
We present for the first time a certified segmentation baseline for medical imaging based on randomized smoothing and diffusion models.
Our results show that leveraging the power of denoising diffusion probabilistic models helps us overcome the limits of randomized smoothing.
arXiv Detail & Related papers (2023-10-05T16:40:33Z) - K-Space-Aware Cross-Modality Score for Synthesized Neuroimage Quality Assessment [71.27193056354741]
The problem of how to assess cross-modality medical image synthesis has been largely unexplored.
We propose a new metric K-CROSS to spur progress on this challenging problem.
K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location.
arXiv Detail & Related papers (2023-07-10T01:26:48Z) - Extraction of volumetric indices from echocardiography: which deep learning solution for clinical use? [6.144041824426555]
We show that the proposed 3D nnU-Net outperforms alternative 2D and recurrent segmentation methods.
Overall, the experimental results suggest that with sufficient training data, 3D nnU-Net could become the first automated tool to meet the standards of an everyday clinical device.
arXiv Detail & Related papers (2023-05-03T09:38:52Z) - Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z) - Factored Attention and Embedding for Unstructured-view Topic-related Ultrasound Report Generation [70.7778938191405]
We propose a novel factored attention and embedding model (termed FAE-Gen) for the unstructured-view topic-related ultrasound report generation.
The proposed FAE-Gen mainly consists of two modules, i.e., view-guided factored attention and topic-oriented factored embedding, which capture the homogeneous and heterogeneous morphological characteristics across different views.
arXiv Detail & Related papers (2022-03-12T15:24:03Z) - Reciprocal Landmark Detection and Tracking with Extremely Few Annotations [10.115679843920958]
We propose a new end-to-end reciprocal detection and tracking model to handle the sparse nature of echocardiography labels.
The model is trained using few annotated frames across the entire cardiac cine sequence to generate consistent detection and tracking of landmarks.
arXiv Detail & Related papers (2021-01-27T06:59:41Z)
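The randomized-smoothing certification mentioned in the related-papers list above ("Certification of Deep Learning Models for Medical Image Segmentation") rests on a simple general recipe: take a majority vote of a base classifier under Gaussian input noise, and derive a certified L2 radius from the vote margin. The sketch below is an illustration of that general technique under toy assumptions, not the paper's segmentation model; the base classifier, noise level, and sample count are all placeholders.

```python
# Illustrative randomized-smoothing sketch (toy classifier; not the paper's model).
import random
from statistics import NormalDist

def smoothed_predict(classify, x, sigma=0.25, n=1000, seed=0):
    """Majority vote of `classify` under Gaussian input noise.
    Returns (predicted class, certified L2 radius sigma * Phi^{-1}(p)),
    where p is the (clamped) empirical majority-vote probability."""
    rng = random.Random(seed)
    votes = {}
    for _ in range(n):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        c = classify(noisy)
        votes[c] = votes.get(c, 0) + 1
    top, count = max(votes.items(), key=lambda kv: kv[1])
    # Clamp p below 1 so the inverse normal CDF stays finite.
    p = min(count / n, 1.0 - 1.0 / (2 * n))
    radius = sigma * NormalDist().inv_cdf(p) if p > 0.5 else 0.0
    return top, radius
```

A point far from the base classifier's decision boundary wins nearly every noisy vote and so receives a large certified radius; a point near the boundary gets a small or zero radius.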
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.