CardiacCLIP: Video-based CLIP Adaptation for LVEF Prediction in a Few-shot Manner
- URL: http://arxiv.org/abs/2509.17065v1
- Date: Sun, 21 Sep 2025 12:52:08 GMT
- Title: CardiacCLIP: Video-based CLIP Adaptation for LVEF Prediction in a Few-shot Manner
- Authors: Yao Du, Jiarong Guo, Xiaomeng Li
- Abstract summary: Left ventricular ejection fraction (LVEF) serves as a key indicator of heart function. Existing LVEF estimation methods depend on large-scale annotated video datasets. We propose CardiacCLIP, a video-based framework that enhances LVEF prediction through attention-based frame aggregation and multi-resolution input scaling.
- Score: 14.429336783145644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Echocardiography is a vital non-invasive modality for cardiac assessment, with left ventricular ejection fraction (LVEF) serving as a key indicator of heart function. Existing LVEF estimation methods depend on large-scale annotated video datasets, which are costly and limit adaptability across various clinical settings. Recent vision-language models for echocardiography, such as EchoCLIP, apply image-to-text pretraining but fail to capture crucial temporal dynamics and localized cardiac structures essential for accurate diagnosis. To address these challenges, we propose CardiacCLIP, a video-based framework that enhances LVEF prediction through attention-based frame aggregation and multi-resolution input scaling. Specifically, we introduce MFL (Multi Frame Learning), a novel attention-based mechanism for selectively fusing informative frames, and EchoZoom, a multi-scale feature extraction strategy that refines spatial representations of cardiac structures. As a novel adaptation of CLIP models for few-shot echocardiogram video analysis, our approach significantly improves diagnostic accuracy, reducing MAE by 2.07 on the EchoNet-Dynamic dataset under the 1-shot setting. The code is available at https://github.com/xmed-lab/CardiacCLIP.
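To illustrate the attention-based frame aggregation idea behind MFL, here is a minimal sketch of fusing per-frame embeddings into a single clip-level embedding via a learned query. This is not the paper's implementation; the function name, the dot-product scoring, and the dimensions (16 frames, 512-d CLIP embeddings) are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_frame_pooling(frame_embeddings, query):
    """Fuse per-frame embeddings into one clip-level embedding.

    frame_embeddings: (T, D) array, one image embedding per video frame.
    query: (D,) vector (learned in practice; random here).
    Returns a (D,) video embedding as an attention-weighted average,
    so informative frames can receive larger weights.
    """
    d = frame_embeddings.shape[1]
    scores = frame_embeddings @ query / np.sqrt(d)  # (T,)
    weights = softmax(scores)                       # (T,), sums to 1
    return weights @ frame_embeddings               # (D,)

rng = np.random.default_rng(0)
T, D = 16, 512                       # 16 frames, 512-d embeddings (assumed)
frames = rng.normal(size=(T, D))
query = rng.normal(size=D)
video_emb = attention_frame_pooling(frames, query)
print(video_emb.shape)  # (512,)
```

In a trained model the query (or a small scoring network) would be learned so that diagnostically informative frames, e.g. end-systole and end-diastole, dominate the average.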
Related papers
- Anatomically Constrained Transformers for Echocardiogram Analysis [38.280536446335056]
ViACT represents a deforming anatomical structure as a point set and encodes both its spatial geometry and corresponding image patches into transformer tokens. During pre-training, ViACT follows a masked autoencoding strategy that masks and reconstructs only anatomical patches. ViACT generalizes to myocardium point tracking without requiring task-specific components.
arXiv Detail & Related papers (2025-11-02T22:52:30Z)
- A DyL-UNet Framework Based on Dynamic Learning for Temporally Consistent Echocardiographic Segmentation [0.328418927821443]
We propose DyL-UNet, a dynamic learning-based temporal consistency U-Net segmentation architecture. The framework constructs an Echo-Dynamics Graph (EDG) through dynamic learning to extract dynamic information from videos. Experiments on the CAMUS and EchoNet-Dynamic datasets demonstrate that DyL-UNet maintains segmentation accuracy comparable to existing methods.
arXiv Detail & Related papers (2025-09-23T14:17:01Z)
- Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG [40.407824759778784]
PTACL (Patient and Temporal Alignment Contrastive Learning) is a multimodal contrastive learning framework that enhances ECG representations by integrating temporal information from CMR. We evaluate PTACL on paired ECG-CMR data from 27,951 subjects in the UK Biobank. Our results highlight the potential of PTACL to enhance non-invasive cardiac diagnostics using ECG.
arXiv Detail & Related papers (2025-06-24T17:19:39Z)
- Encoding of Demographic and Anatomical Information in Chest X-Ray-based Severe Left Ventricular Hypertrophy Classifiers [36.052936348670634]
We introduce a direct classification framework that predicts severe left ventricular hypertrophy from chest X-rays. Our approach achieves high AUROC and AUPRC, and employs Mutual Information Neural Estimation to quantify feature expressivity.
arXiv Detail & Related papers (2025-05-31T13:30:04Z)
- Multi-Scale Feature Fusion with Image-Driven Spatial Integration for Left Atrium Segmentation from Cardiac MRI Images [0.0]
We propose a framework that integrates DINOv2 as an encoder with a UNet-style decoder, incorporating multi-scale feature fusion and input image integration. We validate our approach on the LAScarQS 2022 dataset and demonstrate improved performance, with a 92.3% Dice and an 84.1% IoU score for the giant architecture.
arXiv Detail & Related papers (2025-02-10T16:12:46Z)
- Refining Myocardial Infarction Detection: A Novel Multi-Modal Composite Kernel Strategy in One-Class Classification [14.469786240272365]
Early detection of myocardial infarction (MI) is vital to prevent further damage.
This study introduces a novel method for early MI detection using a one-class classification (OCC) algorithm in echocardiography.
Our proposed multi-view approach achieves a geometric mean of 71.24%, signifying a substantial advancement in echocardiography-based MI diagnosis.
arXiv Detail & Related papers (2024-02-09T16:41:50Z)
- MEDPSeg: Hierarchical polymorphic multitask learning for the segmentation of ground-glass opacities, consolidation, and pulmonary structures on computed tomography [37.119000111386924]
MEDPSeg learns from heterogeneous chest CT targets through hierarchical polymorphic multitask learning (HPML).
We show PML enabling new state-of-the-art performance for GGO and consolidation segmentation tasks.
In addition, MEDPSeg simultaneously performs segmentation of the lung parenchyma, airways, pulmonary artery, and lung lesions, all in a single forward prediction.
arXiv Detail & Related papers (2023-12-04T21:46:39Z)
- Semantic-aware Temporal Channel-wise Attention for Cardiac Function Assessment [69.02116920364311]
Existing video-based methods do not pay much attention to the left ventricular region, nor the left ventricular changes caused by motion.
We propose a semi-supervised auxiliary learning paradigm with a left ventricular segmentation task, which contributes to the representation learning for the left ventricular region.
Our approach achieves state-of-the-art performance on the Stanford dataset with an improvement of 0.22 MAE, 0.26 RMSE, and 1.9% $R^2$.
arXiv Detail & Related papers (2023-10-09T05:57:01Z)
- SimLVSeg: Simplifying Left Ventricular Segmentation in 2D+Time Echocardiograms with Self- and Weakly-Supervised Learning [0.8672882547905405]
We develop SimLVSeg, a video-based network for consistent left ventricular (LV) segmentation from sparsely annotated echocardiogram videos.
SimLVSeg consists of self-supervised pre-training with temporal masking, followed by weakly supervised learning tailored for LV segmentation from sparse annotations.
We demonstrate how SimLVSeg outperforms the state-of-the-art solutions by achieving a 93.32% dice score on the largest 2D+time echocardiography dataset.
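The Dice score cited above is the standard overlap metric for segmentation. As a minimal illustration (not code from the paper), it can be computed over binary masks as:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice overlap between two binary masks: 2|A∩B| / (|A| + |B|).

    Returns 1.0 for perfect overlap, 0.0 for disjoint masks;
    eps guards against division by zero on two empty masks.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy example: a 2x2 predicted region vs. a 2x3 ground-truth region.
a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1
print(round(dice_score(a, b), 3))  # 0.8
```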
arXiv Detail & Related papers (2023-09-30T18:13:41Z)
- M(otion)-mode Based Prediction of Ejection Fraction using Echocardiograms [13.112371567924802]
We propose using the M(otion)-mode of echocardiograms for estimating the left ventricular ejection fraction (EF) and classifying cardiomyopathy.
We generate multiple artificial M-mode images from a single echocardiogram and combine them using off-the-shelf model architectures.
Our experiments show that the supervised setting converges with only ten modes and is comparable to the baseline method.
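An artificial M-mode image of the kind described above can be sketched as sampling a fixed scan line from every frame and stacking the samples over time. This is an illustrative reconstruction, not the authors' code; the function name, endpoint parameters, and nearest-neighbour sampling are assumptions.

```python
import numpy as np

def artificial_m_mode(video, p0, p1, n_samples=64):
    """Build an artificial M-mode image from a grayscale echo clip.

    video: (T, H, W) array, one 2D frame per time step.
    p0, p1: (row, col) endpoints of the scan line (a hypothetical choice,
            e.g. a line through the left ventricle).
    Returns an (n_samples, T) image: space along rows, time along columns,
    using nearest-neighbour sampling along the line.
    """
    T = video.shape[0]
    rows = np.linspace(p0[0], p1[0], n_samples).round().astype(int)
    cols = np.linspace(p0[1], p1[1], n_samples).round().astype(int)
    return np.stack([video[t, rows, cols] for t in range(T)], axis=1)

# Toy clip: 8 frames of 32x32 noise, scan line across the diagonal.
rng = np.random.default_rng(0)
clip = rng.normal(size=(8, 32, 32))
m_mode = artificial_m_mode(clip, (0, 0), (31, 31), n_samples=16)
print(m_mode.shape)  # (16, 8)
```

Varying the line's endpoints yields the multiple M-mode views that the paper combines with off-the-shelf architectures.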
arXiv Detail & Related papers (2023-09-07T15:00:58Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Heart Sound Segmentation using Bidirectional LSTMs with Attention [37.62160903348547]
We propose a novel framework for the segmentation of phonocardiogram (PCG) signals into heart states.
We exploit recent advancements in attention based learning to segment the PCG signal.
The proposed method attains state-of-the-art performance on multiple benchmarks including both human and animal heart recordings.
arXiv Detail & Related papers (2020-04-02T02:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.