CardiacCLIP: Video-based CLIP Adaptation for LVEF Prediction in a Few-shot Manner
- URL: http://arxiv.org/abs/2509.17065v1
- Date: Sun, 21 Sep 2025 12:52:08 GMT
- Title: CardiacCLIP: Video-based CLIP Adaptation for LVEF Prediction in a Few-shot Manner
- Authors: Yao Du, Jiarong Guo, Xiaomeng Li
- Abstract summary: Left ventricular ejection fraction (LVEF) serves as a key indicator of heart function. Existing LVEF estimation methods depend on large-scale annotated video datasets. We propose CardiacCLIP, a video-based framework that enhances LVEF prediction through attention-based frame aggregation and multi-resolution input scaling.
- Score: 14.429336783145644
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Echocardiography is a vital non-invasive modality for cardiac assessment, with left ventricular ejection fraction (LVEF) serving as a key indicator of heart function. Existing LVEF estimation methods depend on large-scale annotated video datasets, which are costly and limit adaptability across various clinical settings. Recent vision-language models for echocardiography, such as EchoCLIP, apply image-to-text pretraining but fail to capture crucial temporal dynamics and localized cardiac structures essential for accurate diagnosis. To address these challenges, we propose CardiacCLIP, a video-based framework that enhances LVEF prediction through attention-based frame aggregation and multi-resolution input scaling. Specifically, we introduce MFL (Multi Frame Learning), a novel attention-based mechanism for selectively fusing informative frames, and EchoZoom, a multi-scale feature extraction strategy that refines spatial representations of cardiac structures. As a novel adaptation of CLIP models for few-shot echocardiogram video analysis, our approach significantly improves diagnostic accuracy, reducing MAE by 2.07 on the EchoNet-Dynamic dataset under the 1-shot setting. The code is available at https://github.com/xmed-lab/CardiacCLIP.
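To illustrate the attention-based frame aggregation idea behind MFL, here is a minimal sketch of fusing per-frame embeddings into a single clip-level embedding via a learned query. This is not the paper's implementation; the function name, the dot-product scoring, and the dimensions (16 frames, 512-d CLIP embeddings) are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_frame_pooling(frame_embeddings, query):
    """Fuse per-frame embeddings into one clip-level embedding.

    frame_embeddings: (T, D) array, one image embedding per video frame.
    query: (D,) vector (learned in practice; random here).
    Returns a (D,) video embedding as an attention-weighted average,
    so informative frames can receive larger weights.
    """
    d = frame_embeddings.shape[1]
    scores = frame_embeddings @ query / np.sqrt(d)  # (T,)
    weights = softmax(scores)                       # (T,), sums to 1
    return weights @ frame_embeddings               # (D,)

rng = np.random.default_rng(0)
T, D = 16, 512                       # 16 frames, 512-d embeddings (assumed)
frames = rng.normal(size=(T, D))
query = rng.normal(size=D)
video_emb = attention_frame_pooling(frames, query)
print(video_emb.shape)  # (512,)
```

In a trained model the query (or a small scoring network) would be learned so that diagnostically informative frames, e.g. end-systole and end-diastole, dominate the average.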
Related papers
- Anatomically Constrained Transformers for Echocardiogram Analysis [38.280536446335056]
ViACT represents a deforming anatomical structure as a point set and encodes both its spatial geometry and corresponding image patches into transformer tokens. During pre-training, ViACT follows a masked autoencoding strategy that masks and reconstructs only anatomical patches. ViACT generalizes to myocardium point tracking without requiring task-specific components.
arXiv Detail & Related papers (2025-11-02T22:52:30Z)
- A DyL-UNet Framework Based on Dynamic Learning for Temporally Consistent Echocardiographic Segmentation [0.328418927821443]
We propose DyL-UNet, a dynamic learning-based temporal consistency U-Net segmentation architecture. The framework constructs an Echo-Dynamics Graph (EDG) through dynamic learning to extract dynamic information from videos. Experiments on the CAMUS and EchoNet-Dynamic datasets demonstrate that DyL-UNet maintains segmentation accuracy comparable to existing methods.
arXiv Detail & Related papers (2025-09-23T14:17:01Z)
- Global and Local Contrastive Learning for Joint Representations from Cardiac MRI and ECG [40.407824759778784]
PTACL (Patient and Temporal Alignment Contrastive Learning) is a multimodal contrastive learning framework that enhances ECG representations by integrating temporal information from CMR. We evaluate PTACL on paired ECG-CMR data from 27,951 subjects in the UK Biobank. Our results highlight the potential of PTACL to enhance non-invasive cardiac diagnostics using ECG.
arXiv Detail & Related papers (2025-06-24T17:19:39Z)
- Encoding of Demographic and Anatomical Information in Chest X-Ray-based Severe Left Ventricular Hypertrophy Classifiers [36.052936348670634]
We introduce a direct classification framework that predicts severe left ventricular hypertrophy from chest X-rays. Our approach achieves high AUROC and AUPRC, and employs Mutual Information Neural Estimation to quantify feature expressivity.
arXiv Detail & Related papers (2025-05-31T13:30:04Z)
- Multi-Scale Feature Fusion with Image-Driven Spatial Integration for Left Atrium Segmentation from Cardiac MRI Images [0.0]
We propose a framework that integrates DINOv2 as an encoder with a UNet-style decoder, incorporating multi-scale feature fusion and input image integration. We validate our approach on the LAScarQS 2022 dataset and demonstrate improved performance, with a 92.3% Dice and an 84.1% IoU score for the giant architecture.
arXiv Detail & Related papers (2025-02-10T16:12:46Z)
- Refining Myocardial Infarction Detection: A Novel Multi-Modal Composite Kernel Strategy in One-Class Classification [14.469786240272365]
Early detection of myocardial infarction (MI) is vital to prevent further damage.
This study introduces a novel method for early MI detection using a one-class classification (OCC) algorithm in echocardiography.
Our proposed multi-view approach achieves a geometric mean of 71.24%, signifying a substantial advancement in echocardiography-based MI diagnosis.
arXiv Detail & Related papers (2024-02-09T16:41:50Z)
- MEDPSeg: Hierarchical polymorphic multitask learning for the segmentation of ground-glass opacities, consolidation, and pulmonary structures on computed tomography [37.119000111386924]
MEDPSeg learns from heterogeneous chest CT targets through hierarchical polymorphic multitask learning (HPML).
We show PML enabling new state-of-the-art performance for GGO and consolidation segmentation tasks.
In addition, MEDPSeg simultaneously performs segmentation of the lung parenchyma, airways, pulmonary artery, and lung lesions, all in a single forward prediction.
arXiv Detail & Related papers (2023-12-04T21:46:39Z)
- Semantic-aware Temporal Channel-wise Attention for Cardiac Function Assessment [69.02116920364311]
Existing video-based methods do not pay much attention to the left ventricular region, nor the left ventricular changes caused by motion.
We propose a semi-supervised auxiliary learning paradigm with a left ventricular segmentation task, which contributes to the representation learning for the left ventricular region.
Our approach achieves state-of-the-art performance on the Stanford dataset with an improvement of 0.22 MAE, 0.26 RMSE, and 1.9% $R^2$.
arXiv Detail & Related papers (2023-10-09T05:57:01Z)
- SimLVSeg: Simplifying Left Ventricular Segmentation in 2D+Time Echocardiograms with Self- and Weakly-Supervised Learning [0.8672882547905405]
We develop SimLVSeg, a video-based network for consistent left ventricular (LV) segmentation from sparsely annotated echocardiogram videos.
SimLVSeg consists of self-supervised pre-training with temporal masking, followed by weakly supervised learning tailored for LV segmentation from sparse annotations.
We demonstrate how SimLVSeg outperforms the state-of-the-art solutions by achieving a 93.32% dice score on the largest 2D+time echocardiography dataset.
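The Dice score cited above is the standard overlap metric for segmentation. As a minimal illustration (not code from the paper), it can be computed over binary masks as:

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice overlap between two binary masks: 2|A∩B| / (|A| + |B|).

    Returns 1.0 for perfect overlap, 0.0 for disjoint masks;
    eps guards against division by zero on two empty masks.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# Toy example: a 2x2 predicted region vs. a 2x3 ground-truth region.
a = np.zeros((4, 4), dtype=int); a[1:3, 1:3] = 1
b = np.zeros((4, 4), dtype=int); b[1:3, 1:4] = 1
print(round(dice_score(a, b), 3))  # 0.8
```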
arXiv Detail & Related papers (2023-09-30T18:13:41Z)
- M(otion)-mode Based Prediction of Ejection Fraction using Echocardiograms [13.112371567924802]
We propose using the M(otion)-mode of echocardiograms for estimating the left ventricular ejection fraction (EF) and classifying cardiomyopathy.
We generate multiple artificial M-mode images from a single echocardiogram and combine them using off-the-shelf model architectures.
Our experiments show that the supervised setting converges with only ten modes and is comparable to the baseline method.
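An artificial M-mode image of the kind described above can be sketched as sampling a fixed scan line from every frame and stacking the samples over time. This is an illustrative reconstruction, not the authors' code; the function name, endpoint parameters, and nearest-neighbour sampling are assumptions.

```python
import numpy as np

def artificial_m_mode(video, p0, p1, n_samples=64):
    """Build an artificial M-mode image from a grayscale echo clip.

    video: (T, H, W) array, one 2D frame per time step.
    p0, p1: (row, col) endpoints of the scan line (a hypothetical choice,
            e.g. a line through the left ventricle).
    Returns an (n_samples, T) image: space along rows, time along columns,
    using nearest-neighbour sampling along the line.
    """
    T = video.shape[0]
    rows = np.linspace(p0[0], p1[0], n_samples).round().astype(int)
    cols = np.linspace(p0[1], p1[1], n_samples).round().astype(int)
    return np.stack([video[t, rows, cols] for t in range(T)], axis=1)

# Toy clip: 8 frames of 32x32 noise, scan line across the diagonal.
rng = np.random.default_rng(0)
clip = rng.normal(size=(8, 32, 32))
m_mode = artificial_m_mode(clip, (0, 0), (31, 31), n_samples=16)
print(m_mode.shape)  # (16, 8)
```

Varying the line's endpoints yields the multiple M-mode views that the paper combines with off-the-shelf architectures.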
arXiv Detail & Related papers (2023-09-07T15:00:58Z)
- Preservation of High Frequency Content for Deep Learning-Based Medical Image Classification [74.84221280249876]
An efficient analysis of large amounts of chest radiographs can aid physicians and radiologists.
We propose a novel Discrete Wavelet Transform (DWT)-based method for the efficient identification and encoding of visual information.
arXiv Detail & Related papers (2022-05-08T15:29:54Z)
- Heart Sound Segmentation using Bidirectional LSTMs with Attention [37.62160903348547]
We propose a novel framework for the segmentation of phonocardiogram (PCG) signals into heart states.
We exploit recent advancements in attention based learning to segment the PCG signal.
The proposed method attains state-of-the-art performance on multiple benchmarks including both human and animal heart recordings.
arXiv Detail & Related papers (2020-04-02T02:09:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.