Label-free Motion-Conditioned Diffusion Model for Cardiac Ultrasound Synthesis
- URL: http://arxiv.org/abs/2512.09418v1
- Date: Wed, 10 Dec 2025 08:32:34 GMT
- Title: Label-free Motion-Conditioned Diffusion Model for Cardiac Ultrasound Synthesis
- Authors: Zhe Li, Hadrien Reynaud, Johanna P Müller, Bernhard Kainz
- Abstract summary: We propose the Motion Conditioned Diffusion Model (MCDM), a label-free latent diffusion framework that synthesises realistic echocardiography videos conditioned on self-supervised motion features. MCDM achieves competitive video generation performance, producing temporally coherent and clinically realistic sequences without reliance on manual labels.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Ultrasound echocardiography is essential for the non-invasive, real-time assessment of cardiac function, but the scarcity of labelled data, driven by privacy restrictions and the complexity of expert annotation, remains a major obstacle for deep learning methods. We propose the Motion Conditioned Diffusion Model (MCDM), a label-free latent diffusion framework that synthesises realistic echocardiography videos conditioned on self-supervised motion features. To extract these features, we design the Motion and Appearance Feature Extractor (MAFE), which disentangles motion and appearance representations from videos. Feature learning is further enhanced by two auxiliary objectives: a re-identification loss guided by pseudo appearance features and an optical flow loss guided by pseudo flow fields. Evaluated on the EchoNet-Dynamic dataset, MCDM achieves competitive video generation performance, producing temporally coherent and clinically realistic sequences without reliance on manual labels. These results demonstrate the potential of self-supervised conditioning for scalable echocardiography synthesis. Our code is available at https://github.com/ZheLi2020/LabelfreeMCDM.
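The abstract's pipeline (self-supervised motion and appearance features extracted from a video, then used to condition a latent diffusion model) can be illustrated with a toy sketch. This is not the authors' MAFE or MCDM implementation: here frame differences stand in for the learned motion branch, a temporal mean frame for the appearance branch, and a single forward-diffusion noising step for the latent diffusion process; all shapes and the beta schedule are illustrative assumptions.

```python
import numpy as np

def motion_features(video):
    """Self-supervised motion proxy: per-frame differences, shape (T-1, H, W).
    Stands in for the learned motion representation of the paper's MAFE."""
    return np.diff(video, axis=0)

def appearance_feature(video):
    """Appearance proxy: the temporal mean frame, shape (H, W).
    Stands in for the disentangled appearance representation."""
    return video.mean(axis=0)

def add_noise(x0, t, T=1000):
    """Forward diffusion q(x_t | x_0) under a linear beta schedule,
    the standard noising step a conditional denoiser learns to invert."""
    betas = np.linspace(1e-4, 0.02, T)
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = np.random.default_rng(0).standard_normal(x0.shape)
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

# Toy clip of 8 frames at 16x16; a real model would operate on encoded latents.
video = np.random.default_rng(1).random((8, 16, 16))
m = motion_features(video)      # label-free conditioning signal
a = appearance_feature(video)   # used by auxiliary objectives (e.g. re-ID loss)
x_t = add_noise(video, t=500)   # noisy input the conditional denoiser would see
```

In the paper, the denoiser receives the motion features as conditioning so that generated sequences inherit realistic cardiac dynamics without any manual labels; the auxiliary re-identification and optical-flow losses further shape the feature extractor.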
Related papers
- InfoMotion: A Graph-Based Approach to Video Dataset Distillation for Echocardiography [12.676788334083332]
We propose a novel approach for distilling a compact synthetic echocardiographic video dataset. We evaluate our approach on the EchoNet-Dynamic dataset and achieve a test accuracy of 69.38% using only 25 synthetic videos.
arXiv Detail & Related papers (2025-12-10T08:39:25Z) - Extreme Cardiac MRI Analysis under Respiratory Motion: Results of the CMRxMotion Challenge [56.28872161153236]
Deep learning models have achieved state-of-the-art performance in automated Cardiac Magnetic Resonance (CMR) analysis. The efficacy of these models is highly dependent on the availability of high-quality, artifact-free images. To promote research in this domain, we organized the MICCAI CMRxMotion challenge.
arXiv Detail & Related papers (2025-07-25T11:12:21Z) - Systole-Conditioned Generative Cardiac Motion [14.94166259218979]
We present a novel approach that synthesizes realistic-looking pairs of cardiac CT frames enriched with dense 3D flow field annotations. Our method leverages a conditional Variational Autoencoder (CVAE), which incorporates a novel multi-scale feature conditioning mechanism. Our data generation pipeline could enable the training and validation of more complex and accurate myocardium motion models.
arXiv Detail & Related papers (2025-07-20T14:44:40Z) - High-Fidelity Functional Ultrasound Reconstruction via A Visual Auto-Regressive Framework [58.07923338080814]
Functional ultrasound imaging provides exceptional spatiotemporal resolution for mapping. However, its practical application is hampered by critical challenges, including data scarcity, ethical considerations, and signal degradation.
arXiv Detail & Related papers (2025-05-23T15:27:17Z) - EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance [79.66329903007869]
We present EchoWorld, a motion-aware world modeling framework for probe guidance. It encodes anatomical knowledge and motion-induced visual dynamics, and is trained on more than one million ultrasound images from over 200 routine scans.
arXiv Detail & Related papers (2025-04-17T16:19:05Z) - EchoFlow: A Foundation Model for Cardiac Ultrasound Image and Video Generation [6.849095682774907]
We present EchoFlow, a novel framework designed to generate high-quality, privacy-preserving synthetic echocardiogram images and videos. We rigorously evaluate our synthetic datasets on the clinically relevant task of ejection fraction regression and demonstrate, for the first time, that downstream models trained exclusively on EchoFlow-generated synthetic datasets achieve performance parity with models trained on real datasets.
arXiv Detail & Related papers (2025-03-28T11:51:59Z) - LaMoD: Latent Motion Diffusion Model For Myocardial Strain Generation [5.377722774297911]
We introduce a novel Latent Motion Diffusion model (LaMoD) to predict highly accurate DENSE motion from standard CMR videos. Experimental results demonstrate that our proposed method, LaMoD, significantly improves the accuracy of motion analysis in standard CMR images.
arXiv Detail & Related papers (2024-07-02T12:54:32Z) - CathFlow: Self-Supervised Segmentation of Catheters in Interventional Ultrasound Using Optical Flow and Transformers [66.15847237150909]
We introduce a self-supervised deep learning architecture to segment catheters in longitudinal ultrasound images.
The network architecture builds upon AiAReSeg, a segmentation transformer built with the Attention in Attention mechanism.
We validated our model on a test dataset, consisting of unseen synthetic data and images collected from silicon aorta phantoms.
arXiv Detail & Related papers (2024-03-21T15:13:36Z) - Echocardiography video synthesis from end diastolic semantic map via diffusion model [0.0]
This paper aims to tackle the challenges by expanding upon existing video diffusion models for the purpose of cardiac video synthesis.
Our focus lies in generating video using semantic maps of the initial frame during the cardiac cycle, commonly referred to as end diastole.
Our model exhibits better performance compared to the standard diffusion technique in terms of multiple metrics, including FID, FVD, and SSIM.
arXiv Detail & Related papers (2023-10-11T02:08:05Z) - Semantic-aware Temporal Channel-wise Attention for Cardiac Function Assessment [69.02116920364311]
Existing video-based methods do not pay much attention to the left ventricular region, nor the left ventricular changes caused by motion.
We propose a semi-supervised auxiliary learning paradigm with a left ventricular segmentation task, which contributes to the representation learning for the left ventricular region.
Our approach achieves state-of-the-art performance on the Stanford dataset with an improvement of 0.22 MAE, 0.26 RMSE, and 1.9% $R^2$.
arXiv Detail & Related papers (2023-10-09T05:57:01Z) - Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z)