Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation
- URL: http://arxiv.org/abs/2407.21490v1
- Date: Wed, 31 Jul 2024 09:59:20 GMT
- Title: Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation
- Authors: Junxuan Yu, Rusi Chen, Yongsong Zhou, Yanlin Chen, Yaofei Duan, Yuhao Huang, Han Zhou, Tan Tao, Xin Yang, Dong Ni
- Abstract summary: We propose an explainable and controllable method for echocardiography video generation.
First, we extract motion information from each heart substructure to construct motion curves.
Second, we propose the structure-to-motion alignment module, which can map semantic features onto motion curves.
Third, a position-aware attention mechanism is designed to enhance video consistency using Gaussian masks that encode structural position information.
- Score: 11.879436948659691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Echocardiography video is a primary modality for diagnosing heart diseases, but limited data poses challenges for both clinical teaching and machine learning training. Recently, video generative models have emerged as a promising strategy to alleviate this issue. However, previous methods often relied on holistic conditions during generation, hindering flexible movement control over specific cardiac structures. In this context, we propose an explainable and controllable method for echocardiography video generation, taking an initial frame and a motion curve as guidance. Our contributions are three-fold. First, we extract motion information from each heart substructure to construct motion curves, enabling the diffusion model to synthesize customized echocardiography videos by modifying these curves. Second, we propose a structure-to-motion alignment module, which can map semantic features onto motion curves across cardiac structures. Third, a position-aware attention mechanism is designed to enhance video consistency using Gaussian masks with structural position information. Extensive experiments on three echocardiography datasets show that our method outperforms others in fidelity and consistency. The full code will be released at https://github.com/mlmi-2024-72/ECM.
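As a rough illustration of the abstract's two conditioning signals, the hypothetical Python sketch below reduces a per-substructure mask sequence to a 1-D motion curve and builds a Gaussian positional mask used as an attention bias. The function names, shapes, and the log-additive bias are illustrative assumptions, not the authors' released implementation (see the linked repository for the actual code).

```python
# Hypothetical sketch of the paper's two conditioning signals; names and
# shapes are assumptions, not the authors' released code.
import numpy as np

def motion_curve(masks: np.ndarray) -> np.ndarray:
    """Reduce a (T, H, W) binary mask sequence for one heart substructure
    to a 1-D motion curve: here, the normalized per-frame area."""
    areas = masks.reshape(masks.shape[0], -1).sum(axis=1).astype(np.float64)
    return areas / (areas.max() + 1e-8)  # shape (T,); edit it to steer synthesis

def gaussian_position_mask(h: int, w: int, center: tuple, sigma: float) -> np.ndarray:
    """Gaussian mask encoding a substructure's spatial position."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = center
    return np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2.0 * sigma ** 2))

def position_aware_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray,
                             mask: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention over flattened spatial tokens with a
    log-Gaussian bias toward the structure's location (one plausible
    reading of 'position-aware attention'; the exact form may differ)."""
    logits = q @ k.T / np.sqrt(q.shape[-1])    # (HW, HW) similarity scores
    logits = logits + np.log(mask.reshape(-1) + 1e-8)  # bias key positions
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return (weights / weights.sum(axis=-1, keepdims=True)) @ v
```

Under this reading, rescaling or reshaping the curve returned by motion_curve would be the user-facing knob for customizing the synthesized cardiac motion.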
Related papers
- Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding [9.263168872795843]
GPTrack is a novel unsupervised framework crafted to explore the temporal and spatial dynamics of cardiac motion.
It enhances motion tracking by employing a sequential Gaussian Process in the latent space and encoding spatial statistics at each time stamp.
Our GPTrack significantly improves the precision of motion tracking in both 3D and 4D medical images while maintaining computational efficiency.
arXiv Detail & Related papers (2024-10-28T05:33:48Z)
- Sequence-aware Pre-training for Echocardiography Probe Guidance [66.35766658717205]
Cardiac ultrasound faces two major challenges: (1) the inherently complex structure of the heart, and (2) significant individual variations.
Previous works have only learned the population-averaged 2D and 3D structures of the heart rather than personalized cardiac structural features.
We propose a sequence-aware self-supervised pre-training method to learn personalized 2D and 3D cardiac structural features.
arXiv Detail & Related papers (2024-08-27T12:55:54Z)
- CardioSpectrum: Comprehensive Myocardium Motion Analysis with 3D Deep Learning and Geometric Insights [6.415915756409993]
Conventional neural networks have difficulty predicting subtle tangential movements.
We present a comprehensive approach to address this problem.
Our 3D deep learning architecture, based on the ARFlow model, is optimized to handle complex 3D motion analysis tasks.
arXiv Detail & Related papers (2024-07-04T09:57:44Z)
- HeartBeat: Towards Controllable Echocardiography Video Synthesis with Multimodal Conditions-Guided Diffusion Models [14.280181445804226]
We propose a novel framework named HeartBeat towards controllable and high-fidelity ECHO video synthesis.
HeartBeat serves as a unified framework that enables perceiving multimodal conditions simultaneously to guide controllable generation.
In this way, users can synthesize ECHO videos that conform to their mental imagery by combining multimodal control signals.
arXiv Detail & Related papers (2024-06-20T08:24:28Z)
- Echocardiography video synthesis from end diastolic semantic map via diffusion model [0.0]
This paper aims to tackle the challenges by expanding upon existing video diffusion models for the purpose of cardiac video synthesis.
Our focus lies in generating video from the semantic map of the initial frame of the cardiac cycle, commonly referred to as end diastole.
Our model exhibits better performance than the standard diffusion technique on multiple metrics, including FID, FVD, and SSIM.
arXiv Detail & Related papers (2023-10-11T02:08:05Z)
- Semantic-aware Temporal Channel-wise Attention for Cardiac Function Assessment [69.02116920364311]
Existing video-based methods pay little attention to the left ventricular region or to the left ventricular changes caused by motion.
We propose a semi-supervised auxiliary learning paradigm with a left ventricular segmentation task, which contributes to the representation learning for the left ventricular region.
Our approach achieves state-of-the-art performance on the Stanford dataset with an improvement of 0.22 MAE, 0.26 RMSE, and 1.9% $R^2$.
arXiv Detail & Related papers (2023-10-09T05:57:01Z)
- Continuous 3D Myocardial Motion Tracking via Echocardiography [30.19879953016694]
Myocardial motion tracking is an essential clinical tool in the prevention and detection of cardiovascular diseases.
Current techniques suffer from incomplete and inaccurate motion estimation of the myocardium in both spatial and temporal dimensions.
This paper introduces the Neural Cardiac Motion Field (NeuralCMF) to model the 3D structure and the comprehensive 6D forward/backward motion of the heart.
arXiv Detail & Related papers (2023-10-04T13:11:20Z)
- MoCaNet: Motion Retargeting in-the-wild via Canonicalization Networks [77.56526918859345]
We present a novel framework that brings the 3D motion retargeting task from controlled environments to in-the-wild scenarios.
It can retarget body motion from a character in a 2D monocular video to a 3D character without using any motion capture system or 3D reconstruction procedure.
arXiv Detail & Related papers (2021-12-19T07:52:05Z)
- CS2-Net: Deep Learning Segmentation of Curvilinear Structures in Medical Imaging [90.78899127463445]
We propose a generic and unified convolutional neural network for the segmentation of curvilinear structures.
We introduce a new curvilinear structure segmentation network (CS2-Net), which includes a self-attention mechanism in the encoder and decoder.
arXiv Detail & Related papers (2020-10-15T03:06:37Z)
- Learning Motion Flows for Semi-supervised Instrument Segmentation from Robotic Surgical Video [64.44583693846751]
We study the semi-supervised instrument segmentation from robotic surgical videos with sparse annotations.
By exploiting generated data pairs, our framework can recover and even enhance temporal consistency of training sequences.
Results show that our method outperforms state-of-the-art semi-supervised methods by a large margin.
arXiv Detail & Related papers (2020-07-06T02:39:32Z)
- Motion Pyramid Networks for Accurate and Efficient Cardiac Motion Estimation [51.72616167073565]
We propose Motion Pyramid Networks, a novel deep learning-based approach for accurate and efficient cardiac motion estimation.
We predict and fuse a pyramid of motion fields from multiple scales of feature representations to generate a more refined motion field.
We then use a novel cyclic teacher-student training strategy to make the inference end-to-end and further improve the tracking performance.
arXiv Detail & Related papers (2020-06-28T21:03:19Z)
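As a minimal sketch of the coarse-to-fine idea named in the Motion Pyramid Networks summary directly above, the following assumes per-scale motion fields have already been predicted; the learned per-scale decoders and the cyclic teacher-student training are elided, and all names are hypothetical.

```python
# Hypothetical coarse-to-fine fusion of a pyramid of (2, H, W) motion fields;
# only the fusion step is sketched, not the networks that predict each field.
import numpy as np

def upsample_flow(flow: np.ndarray) -> np.ndarray:
    """Nearest-neighbour 2x upsampling of a (2, H, W) field; displacements
    are doubled so their magnitudes match the finer resolution."""
    return 2.0 * flow.repeat(2, axis=1).repeat(2, axis=2)

def fuse_pyramid(flows: list) -> np.ndarray:
    """Fuse per-scale motion fields, coarsest first, treating each finer
    level as a residual on the upsampled coarser estimate."""
    fused = flows[0]
    for residual in flows[1:]:
        fused = upsample_flow(fused) + residual
    return fused
```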
This list is automatically generated from the titles and abstracts of the papers on this site.