Echocardiography video synthesis from end diastolic semantic map via
diffusion model
- URL: http://arxiv.org/abs/2310.07131v1
- Date: Wed, 11 Oct 2023 02:08:05 GMT
- Title: Echocardiography video synthesis from end diastolic semantic map via
diffusion model
- Authors: Phi Nguyen Van, Duc Tran Minh, Hieu Pham Huy, Long Tran Quoc
- Abstract summary: This paper tackles the challenges of limited dataset scale and annotation by extending existing video diffusion models to cardiac video synthesis.
Our focus lies in generating video using semantic maps of the initial frame during the cardiac cycle, commonly referred to as end diastole.
Our model outperforms the standard diffusion technique on multiple metrics, including FID, FVD, and SSIM.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Denoising Diffusion Probabilistic Models (DDPMs) have demonstrated
significant achievements in various image and video generation tasks, including
the domain of medical imaging. However, generating echocardiography videos
based on semantic anatomical information remains an unexplored area of
research. This is mostly due to the constraints imposed by the currently
available datasets, which lack sufficient scale and comprehensive frame-wise
annotations for every cardiac cycle. This paper aims to tackle the
aforementioned challenges by expanding upon existing video diffusion models for
the purpose of cardiac video synthesis. More specifically, our focus lies in
generating video using semantic maps of the initial frame during the cardiac
cycle, commonly referred to as end diastole. To further improve the synthesis
process, we integrate spatial adaptive normalization into multiscale feature
maps. This enables the inclusion of semantic guidance during synthesis,
resulting in enhanced realism and coherence of the resultant video sequences.
Experiments are conducted on the CAMUS dataset, a widely used benchmark in
echocardiography. Our model outperforms the standard diffusion baseline on
multiple metrics, including FID, FVD, and SSIM.
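The abstract's key architectural idea is injecting the end-diastolic semantic map through spatially adaptive normalization applied to multiscale feature maps of the denoising network. The paper's exact layer design is not given here, so the following is only a minimal PyTorch sketch of that style of conditioning; the class name `SpatialAdaptiveNorm`, the group count, and all tensor shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAdaptiveNorm(nn.Module):
    """SPADE-style layer (hypothetical sketch): per-pixel scale and shift
    predicted from a semantic map and applied to normalized features."""

    def __init__(self, n_channels: int, semantic_channels: int, hidden: int = 128):
        super().__init__()
        # Parameter-free normalization; assumes n_channels is divisible by 8.
        self.norm = nn.GroupNorm(8, n_channels, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(semantic_channels, hidden, kernel_size=3, padding=1),
            nn.SiLU(),
        )
        self.gamma = nn.Conv2d(hidden, n_channels, kernel_size=3, padding=1)
        self.beta = nn.Conv2d(hidden, n_channels, kernel_size=3, padding=1)

    def forward(self, x: torch.Tensor, semantic_map: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map at one scale of the denoising network.
        # semantic_map: (B, S, H0, W0) end-diastolic segmentation, resized per scale.
        seg = F.interpolate(semantic_map, size=x.shape[-2:], mode="nearest")
        h = self.shared(seg)
        return self.norm(x) * (1 + self.gamma(h)) + self.beta(h)

# Usage sketch: fold the time axis into the batch so the same 2D layer
# conditions every frame's features on the single end-diastolic map.
B, T, C, H, W = 2, 8, 64, 32, 32
feats = torch.randn(B, T, C, H, W)                   # features at one of several scales
ed_map = torch.rand(B, 4, 112, 112)                  # 4 hypothetical cardiac structure channels
spade = SpatialAdaptiveNorm(n_channels=C, semantic_channels=4)
x = feats.reshape(B * T, C, H, W)
seg = ed_map.unsqueeze(1).expand(B, T, 4, 112, 112).reshape(B * T, 4, 112, 112)
out = spade(x, seg).reshape(B, T, C, H, W)
```

Applying the same semantic map at several feature-map resolutions is what lets the guidance reach both coarse anatomy and fine boundary detail; how the authors combine this with the temporal layers of their video diffusion model is not specified in the abstract.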
Related papers
- EchoFM: Foundation Model for Generalizable Echocardiogram Analysis [22.585990526913246]
We introduce EchoFM, a foundation model specifically designed to represent and analyze echocardiography videos.
In EchoFM, we propose a self-supervised learning framework that captures both spatial and temporal variability.
We pre-train our model on an extensive dataset comprising over 290,000 echocardiography videos, containing up to 20 million image frames.
arXiv Detail & Related papers (2024-10-30T19:32:02Z)
- Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation [11.879436948659691]
We propose an explainable and controllable method for echocardiography video generation.
First, we extract motion information from each heart substructure to construct motion curves.
Second, we propose the structure-to-motion alignment module, which can map semantic features onto motion curves.
Third, a position-aware attention mechanism is designed to enhance video consistency using Gaussian masks that encode structural position information (an illustrative sketch of this idea follows the related-papers list).
arXiv Detail & Related papers (2024-07-31T09:59:20Z)
- Unlocking the Power of Spatial and Temporal Information in Medical Multimodal Pre-training [99.2891802841936]
We introduce the Med-ST framework for fine-grained spatial and temporal modeling.
For spatial modeling, Med-ST employs the Mixture of View Expert (MoVE) architecture to integrate different visual features from both frontal and lateral views.
For temporal modeling, we propose a novel cross-modal bidirectional cycle consistency objective via forward mapping classification (FMC) and reverse mapping regression (RMR).
arXiv Detail & Related papers (2024-05-30T03:15:09Z)
- Vivim: a Video Vision Mamba for Medical Video Segmentation [52.11785024350253]
This paper presents a Video Vision Mamba-based framework, dubbed Vivim, for medical video segmentation tasks.
Our Vivim can effectively compress the long-term representation into sequences at varying scales.
Experiments on thyroid segmentation, breast lesion segmentation in ultrasound videos, and polyp segmentation in colonoscopy videos demonstrate the effectiveness and efficiency of our Vivim.
arXiv Detail & Related papers (2024-01-25T13:27:03Z)
- Diffusion Priors for Dynamic View Synthesis from Monocular Videos [59.42406064983643]
Dynamic novel view synthesis aims to capture the temporal evolution of visual content within videos.
We first finetune a pretrained RGB-D diffusion model on the video frames using a customization technique.
We distill the knowledge from the finetuned model into a 4D representation encompassing both dynamic and static Neural Radiance Fields.
arXiv Detail & Related papers (2024-01-10T23:26:41Z)
- Echocardiography Segmentation Using Neural ODE-based Diffeomorphic Registration Field [0.0]
We present a novel method for diffeomorphic image registration using neural ordinary differential equations (Neural ODE).
The proposed method, Echo-ODE, introduces several key improvements compared to the previous state-of-the-art.
The results show that our method surpasses the previous state-of-the-art in multiple aspects.
arXiv Detail & Related papers (2023-06-16T08:37:27Z)
- Motion-Conditioned Diffusion Model for Controllable Video Synthesis [75.367816656045]
We introduce MCDiff, a conditional diffusion model that generates a video from a starting image frame and a set of strokes.
We show that MCDiff achieves state-of-the-art visual quality in stroke-guided controllable video synthesis.
arXiv Detail & Related papers (2023-04-27T17:59:32Z)
- Feature-Conditioned Cascaded Video Diffusion Models for Precise Echocardiogram Synthesis [5.102090025931326]
We extend elucidated diffusion models for video modelling to generate plausible video sequences from single images.
Our image-to-sequence approach achieves an $R^2$ score of 93%, 38 points higher than recently proposed sequence-to-sequence generation methods.
arXiv Detail & Related papers (2023-03-22T15:26:22Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
- On the effectiveness of GAN generated cardiac MRIs for segmentation [12.59275199633534]
The proposed model pairs a Variational Autoencoder (VAE), trained to learn latent representations of cardiac shapes, with a GAN that uses "SPatially-Adaptive (DE)Normalization" (SPADE) modules to generate realistic MR images tailored to a given anatomical map.
We show that segmentation with CNNs trained with our synthetic annotated images gets competitive results compared to traditional techniques.
arXiv Detail & Related papers (2020-05-18T18:48:38Z)
- Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape.
The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)
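The "position-aware attention" entry above (Explainable and Controllable Motion Curve Guided Cardiac Ultrasound Video Generation) mentions Gaussian masks built from structural position information, but its summary does not spell out the mechanism. The sketch below shows one generic way such a mask can steer attention: a log-Gaussian term, peaked at a structure's location, added to the attention logits. Everything here (the function name, `sigma`, the single-landmark setup) is a hypothetical illustration, not that paper's actual formulation.

```python
import torch

def gaussian_position_bias(height: int, width: int,
                           center_yx: torch.Tensor, sigma: float = 8.0) -> torch.Tensor:
    """Return an (H*W,) log-Gaussian bias peaked at a structure's position."""
    ys = torch.arange(height, dtype=torch.float32)
    xs = torch.arange(width, dtype=torch.float32)
    yy, xx = torch.meshgrid(ys, xs, indexing="ij")
    d2 = (yy - center_yx[0]) ** 2 + (xx - center_yx[1]) ** 2
    return (-d2 / (2.0 * sigma ** 2)).flatten()    # log of an unnormalized Gaussian

# Scaled dot-product attention over H*W spatial tokens, biased toward the landmark.
H, W, d = 16, 16, 32
q, k, v = (torch.randn(H * W, d) for _ in range(3))
bias = gaussian_position_bias(H, W, center_yx=torch.tensor([8.0, 8.0]))
logits = q @ k.t() / d ** 0.5 + bias               # bias broadcasts over the key axis
attn = torch.softmax(logits, dim=-1)
out = attn @ v                                     # queries now favor keys near the landmark
```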
This list is automatically generated from the titles and abstracts of the papers on this site.