Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation
- URL: http://arxiv.org/abs/2405.12223v1
- Date: Sat, 6 Apr 2024 03:02:47 GMT
- Title: Cascaded Multi-path Shortcut Diffusion Model for Medical Image Translation
- Authors: Yinchi Zhou, Tianqi Chen, Jun Hou, Huidong Xie, Nicha C. Dvornek, S. Kevin Zhou, David L. Wilson, James S. Duncan, Chi Liu, Bo Zhou
- Abstract summary: We propose a Cascade Multi-path Shortcut Diffusion Model (CMDM) for high-quality medical image translation and uncertainty estimation.
Our experiments show that CMDM can produce high-quality translations comparable to state-of-the-art methods.
- Score: 26.67518950976257
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Image-to-image translation is a vital component in medical image processing, with many uses in a wide range of imaging modalities and clinical scenarios. Previous methods include Generative Adversarial Networks (GANs) and Diffusion Models (DMs), which offer realism but suffer from instability and lack uncertainty estimation. Even though both GAN and DM methods have individually exhibited their capability in medical image translation tasks, the potential of combining a GAN and DM to further improve translation performance and to enable uncertainty estimation remains largely unexplored. In this work, we address these challenges by proposing a Cascade Multi-path Shortcut Diffusion Model (CMDM) for high-quality medical image translation and uncertainty estimation. To reduce the required number of iterations and ensure robust performance, our method first obtains a conditional GAN-generated prior image that will be used for the efficient reverse translation with a DM in the subsequent step. Additionally, a multi-path shortcut diffusion strategy is employed to refine translation results and estimate uncertainty. A cascaded pipeline further enhances translation quality, incorporating residual averaging between cascades. We collected three different medical image datasets with two sub-tasks for each dataset to test the generalizability of our approach. Our experiments show that CMDM can produce high-quality translations comparable to state-of-the-art methods while providing reasonable uncertainty estimations that correlate well with the translation error.
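
The pipeline described in the abstract (GAN prior, shortcut reverse diffusion, multiple paths, cascades) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. Everything below is an assumption made for illustration: `toy_predict_noise` is a hypothetical placeholder for a trained conditional denoiser, the `prior` array stands in for a conditional GAN output, the DDPM-style update and the schedule values are generic choices, and "residual averaging" is read here as averaging each cascade's output with its input prior, which the abstract does not specify.

```python
import numpy as np


def toy_predict_noise(x_t, alpha_bar_t, condition):
    """Hypothetical placeholder for a trained conditional noise predictor.

    It guesses the clean image as a blend of the condition and the rescaled
    noisy input, then returns the noise implied by that guess.
    """
    x0_guess = 0.5 * condition + 0.5 * x_t / np.sqrt(alpha_bar_t)
    return (x_t - np.sqrt(alpha_bar_t) * x0_guess) / np.sqrt(1.0 - alpha_bar_t)


def shortcut_reverse(prior, betas, t_start, rng):
    """Noise the prior to step t_start, then run DDPM-style reverse steps to 0.

    Starting from an intermediate step instead of pure noise is the
    "shortcut": only t_start + 1 reverse steps are needed.
    """
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    noise = rng.standard_normal(prior.shape)
    x = np.sqrt(alpha_bars[t_start]) * prior + np.sqrt(1.0 - alpha_bars[t_start]) * noise
    for t in range(t_start, -1, -1):
        eps_hat = toy_predict_noise(x, alpha_bars[t], prior)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:  # no noise is added on the final step
            x = x + np.sqrt(betas[t]) * rng.standard_normal(x.shape)
    return x


def multi_path_translate(prior, betas, t_start, n_paths, rng):
    """Run several independent shortcut paths; return pixel-wise mean and std."""
    paths = np.stack([shortcut_reverse(prior, betas, t_start, rng) for _ in range(n_paths)])
    return paths.mean(axis=0), paths.std(axis=0)


def cascaded_translate(prior, betas, t_start, n_paths, n_cascades, seed=0):
    """Repeat the multi-path step, feeding an averaged result to the next cascade."""
    rng = np.random.default_rng(seed)
    uncertainty = np.zeros_like(prior)
    for _ in range(n_cascades):
        mean, uncertainty = multi_path_translate(prior, betas, t_start, n_paths, rng)
        prior = 0.5 * (prior + mean)  # assumed reading of "residual averaging"
    return prior, uncertainty


# Toy usage: an 8x8 random array stands in for the GAN-generated prior image.
betas = np.linspace(1e-4, 0.02, 50)
gan_prior = np.random.default_rng(1).random((8, 8))
translation, uncertainty = cascaded_translate(gan_prior, betas, t_start=10, n_paths=4, n_cascades=2)
print(translation.shape, uncertainty.shape)
```

In this sketch the pixel-wise standard deviation across paths plays the role of the uncertainty map; with a real denoiser, the number of paths and the shortcut step `t_start` would trade off runtime against the stability of the estimate.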
Related papers
- Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model [101.65105730838346]
We introduce Transfusion, a recipe for training a multi-modal model over discrete and continuous data.
We pretrain multiple Transfusion models up to 7B parameters from scratch on a mixture of text and image data.
Our experiments show that Transfusion scales significantly better than quantizing images and training a language model over discrete image tokens.
arXiv Detail & Related papers (2024-08-20T17:48:20Z)
- Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z)
- Uncertainty Estimation in Contrast-Enhanced MR Image Translation with Multi-Axis Fusion [6.727287631338148]
We propose a novel model uncertainty quantification method, Multi-Axis Fusion (MAF).
The proposed approach is applied to the task of synthesizing contrast enhanced T1-weighted images based on native T1, T2 and T2-FLAIR scans.
arXiv Detail & Related papers (2023-11-20T20:09:48Z)
- C^2M-DoT: Cross-modal consistent multi-view medical report generation with domain transfer network [67.97926983664676]
We propose a cross-modal consistent multi-view medical report generation method with a domain transfer network (C2M-DoT).
C2M-DoT substantially outperforms state-of-the-art baselines in all metrics.
arXiv Detail & Related papers (2023-10-09T02:31:36Z)
- Zero-shot Medical Image Translation via Frequency-Guided Diffusion Models [9.15810015583615]
We propose a frequency-guided diffusion model (FGDM) that employs frequency-domain filters to guide the diffusion model for structure-preserving image translation.
Based on its design, FGDM allows zero-shot learning, as it can be trained solely on the data from the target domain, and used directly for source-to-target domain translation.
FGDM outperformed the state-of-the-art methods (GAN-based, VAE-based, and diffusion-based) in metrics of Frechet Inception Distance (FID), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity
arXiv Detail & Related papers (2023-04-05T20:47:40Z)
- Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation [72.6667341525552]
We present a new MMT approach based on a strong text-only MT model, which uses neural adapters and a novel guided self-attention mechanism.
We also introduce CoMMuTE, a Contrastive Multimodal Translation Evaluation set of ambiguous sentences and their possible translations.
Our approach obtains competitive results compared to strong text-only models on standard English-to-French, English-to-German and English-to-Czech benchmarks.
arXiv Detail & Related papers (2022-12-20T10:18:18Z)
- Unsupervised Medical Image Translation with Adversarial Diffusion Models [0.2770822269241974]
Imputation of missing images via source-to-target modality translation can improve diversity in medical imaging protocols.
Here, we propose a novel method based on adversarial diffusion modeling, SynDiff, for improved performance in medical image translation.
arXiv Detail & Related papers (2022-07-17T15:53:24Z)
- Harmonizing Pathological and Normal Pixels for Pseudo-healthy Synthesis [68.5287824124996]
We present a new type of discriminator, the segmentor, to accurately locate the lesions and improve the visual quality of pseudo-healthy images.
We apply the generated images to medical image enhancement and use the enhanced results to address the low-contrast problem.
Comprehensive experiments on the T2 modality of BraTS demonstrate that the proposed method substantially outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T08:41:17Z)
- Modelling Latent Translations for Cross-Lingual Transfer [47.61502999819699]
We propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model.
We evaluate our novel latent translation-based model on a series of multilingual NLU tasks.
We report gains for both zero-shot and few-shot learning setups, up to 2.7 accuracy points on average.
arXiv Detail & Related papers (2021-07-23T17:11:27Z)
- Uncertainty-Guided Progressive GANs for Medical Image Translation [37.95176881950121]
Image-to-image translation plays a vital role in tackling various medical imaging tasks.
We propose an uncertainty-guided progressive learning scheme for image-to-image translation.
We demonstrate the efficacy of our model on three challenging medical image translation tasks.
arXiv Detail & Related papers (2021-06-29T16:26:12Z)
- Flow-based Deformation Guidance for Unpaired Multi-Contrast MRI Image-to-Image Translation [7.8333615755210175]
In this paper, we introduce a novel approach to unpaired image-to-image translation based on the invertible architecture.
We utilize the temporal information between consecutive slices to provide more constraints to the optimization for transforming one domain to another in unpaired medical images.
arXiv Detail & Related papers (2020-12-03T09:10:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.