Related papers: EMIT-Diff: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model

EMIT-Diff: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model

URL: http://arxiv.org/abs/2310.12868v1
Date: Thu, 19 Oct 2023 16:18:02 GMT
Title: EMIT-Diff: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model
Authors: Zheyuan Zhang, Lanhong Yao, Bin Wang, Debesh Jha, Elif Keles, Alpay Medetalibeyoglu, and Ulas Bagci
Abstract summary: We develop controllable diffusion models for medical image synthesis, called EMIT-Diff. We leverage recent diffusion probabilistic models to generate realistic and diverse synthetic medical image data. In our approach, we ensure that the synthesized samples adhere to medically relevant constraints.
Score: 4.057796755073023
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large-scale, big-variant, and high-quality data are crucial for developing robust and successful deep-learning models for medical applications since they potentially enable better generalization performance and avoid overfitting. However, the scarcity of high-quality labeled data always presents significant challenges. This paper proposes a novel approach to address this challenge by developing controllable diffusion models for medical image synthesis, called EMIT-Diff. We leverage recent diffusion probabilistic models to generate realistic and diverse synthetic medical image data that preserve the essential characteristics of the original medical images by incorporating edge information of objects to guide the synthesis process. In our approach, we ensure that the synthesized samples adhere to medically relevant constraints and preserve the underlying structure of imaging data. Due to the random sampling process by the diffusion model, we can generate an arbitrary number of synthetic images with diverse appearances. To validate the effectiveness of our proposed method, we conduct an extensive set of medical image segmentation experiments on multiple datasets, including Ultrasound breast (+13.87%), CT spleen (+0.38%), and MRI prostate (+7.78%), achieving significant improvements over the baseline segmentation methods. For the first time, to our best knowledge, the promising results demonstrate the effectiveness of our EMIT-Diff for medical image segmentation tasks and show the feasibility of introducing a first-ever text-guided diffusion model for general medical image segmentation tasks. With carefully designed ablation experiments, we investigate the influence of various data augmentation ratios, hyper-parameter settings, patch size for generating random merging mask settings, and combined influence with different network architectures.

Related papers

Causal Disentanglement for Robust Long-tail Medical Image Generation [80.15257897500578]
We propose a novel medical image generation framework, which generates independent pathological and structural features. We leverage a diffusion model guided by pathological findings to model pathological features, enabling the generation of diverse counterfactual images.
arXiv Detail & Related papers (2025-04-20T01:54:18Z)
Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation. We evaluate our method on three public longitudinal benchmark datasets of brain MRI and chest X-rays for counterfactual image generation.
arXiv Detail & Related papers (2024-12-30T01:59:34Z)
MRGen: Segmentation Data Engine For Underrepresented MRI Modalities [59.61465292965639]
Training medical image segmentation models for rare yet clinically significant imaging modalities is challenging due to the scarcity of annotated data. This paper investigates leveraging generative models to synthesize training data, to train segmentation models for underrepresented modalities.
arXiv Detail & Related papers (2024-12-04T16:34:22Z)
Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning [3.4299097748670255]
Deep generative models have significantly advanced medical imaging analysis by enhancing dataset size and quality. We employ a generative structure with hybrid conditions, combining clinical data and segmentation masks to guide the image synthesis process. Our approach differs from and presents a more challenging task than traditional medical report-guided synthesis due to the less visual correlation of our clinical information with the images.
arXiv Detail & Related papers (2024-10-17T17:48:36Z)
Evaluating Utility of Memory Efficient Medical Image Generation: A Study on Lung Nodule Segmentation [0.0]
This work proposes a memory-efficient patch-wise denoising diffusion probabilistic model (DDPM) for generating synthetic medical images. Our approach generates high-utility synthetic images with nodule segmentation while efficiently managing memory constraints. We evaluate the method in two scenarios: training a segmentation model exclusively on synthetic data, and augmenting real-world training data with synthetic images.
arXiv Detail & Related papers (2024-10-16T13:20:57Z)
Discriminative Hamiltonian Variational Autoencoder for Accurate Tumor Segmentation in Data-Scarce Regimes [2.8498944632323755]
We propose an end-to-end hybrid architecture for medical image segmentation. We use Hamiltonian Variational Autoencoders (HVAE) and a discriminative regularization to improve the quality of generated images. Our architecture operates on a slice-by-slice basis to segment 3D volumes, capitilizing on the richly augmented dataset.
arXiv Detail & Related papers (2024-06-17T15:42:08Z)
3D MRI Synthesis with Slice-Based Latent Diffusion Models: Improving Tumor Segmentation Tasks in Data-Scarce Regimes [2.8498944632323755]
We propose a novel slice-based latent diffusion architecture to address the complexities of volumetric data generation. This approach extends the joint distribution modeling of medical images and their associated masks, allowing a simultaneous generation of both under data-scarce regimes. Our architecture can be conditioned by tumor characteristics, including size, shape, and relative position, thereby providing a diverse range of tumor variations.
arXiv Detail & Related papers (2024-06-08T09:53:45Z)
MediSyn: A Generalist Text-Guided Latent Diffusion Model For Diverse Medical Image Synthesis [4.541407789437896]
MediSyn is a text-guided latent diffusion model capable of generating synthetic images from 6 medical specialties and 10 image types. A direct comparison of the synthetic images against the real images confirms that our model synthesizes novel images and, crucially, may preserve patient privacy. Our findings highlight the immense potential for generalist image generative models to accelerate algorithmic research and development in medicine.
arXiv Detail & Related papers (2024-05-16T04:28:44Z)
Introducing Shape Prior Module in Diffusion Model for Medical Image Segmentation [7.7545714516743045]
We propose an end-to-end framework called VerseDiff-UNet, which leverages the denoising diffusion probabilistic model (DDPM) Our approach integrates the diffusion model into a standard U-shaped architecture. We evaluate our method on a single dataset of spine images acquired through X-ray imaging.
arXiv Detail & Related papers (2023-09-12T03:05:00Z)
ArSDM: Colonoscopy Images Synthesis with Adaptive Refinement Semantic Diffusion Models [69.9178140563928]
Colonoscopy analysis is essential for assisting clinical diagnosis and treatment. The scarcity of annotated data limits the effectiveness and generalization of existing methods. We propose an Adaptive Refinement Semantic Diffusion Model (ArSDM) to generate colonoscopy images that benefit the downstream tasks.
arXiv Detail & Related papers (2023-09-03T07:55:46Z)
Mask-conditioned latent diffusion for generating gastrointestinal polyp images [2.027538200191349]
This study proposes a conditional DPM framework to generate synthetic GI polyp images conditioned on given segmentation masks. Our system can generate an unlimited number of high-fidelity synthetic polyp images with the corresponding ground truth masks of polyps. Results show that the best micro-imagewise IOU of 0.7751 was achieved from DeepLabv3+ when the training data consists of both real data and synthetic data.
arXiv Detail & Related papers (2023-04-11T14:11:17Z)
Ambiguous Medical Image Segmentation using Diffusion Models [60.378180265885945]
We introduce a single diffusion model-based approach that produces multiple plausible outputs by learning a distribution over group insights. Our proposed model generates a distribution of segmentation masks by leveraging the inherent sampling process of diffusion. Comprehensive results show that our proposed approach outperforms existing state-of-the-art ambiguous segmentation networks.
arXiv Detail & Related papers (2023-04-10T17:58:22Z)
Diff-UNet: A Diffusion Embedded Network for Volumetric Segmentation [41.608617301275935]
We propose a novel end-to-end framework, called Diff-UNet, for medical volumetric segmentation. Our approach integrates the diffusion model into a standard U-shaped architecture to extract semantic information from the input volume effectively. We evaluate our method on three datasets, including multimodal brain tumors in MRI, liver tumors, and multi-organ CT volumes.
arXiv Detail & Related papers (2023-03-18T04:06:18Z)
AMIGO: Sparse Multi-Modal Graph Transformer with Shared-Context Processing for Representation Learning of Giga-pixel Images [53.29794593104923]
We present a novel concept of shared-context processing for whole slide histopathology images. AMIGO uses the celluar graph within the tissue to provide a single representation for a patient. We show that our model is strongly robust to missing information to an extent that it can achieve the same performance with as low as 20% of the data.
arXiv Detail & Related papers (2023-03-01T23:37:45Z)
OADAT: Experimental and Synthetic Clinical Optoacoustic Data for Standardized Image Processing [62.993663757843464]
Optoacoustic (OA) imaging is based on excitation of biological tissues with nanosecond-duration laser pulses followed by detection of ultrasound waves generated via light-absorption-mediated thermoelastic expansion. OA imaging features a powerful combination between rich optical contrast and high resolution in deep tissues. No standardized datasets generated with different types of experimental set-up and associated processing methods are available to facilitate advances in broader applications of OA in clinical settings.
arXiv Detail & Related papers (2022-06-17T08:11:26Z)
Harmonizing Pathological and Normal Pixels for Pseudo-healthy Synthesis [68.5287824124996]
We present a new type of discriminator, the segmentor, to accurately locate the lesions and improve the visual quality of pseudo-healthy images. We apply the generated images into medical image enhancement and utilize the enhanced results to cope with the low contrast problem. Comprehensive experiments on the T2 modality of BraTS demonstrate that the proposed method substantially outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T08:41:17Z)
Generative Adversarial U-Net for Domain-free Medical Image Augmentation [49.72048151146307]
The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing. In this paper, we develop a novel generative method named generative adversarial U-Net. Our newly designed model is domain-free and generalizable to various medical images.
arXiv Detail & Related papers (2021-01-12T23:02:26Z)
Pathological Retinal Region Segmentation From OCT Images Using Geometric Relation Based Augmentation [84.7571086566595]
We propose improvements over previous GAN-based medical image synthesis methods by jointly encoding the intrinsic relationship of geometry and shape. The proposed method outperforms state-of-the-art segmentation methods on the public RETOUCH dataset having images captured from different acquisition procedures.
arXiv Detail & Related papers (2020-03-31T11:50:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.