MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis
- URL: http://arxiv.org/abs/2405.09806v2
- Date: Wed, 10 Jul 2024 04:04:06 GMT
- Title: MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis
- Authors: Joseph Cho, Cyril Zakka, Dhamanpreet Kaur, Rohan Shad, Ross Wightman, Akshay Chaudhari, William Hiesinger,
- Abstract summary: In medicine, this application promises to address the critical challenge of data scarcity.
By generating realistic and varying medical 2D and 3D images, these models offer a rich, privacy-respecting resource for algorithmic training and research.
We show significant improvement in broad medical image and video synthesis guided by text prompts.
- Score: 5.4476703187184246
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Diffusion models have recently gained significant traction due to their ability to generate high-fidelity and diverse images and videos conditioned on text prompts. In medicine, this application promises to address the critical challenge of data scarcity, a consequence of barriers in data sharing, stringent patient privacy regulations, and disparities in patient population and demographics. By generating realistic and varying medical 2D and 3D images, these models offer a rich, privacy-respecting resource for algorithmic training and research. To this end, we introduce MediSyn, a pair of instruction-tuned text-guided latent diffusion models with the ability to generate high-fidelity and diverse medical 2D and 3D images across specialties and modalities. Through established metrics, we show significant improvement in broad medical image and video synthesis guided by text prompts.
Related papers
- Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Scaling by training on large datasets has been shown to enhance the quality and fidelity of image generation and manipulation with diffusion models.
Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation.
Our results demonstrate significant performance gains in various scenarios when combined with different fine-tuning schemes.
arXiv Detail & Related papers (2024-12-30T01:59:34Z) - Vision-Language Synthetic Data Enhances Echocardiography Downstream Tasks [4.1942958779358674]
This paper utilizes recent vision-language models to produce diverse and realistic synthetic echocardiography image data.
We show that the rich contextual information present in the synthesized data potentially enhances the accuracy and interpretability of downstream tasks.
arXiv Detail & Related papers (2024-03-28T23:26:45Z) - Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z) - DiffBoost: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model [3.890243179348094]
Large-scale, big-variant, high-quality data are crucial for developing robust and successful deep-learning models for medical applications.
This paper proposes a novel approach by developing controllable diffusion models for medical image synthesis, called DiffBoost.
We leverage recent diffusion probabilistic models to generate realistic and diverse synthetic medical image data.
arXiv Detail & Related papers (2023-10-19T16:18:02Z) - Augmenting medical image classifiers with synthetic data from latent
diffusion models [12.077733447347592]
We show that latent diffusion models can scalably generate images of skin disease.
We generate and analyze a new dataset of 458,920 synthetic images produced using several generation strategies.
arXiv Detail & Related papers (2023-08-23T22:34:49Z) - Image Captions are Natural Prompts for Text-to-Image Models [70.30915140413383]
We analyze the relationship between the training effect of synthetic data and the synthetic data distribution induced by prompts.
We propose a simple yet effective method that prompts text-to-image generative models to synthesize more informative and diverse training data.
Our method significantly improves the performance of models trained on synthetic training data.
arXiv Detail & Related papers (2023-07-17T14:38:11Z) - Medical Diffusion -- Denoising Diffusion Probabilistic Models for 3D
Medical Image Generation [0.6486409713123691]
We show that diffusion probabilistic models can synthesize high quality medical imaging data.
We provide quantitative measurements of their performance through a reader study with two medical experts.
We demonstrate that synthetic images can be used in a self-supervised pre-training and improve the performance of breast segmentation models when data is scarce.
arXiv Detail & Related papers (2022-11-07T08:37:48Z) - Is synthetic data from generative models ready for image recognition? [69.42645602062024]
We study whether and how synthetic images generated from state-of-the-art text-to-image generation models can be used for image recognition tasks.
We showcase the powerfulness and shortcomings of synthetic data from existing generative models, and propose strategies for better applying synthetic data for recognition tasks.
arXiv Detail & Related papers (2022-10-14T06:54:24Z) - Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z) - Generative Adversarial U-Net for Domain-free Medical Image Augmentation [49.72048151146307]
The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing.
In this paper, we develop a novel generative method named generative adversarial U-Net.
Our newly designed model is domain-free and generalizable to various medical images.
arXiv Detail & Related papers (2021-01-12T23:02:26Z) - Overcoming Barriers to Data Sharing with Medical Image Generation: A
Comprehensive Evaluation [17.983449515155414]
We utilize Generative Adversarial Networks (GANs) to create derived medical imaging datasets consisting entirely of synthetic patient data.
The synthetic images ideally have, in aggregate, similar statistical properties to those of a source dataset but do not contain sensitive personal information.
We measure the synthetic image quality by the performance difference of predictive models trained on either the synthetic or the real dataset.
arXiv Detail & Related papers (2020-11-29T15:41:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.