Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA
- URL: http://arxiv.org/abs/2502.20667v1
- Date: Fri, 28 Feb 2025 02:49:45 GMT
- Title: Advancing AI-Powered Medical Image Synthesis: Insights from MedVQA-GI Challenge Using CLIP, Fine-Tuned Stable Diffusion, and Dream-Booth + LoRA
- Authors: Ojonugwa Oluwafemi Ejiga Peter, Md Mahmudur Rahman, Fahmi Khalifa,
- Abstract summary: The MEDVQA-GI challenge addresses the integration of AI-driven text-to-image generative models in medical diagnostics.<n>This study introduces a novel approach based on fine-tuned generative models to generate dynamic, scalable, and precise images from textual descriptions.<n>Our system integrates Stable Diffusion and DreamBooth models, as well as Low-Rank Adaptation (LORA) to generate high-fidelity medical images.
- Score: 3.290817968509798
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The MEDVQA-GI challenge addresses the integration of AI-driven text-to-image generative models in medical diagnostics, aiming to enhance diagnostic capabilities through synthetic image generation. Existing methods primarily focus on static image analysis and lack the dynamic generation of medical imagery from textual descriptions. This study intends to partially close this gap by introducing a novel approach based on fine-tuned generative models to generate dynamic, scalable, and precise images from textual descriptions. Particularly, our system integrates fine-tuned Stable Diffusion and DreamBooth models, as well as Low-Rank Adaptation (LORA), to generate high-fidelity medical images. The problem is around two sub-tasks namely: image synthesis (IS) and optimal prompt production (OPG). The former creates medical images via verbal prompts, whereas the latter provides prompts that produce high-quality images in specified categories. The study emphasizes the limitations of traditional medical image generation methods, such as hand sketching, constrained datasets, static procedures, and generic models. Our evaluation measures showed that Stable Diffusion surpasses CLIP and DreamBooth + LORA in terms of producing high-quality, diversified images. Specifically, Stable Diffusion had the lowest Fr\'echet Inception Distance (FID) scores (0.099 for single center, 0.064 for multi-center, and 0.067 for combined), indicating higher image quality. Furthermore, it had the highest average Inception Score (2.327 across all datasets), indicating exceptional diversity and quality. This advances the field of AI-powered medical diagnosis. Future research will concentrate on model refining, dataset augmentation, and ethical considerations for efficiently implementing these advances into clinical practice
Related papers
- PRISM: High-Resolution & Precise Counterfactual Medical Image Generation using Language-guided Stable Diffusion [5.504796147401483]
Development of reliable and generalizable deep learning systems for medical imaging faces significant obstacles due to spurious correlations, data imbalances, and limited text annotations in datasets.
We present PRISM, a framework that leverages foundation models to generate high-resolution, language-guided medical image counterfactuals.
arXiv Detail & Related papers (2025-02-28T21:32:08Z) - Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis [55.959002385347645]
Scaling by training on large datasets has been shown to enhance the quality and fidelity of image generation and manipulation with diffusion models.<n>Latent Drifting enables diffusion models to be conditioned for medical images fitted for the complex task of counterfactual image generation.<n>Our results demonstrate significant performance gains in various scenarios when combined with different fine-tuning schemes.
arXiv Detail & Related papers (2024-12-30T01:59:34Z) - StealthDiffusion: Towards Evading Diffusion Forensic Detection through Diffusion Model [62.25424831998405]
StealthDiffusion is a framework that modifies AI-generated images into high-quality, imperceptible adversarial examples.
It is effective in both white-box and black-box settings, transforming AI-generated images into high-quality adversarial forgeries.
arXiv Detail & Related papers (2024-08-11T01:22:29Z) - Adapting Visual-Language Models for Generalizable Anomaly Detection in Medical Images [68.42215385041114]
This paper introduces a novel lightweight multi-level adaptation and comparison framework to repurpose the CLIP model for medical anomaly detection.
Our approach integrates multiple residual adapters into the pre-trained visual encoder, enabling a stepwise enhancement of visual features across different levels.
Our experiments on medical anomaly detection benchmarks demonstrate that our method significantly surpasses current state-of-the-art models.
arXiv Detail & Related papers (2024-03-19T09:28:19Z) - A Domain Translation Framework with an Adversarial Denoising Diffusion
Model to Generate Synthetic Datasets of Echocardiography Images [0.5999777817331317]
We introduce a framework to create echocardiography images suitable to be used for clinical research purposes.
For several domain translation operations, the results verified that such generative model was able to synthesize high quality image samples.
arXiv Detail & Related papers (2024-03-07T15:58:03Z) - GAN-GA: A Generative Model based on Genetic Algorithm for Medical Image
Generation [0.0]
Generative models offer a promising solution for addressing medical image shortage problems.
This paper proposes the GAN-GA, a generative model optimized by embedding a genetic algorithm.
The proposed model enhances image fidelity and diversity while preserving distinctive features.
arXiv Detail & Related papers (2023-12-30T20:16:45Z) - DiffBoost: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model [3.890243179348094]
Large-scale, big-variant, high-quality data are crucial for developing robust and successful deep-learning models for medical applications.<n>This paper proposes a novel approach by developing controllable diffusion models for medical image synthesis, called DiffBoost.<n>We leverage recent diffusion probabilistic models to generate realistic and diverse synthetic medical image data.
arXiv Detail & Related papers (2023-10-19T16:18:02Z) - Conditional Diffusion Models for Semantic 3D Brain MRI Synthesis [0.0]
We introduce Med-DDPM, a diffusion model designed for 3D semantic brain MRI synthesis.
It effectively tackles data scarcity and privacy issues by integrating semantic conditioning.
It generates diverse, coherent images with high visual fidelity.
arXiv Detail & Related papers (2023-05-29T04:14:38Z) - MedSegDiff-V2: Diffusion based Medical Image Segmentation with
Transformer [53.575573940055335]
We propose a novel Transformer-based Diffusion framework, called MedSegDiff-V2.
We verify its effectiveness on 20 medical image segmentation tasks with different image modalities.
arXiv Detail & Related papers (2023-01-19T03:42:36Z) - Adapting Pretrained Vision-Language Foundational Models to Medical
Imaging Domains [3.8137985834223502]
Building generative models for medical images that faithfully depict clinical context may help alleviate the paucity of healthcare datasets.
We explore the sub-components of the Stable Diffusion pipeline to fine-tune the model to generate medical images.
Our best-performing model improves upon the stable diffusion baseline and can be conditioned to insert a realistic-looking abnormality on a synthetic radiology image.
arXiv Detail & Related papers (2022-10-09T01:43:08Z) - Harmonizing Pathological and Normal Pixels for Pseudo-healthy Synthesis [68.5287824124996]
We present a new type of discriminator, the segmentor, to accurately locate the lesions and improve the visual quality of pseudo-healthy images.
We apply the generated images into medical image enhancement and utilize the enhanced results to cope with the low contrast problem.
Comprehensive experiments on the T2 modality of BraTS demonstrate that the proposed method substantially outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T08:41:17Z) - Generative Adversarial U-Net for Domain-free Medical Image Augmentation [49.72048151146307]
The shortage of annotated medical images is one of the biggest challenges in the field of medical image computing.
In this paper, we develop a novel generative method named generative adversarial U-Net.
Our newly designed model is domain-free and generalizable to various medical images.
arXiv Detail & Related papers (2021-01-12T23:02:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.