Boosting Dermatoscopic Lesion Segmentation via Diffusion Models with
Visual and Textual Prompts
- URL: http://arxiv.org/abs/2310.02906v1
- Date: Wed, 4 Oct 2023 15:43:26 GMT
- Title: Boosting Dermatoscopic Lesion Segmentation via Diffusion Models with
Visual and Textual Prompts
- Authors: Shiyi Du, Xiaosong Wang, Yongyi Lu, Yuyin Zhou, Shaoting Zhang, Alan
Yuille, Kang Li, and Zongwei Zhou
- Abstract summary: We adapt the latest advances in generative modeling, adding
control flow via lesion-specific visual and textual prompts.
It achieves a 9% increase in the SSIM image quality measure and an over 5%
increase in Dice coefficient over prior art.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Image synthesis approaches, e.g., generative adversarial networks,
have been popular as a form of data augmentation in medical image analysis
tasks. They are primarily used to overcome the shortage of publicly accessible
data and of associated high-quality annotations. However, current techniques
often lack control over the detailed contents of generated images, e.g., the
type of disease patterns, the location of lesions, and attributes of the
diagnosis. In this work, we adapt the latest advance in generative modeling,
i.e., the diffusion model, with added control flow using lesion-specific
visual and textual prompts for generating dermatoscopic images. We further
demonstrate the advantage of our diffusion model-based framework over
classical generative models in both image quality and in boosting segmentation
performance on skin lesions, achieving a 9% increase in the SSIM image quality
measure and an over 5% increase in Dice coefficient over prior art.
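
The digest gives no implementation details, but the control flow it describes,
a denoiser conditioned on a lesion mask (visual prompt) and a text embedding
(textual prompt), can be illustrated with a minimal PyTorch sketch. All names,
the FiLM-style injection, and the omission of the timestep embedding are
simplifying assumptions for illustration, not the authors' architecture:

```python
# Hypothetical sketch: a denoising network conditioned on a lesion mask
# (visual prompt) and a text embedding (textual prompt). The timestep
# embedding of a real diffusion model is omitted for brevity.
import torch
import torch.nn as nn

class PromptConditionedDenoiser(nn.Module):
    def __init__(self, text_dim=512, base=64):
        super().__init__()
        # Visual prompt: the mask enters as an extra input channel.
        self.encode = nn.Sequential(
            nn.Conv2d(3 + 1, base, 3, padding=1), nn.SiLU(),
            nn.Conv2d(base, base, 3, padding=1), nn.SiLU(),
        )
        # Textual prompt: injected as a FiLM-style scale and shift.
        self.film = nn.Linear(text_dim, 2 * base)
        self.decode = nn.Conv2d(base, 3, 3, padding=1)

    def forward(self, noisy_image, lesion_mask, text_emb):
        h = self.encode(torch.cat([noisy_image, lesion_mask], dim=1))
        scale, shift = self.film(text_emb).chunk(2, dim=1)
        h = h * (1 + scale[..., None, None]) + shift[..., None, None]
        return self.decode(h)  # predicted noise

x = torch.randn(2, 3, 64, 64)    # noisy dermatoscopic images
mask = torch.rand(2, 1, 64, 64)  # lesion masks (visual prompts)
text = torch.randn(2, 512)       # stand-in for CLIP-like text embeddings
eps_hat = PromptConditionedDenoiser()(x, mask, text)
```

In a real pipeline the text embedding would come from a pretrained text
encoder such as CLIP, and the module above would be a full UNet trained inside
a standard diffusion loop.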
Related papers
- Image Synthesis with Class-Aware Semantic Diffusion Models for Surgical Scene Segmentation [3.6723640056915436]
We propose the Class-Aware Semantic Diffusion Model (CASDM) to tackle data scarcity and imbalance.
Class-aware mean squared error and class-aware self-perceptual loss functions are defined to prioritize critical, less visible classes (a weighted-MSE sketch follows this entry).
We are the first to generate multi-class segmentation maps from text prompts that specify their contents.
arXiv Detail & Related papers (2024-10-31T14:14:30Z)
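
The loss definitions are not reproduced in this digest; below is a minimal,
assumed sketch of what a class-aware mean squared error could look like, where
per-pixel errors are weighted by the class of the ground-truth segmentation
map so that rare classes cost more. All names and weights are illustrative.

```python
# Assumed sketch of a class-aware MSE (not CASDM's exact formulation).
import torch

def class_aware_mse(pred, target, seg_labels, class_weights):
    # pred, target: (B, C, H, W) images; seg_labels: (B, H, W) class map
    w = class_weights[seg_labels].unsqueeze(1)  # (B, 1, H, W) pixel weights
    return (w * (pred - target) ** 2).mean()

weights = torch.tensor([1.0, 1.0, 5.0])         # upweight the rare class 2
loss = class_aware_mse(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64),
                       torch.randint(0, 3, (2, 64, 64)), weights)
```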
- Deep Generative Models Unveil Patterns in Medical Images Through Vision-Language Conditioning [3.4299097748670255]
Deep generative models have significantly advanced medical imaging analysis by enhancing dataset size and quality.
We employ a generative structure with hybrid conditions, combining clinical data and segmentation masks to guide the image synthesis process (hybrid conditioning is sketched after this entry).
Our approach differs from, and is more challenging than, traditional medical report-guided synthesis because our clinical information is less visually correlated with the images.
arXiv Detail & Related papers (2024-10-17T17:48:36Z)
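
The digest does not specify the architecture; one assumed way to realize
hybrid conditioning is to embed the tabular clinical record, broadcast it over
the spatial grid, and concatenate it with the segmentation mask before
synthesis. Everything below is an illustrative stand-in, not the paper's model.

```python
# Assumed sketch: tabular clinical features + a segmentation mask jointly
# condition a toy image generator.
import torch
import torch.nn as nn

class HybridCondGenerator(nn.Module):
    def __init__(self, clin_dim=8, base=32):
        super().__init__()
        self.clin_proj = nn.Linear(clin_dim, base)          # embed record
        self.net = nn.Sequential(
            nn.Conv2d(1 + base, base, 3, padding=1), nn.SiLU(),
            nn.Conv2d(base, 3, 3, padding=1),
        )

    def forward(self, mask, clinical):
        c = self.clin_proj(clinical)[..., None, None]       # (B, base, 1, 1)
        c = c.expand(-1, -1, mask.shape[2], mask.shape[3])  # over pixels
        return self.net(torch.cat([mask, c], dim=1))

img = HybridCondGenerator()(torch.rand(2, 1, 64, 64), torch.randn(2, 8))
```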
- Synthetic Generation of Dermatoscopic Images with GAN and Closed-Form Factorization [12.753792457271953]
We propose an unsupervised augmentation solution that harnesses Generative Adversarial Network (GAN) based models.
We created synthetic images that incorporate semantic variations and augmented the training data with them (closed-form factorization is sketched after this entry).
We were able to increase the performance of machine learning models and set a new benchmark among non-ensemble models in skin lesion classification.
arXiv Detail & Related papers (2024-10-07T15:09:50Z)
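
Closed-form factorization here most likely refers to SeFa-style analysis:
semantic editing directions are recovered, without any training, as the top
eigenvectors of A^T A for a generator weight matrix A. A small NumPy sketch
under that assumption (the weight matrix is random for illustration):

```python
# Assumed SeFa-style sketch: eigenvectors of A^T A as editing directions.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((1024, 512))        # stand-in generator layer weight
eigvals, eigvecs = np.linalg.eigh(A.T @ A)  # ascending eigenvalues
directions = eigvecs[:, ::-1][:, :5]        # top-5 directions by eigenvalue

z = rng.standard_normal(512)                # a latent code
z_edited = z + 3.0 * directions[:, 0]       # move along the strongest one
```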
- DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception [66.88792390480343]
We propose DEEM, a simple but effective approach that utilizes the generative feedback of diffusion models to align the semantic distributions of the image encoder.
DEEM exhibits enhanced robustness and a superior capacity to alleviate model hallucinations while utilizing fewer trainable parameters, less pre-training data, and a smaller base model size.
arXiv Detail & Related papers (2024-05-24T05:46:04Z)
- DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce a perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability (a loss sketch follows this entry).
Our method customizes data augmentation by extracting and utilizing a perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
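
The exact P.A. loss is not given in this digest; one assumed reading is that a
frozen perception (segmentation) model scores the generated image and its loss
is added to the standard denoising objective. The module below is a toy
stand-in:

```python
# Assumed sketch of a perception-aware objective (not DetDiffusion's exact
# formulation): denoising MSE plus a segmentation term on the generation.
import torch
import torch.nn as nn
import torch.nn.functional as F

seg_head = nn.Conv2d(3, 4, 1)                 # toy frozen perception model
for p in seg_head.parameters():
    p.requires_grad_(False)

def perception_aware_loss(eps_pred, eps_true, x0_pred, seg_target, lam=0.1):
    diffusion_loss = F.mse_loss(eps_pred, eps_true)            # standard term
    seg_loss = F.cross_entropy(seg_head(x0_pred), seg_target)  # perception term
    return diffusion_loss + lam * seg_loss

loss = perception_aware_loss(torch.randn(2, 3, 32, 32),
                             torch.randn(2, 3, 32, 32),
                             torch.randn(2, 3, 32, 32),
                             torch.randint(0, 4, (2, 32, 32)))
```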
- Learned representation-guided diffusion models for large-image generation [58.192263311786824]
We introduce a novel approach that trains diffusion models conditioned on embeddings from self-supervised learning (SSL); obtaining such embeddings is sketched after this entry.
Our diffusion models successfully project these features back to high-quality histopathology and remote sensing images.
Augmenting real data by generating variations of real images improves downstream accuracy for patch-level and larger, image-scale classification tasks.
arXiv Detail & Related papers (2023-12-12T14:45:45Z)
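
The digest does not say which SSL encoder is used; as an assumed illustration,
features from any frozen, SSL-pretrained backbone can serve as the
conditioning vector (a torchvision ResNet stands in for the SSL model here):

```python
# Assumed sketch: a frozen backbone supplies conditioning embeddings.
import torch
import torch.nn as nn
from torchvision.models import resnet18

ssl_encoder = resnet18(weights=None)  # stand-in for an SSL-pretrained encoder
ssl_encoder.fc = nn.Identity()        # expose the 512-d embedding
ssl_encoder.eval()

with torch.no_grad():
    cond = ssl_encoder(torch.randn(2, 3, 224, 224))  # (2, 512) conditions
# `cond` would replace the text embedding in a conditional denoiser.
```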
- EMIT-Diff: Enhancing Medical Image Segmentation via Text-Guided Diffusion Model [4.057796755073023]
We develop controllable diffusion models for medical image synthesis, called EMIT-Diff.
We leverage recent diffusion probabilistic models to generate realistic and diverse synthetic medical image data.
In our approach, we ensure that the synthesized samples adhere to medically relevant constraints.
arXiv Detail & Related papers (2023-10-19T16:18:02Z)
- Performance of GAN-based augmentation for deep learning COVID-19 image classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data.
Data augmentation is a typical methodology used in machine learning when confronted with a limited data set.
In this work, a StyleGAN2-ADA generative adversarial network is trained on a limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z)
- Diffusion Models as Masked Autoencoders [52.442717717898056]
We revisit generatively pre-training visual representations in light of recent interest in denoising diffusion models.
While directly pre-training with diffusion models does not produce strong representations, we condition diffusion models on masked input and formulate them as masked autoencoders (DiffMAE); the masked-denoising idea is sketched after this entry.
We perform a comprehensive study on the pros and cons of design choices and build connections between diffusion models and masked autoencoders.
arXiv Detail & Related papers (2023-04-06T17:59:56Z)
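
A minimal sketch of the masked-denoising idea, under the assumption that noise
is applied only to masked patches and the loss is computed only there; the
linear "denoiser" and the simple interpolation noising are stand-ins for a
transformer and a real diffusion schedule:

```python
# Assumed DiffMAE-style sketch: denoise masked patches given visible ones.
import torch
import torch.nn as nn

B, N, D = 2, 16, 64              # batch, patches, patch dim (illustrative)
x = torch.randn(B, N, D)         # patchified image
mask = torch.rand(B, N) < 0.75   # 75% of patches are masked

t = torch.rand(B, 1, 1)          # diffusion "time" in [0, 1]
noise = torch.randn_like(x)
x_noisy = torch.where(mask.unsqueeze(-1), (1 - t) * x + t * noise, x)

denoiser = nn.Sequential(nn.Linear(D, 128), nn.GELU(), nn.Linear(128, D))
pred = denoiser(x_noisy)                 # stand-in transformer denoiser
loss = ((pred - x)[mask]).pow(2).mean()  # reconstruct only masked patches
```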
- Diffusion-based Data Augmentation for Skin Disease Classification: Impact Across Original Medical Datasets to Fully Synthetic Images [2.5075774184834803]
Deep neural networks still rely on large amounts of training data to avoid overfitting.
Labeled training data for real-world applications such as healthcare is limited and difficult to access.
We build upon the emerging success of text-to-image diffusion probabilistic models to augment the training samples of our macroscopic skin disease dataset (an illustrative generation snippet follows this entry).
arXiv Detail & Related papers (2023-01-12T04:22:23Z)
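
The paper's generation pipeline is not shown here; as an assumed illustration
of text-to-image augmentation with an off-the-shelf tool, HuggingFace
diffusers can produce class-conditional synthetic samples. The model ID and
prompt are placeholders, and realistic medical use would require fine-tuning
on the target dataset:

```python
# Assumed augmentation sketch with HuggingFace diffusers (placeholders only).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a macroscopic photo of a melanoma skin lesion"  # placeholder
images = pipe(prompt, num_images_per_prompt=4).images
for i, img in enumerate(images):
    img.save(f"synthetic_melanoma_{i}.png")  # mix into the training set
```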
- Variational Knowledge Distillation for Disease Classification in Chest X-Rays [102.04931207504173]
We propose variational knowledge distillation (VKD), a new probabilistic inference framework for disease classification based on X-rays (a Gaussian-KL distillation term is sketched after this entry).
We demonstrate the effectiveness of our method on three public benchmark datasets with paired X-ray images and EHRs.
arXiv Detail & Related papers (2021-03-19T14:13:56Z)
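
VKD's exact objective is not reproduced in this digest; one assumed reading of
a variational distillation term is a KL divergence pulling the
image-conditioned latent distribution toward an EHR-conditioned one. Encoders
and dimensions below are illustrative:

```python
# Assumed sketch: Gaussian KL between image- and EHR-conditioned latents.
import torch
import torch.nn as nn

img_enc = nn.Linear(256, 2 * 32)  # stand-in: predicts mean and log-variance
ehr_enc = nn.Linear(128, 2 * 32)

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    # KL( N(mu_q, exp(logvar_q)) || N(mu_p, exp(logvar_p)) ), diagonal case
    return 0.5 * (logvar_p - logvar_q
                  + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                  - 1).sum(-1).mean()

mu_q, logvar_q = img_enc(torch.randn(4, 256)).chunk(2, dim=-1)
mu_p, logvar_p = ehr_enc(torch.randn(4, 128)).chunk(2, dim=-1)
kd_loss = gaussian_kl(mu_q, logvar_q, mu_p, logvar_p)
```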
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.