Diffusion-based Data Augmentation for Skin Disease Classification:
Impact Across Original Medical Datasets to Fully Synthetic Images
- URL: http://arxiv.org/abs/2301.04802v1
- Date: Thu, 12 Jan 2023 04:22:23 GMT
- Title: Diffusion-based Data Augmentation for Skin Disease Classification:
Impact Across Original Medical Datasets to Fully Synthetic Images
- Authors: Mohamed Akrout, Bálint Gyepesi, Péter Holló, Adrienn Poór,
Blága Kincső, Stephen Solis, Katrina Cirone, Jeremy Kawahara, Dekker
Slade, Latif Abid, Máté Kovács, István Fazekas
- Abstract summary: Deep neural networks still rely on large amounts of training data to avoid overfitting.
Labeled training data for real-world applications such as healthcare is limited and difficult to access.
We build upon the emerging success of text-to-image diffusion probabilistic models in augmenting the training samples of our macroscopic skin disease dataset.
- Score: 2.5075774184834803
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite continued advancement in recent years, deep neural networks still
rely on large amounts of training data to avoid overfitting. However, labeled
training data for real-world applications such as healthcare is limited and
difficult to access given longstanding privacy concerns and strict data sharing
policies. By manipulating image datasets in the pixel or feature space,
existing data augmentation techniques represent one of the effective ways to
improve the quantity and diversity of training data. Here, we look to advance
augmentation techniques by building upon the emerging success of text-to-image
diffusion probabilistic models in augmenting the training samples of our
macroscopic skin disease dataset. We do so by enabling fine-grained control of
the image generation process via input text prompts. We demonstrate that this
generative data augmentation approach successfully maintains a similar
classification accuracy of the visual classifier even when trained on a fully
synthetic skin disease dataset. Similar to recent applications of generative
models, our study suggests that diffusion models are indeed effective in
generating high-quality skin images that do not sacrifice the classifier
performance, and can improve the augmentation of training datasets after
curation.
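The abstract describes two ideas that a small sketch can make concrete: controlling generation through text prompts, and training on mixtures that range from the original dataset to fully synthetic images. The snippet below is an illustration only, not the authors' pipeline. The condition names, body sites, skin tones, prompt template, and the helper names `build_prompts` and `mix_datasets` are all hypothetical, and the call to an actual text-to-image diffusion model is deliberately left out.

```python
import itertools
import random

# Hypothetical vocabularies: the abstract does not list the paper's actual
# classes or prompt attributes.
CONDITIONS = ["psoriasis", "atopic dermatitis", "melanoma"]
BODY_SITES = ["forearm", "back", "scalp"]
SKIN_TONES = ["light", "medium", "dark"]


def build_prompts(conditions, body_sites, skin_tones):
    """Return (prompt, label) pairs, one per attribute combination.

    The label is the condition named in the prompt, so every generated
    image would inherit its class from the text that produced it.
    """
    pairs = []
    for cond, site, tone in itertools.product(conditions, body_sites, skin_tones):
        # Assumed template; any wording that names the class would do.
        prompt = f"macroscopic photo of {cond} on the {site}, {tone} skin tone"
        pairs.append((prompt, cond))
    return pairs


def mix_datasets(real, synthetic, synthetic_fraction, seed=0):
    """Build a training set of len(real) samples with a given synthetic share.

    synthetic_fraction = 0.0 keeps only real samples; 1.0 yields a fully
    synthetic set of the same size.
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must be in [0, 1]")
    n_total = len(real)
    n_syn = round(n_total * synthetic_fraction)
    if n_syn > len(synthetic):
        raise ValueError("not enough synthetic samples for this fraction")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    return rng.sample(real, n_total - n_syn) + rng.sample(synthetic, n_syn)
```

Sweeping `synthetic_fraction` from 0.0 to 1.0 and retraining the classifier at each step is one way to probe how accuracy changes as real images are replaced; keeping the total size fixed separates the effect of the data source from the effect of dataset size.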
Related papers
- Local Lesion Generation is Effective for Capsule Endoscopy Image Data Augmentation in a Limited Data Setting [0.0]
We propose and evaluate two local lesion generation approaches to address the challenge of augmenting small medical image datasets.
The first approach employs the Poisson Image Editing algorithm, a classical image processing technique, to create realistic image composites.
The second approach introduces a novel generative method, leveraging a fine-tuned Image Inpainting GAN to synthesize realistic lesions.
arXiv Detail & Related papers (2024-11-05T13:44:25Z) - DataDream: Few-shot Guided Dataset Generation [90.09164461462365]
We propose a framework for synthesizing classification datasets that more faithfully represents the real data distribution.
DataDream fine-tunes LoRA weights for the image generation model on the few real images before generating the training data using the adapted model.
We then fine-tune LoRA weights for CLIP using the synthetic data to improve downstream image classification over previous approaches on a large variety of datasets.
arXiv Detail & Related papers (2024-07-15T17:10:31Z) - TSynD: Targeted Synthetic Data Generation for Enhanced Medical Image Classification [0.011037620731410175]
This work aims to guide the generative model to synthesize data with high uncertainty.
We alter the feature space of the autoencoder through an optimization process.
We improve the robustness against test time data augmentations and adversarial attacks on several classification tasks.
arXiv Detail & Related papers (2024-06-25T11:38:46Z) - Improving Deep Learning-based Automatic Cranial Defect Reconstruction by Heavy Data Augmentation: From Image Registration to Latent Diffusion Models [0.2911706166691895]
The work is a considerable contribution to the field of artificial intelligence in the automatic modeling of personalized cranial implants.
We show that the use of heavy data augmentation significantly increases both the quantitative and qualitative outcomes.
We also show that the synthetically augmented network successfully reconstructs real clinical defects.
arXiv Detail & Related papers (2024-06-10T15:34:23Z) - Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose a dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z) - DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z) - Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images
with Free Attention Masks [64.67735676127208]
Text-to-image diffusion models have shown great potential for benefiting image recognition.
Although promising, there has been inadequate exploration dedicated to unsupervised learning on diffusion-generated images.
We introduce customized solutions by fully exploiting the aforementioned free attention masks.
arXiv Detail & Related papers (2023-08-13T10:07:46Z) - Performance of GAN-based augmentation for deep learning COVID-19 image
classification [57.1795052451257]
The biggest challenge in the application of deep learning to the medical domain is the availability of training data.
Data augmentation is a typical methodology used in machine learning when confronted with a limited data set.
In this work, a StyleGAN2-ADA model of Generative Adversarial Networks is trained on the limited COVID-19 chest X-ray image set.
arXiv Detail & Related papers (2023-04-18T15:39:58Z) - Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z) - Improving dermatology classifiers across populations using images
generated by large diffusion models [4.291548465691441]
We show that DALL·E 2, a large-scale text-to-image diffusion model, can produce photorealistic images of skin disease across skin types.
We demonstrate that augmenting training data with DALL·E 2-generated synthetic images improves classification of skin disease overall and especially for underrepresented groups.
arXiv Detail & Related papers (2022-11-23T23:53:03Z) - Evaluation of Deep Convolutional Generative Adversarial Networks for
data augmentation of chest X-ray images [0.0]
Medical image datasets are usually imbalanced, due to the high costs of obtaining the data and time-consuming annotations.
In this work, we performed data augmentation on the Chest X-rays dataset through generative modeling.
arXiv Detail & Related papers (2020-09-02T16:43:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.