GenMix: Effective Data Augmentation with Generative Diffusion Model Image Editing
- URL: http://arxiv.org/abs/2412.02366v3
- Date: Fri, 06 Dec 2024 00:42:40 GMT
- Title: GenMix: Effective Data Augmentation with Generative Diffusion Model Image Editing
- Authors: Khawar Islam, Muhammad Zaigham Zaheer, Arif Mahmood, Karthik Nandakumar, Naveed Akhtar
- Abstract summary: This paper introduces GenMix, a generalizable prompt-guided generative data augmentation approach.
Our technique leverages image editing to generate augmented images based on custom conditional prompts.
Our approach mitigates unrealistic images and label ambiguity, improving the performance and adversarial robustness of the resulting models.
- Abstract: Data augmentation is widely used to enhance generalization in visual classification tasks. However, traditional methods struggle when source and target domains differ, as in domain adaptation, due to their inability to address domain gaps. This paper introduces GenMix, a generalizable prompt-guided generative data augmentation approach that enhances both in-domain and cross-domain image classification. Our technique leverages image editing to generate augmented images based on custom conditional prompts, designed specifically for each problem type. By blending portions of the input image with its edited generative counterpart and incorporating fractal patterns, our approach mitigates unrealistic images and label ambiguity, improving the performance and adversarial robustness of the resulting models. The efficacy of our method is established through extensive experiments on eight public datasets for general and fine-grained classification, in both in-domain and cross-domain settings. Additionally, we demonstrate performance improvements for self-supervised learning, learning with data scarcity, and adversarial robustness. Compared to existing state-of-the-art methods, our technique achieves stronger performance across the board.
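The core blending step described in the abstract, combining an input image with its generatively edited counterpart and overlaying a fractal pattern, can be sketched as below. The rectangular mixing region, the weights `lam` and `alpha`, and the pre-computed `fractal` input are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def genmix_blend(image, edited, fractal, lam=0.6, alpha=0.1, rng=None):
    """Blend an image with its generatively edited counterpart, then
    overlay a faint fractal pattern (sketch of the GenMix blending idea).

    A random rectangular region of `image` is replaced by the convex
    combination lam * image + (1 - lam) * edited, after which `fractal`
    is alpha-blended over the whole result.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape[:2]
    # Sample a random rectangular region to mix (CutMix-style box).
    rh, rw = h // 2, w // 2
    y = int(rng.integers(0, h - rh + 1))
    x = int(rng.integers(0, w - rw + 1))
    out = image.astype(np.float32).copy()
    out[y:y + rh, x:x + rw] = (
        lam * image[y:y + rh, x:x + rw]
        + (1.0 - lam) * edited[y:y + rh, x:x + rw]
    )
    # Faint fractal overlay to diversify textures without changing the label.
    out = (1.0 - alpha) * out + alpha * fractal.astype(np.float32)
    return np.clip(out, 0.0, 255.0).astype(np.uint8)
```

In the full method, `edited` would come from a prompt-guided diffusion editing model; here it is simply any image of the same shape.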
Related papers
- Dataset Augmentation by Mixing Visual Concepts [3.5420134832331334]
This paper proposes a dataset augmentation method by fine-tuning pre-trained diffusion models.
We adapt the diffusion model by conditioning it with real images and novel text embeddings.
Our approach outperforms state-of-the-art augmentation techniques on benchmark classification tasks.
arXiv Detail & Related papers (2024-12-19T19:42:22Z)
- Domain Generalized Recaptured Screen Image Identification Using SWIN Transformer [1.024113475677323]
We propose a cascaded data augmentation and SWIN transformer domain generalization framework (DAST-DG).
A feature generator is trained to make authentic images from various domains indistinguishable.
This process is then applied to recaptured images, creating a dual adversarial learning setup.
arXiv Detail & Related papers (2024-07-24T11:22:02Z)
- CycleMix: Mixing Source Domains for Domain Generalization in Style-Dependent Data [5.124256074746721]
In the case of image classification, one frequent reason that algorithms fail to generalize is that they rely on spurious correlations present in training data.
These associations may not be present in the unseen test data, leading to significant degradation of their effectiveness.
In this work, we attempt to mitigate this Domain Generalization problem by training a robust feature extractor which disregards features attributed to image-style but infers based on style-invariant image representations.
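The abstract does not give CycleMix's formulation, but the general idea of mixing styles across source domains while preserving style-invariant content can be illustrated with per-channel statistic mixing in the spirit of MixStyle; this is an assumption for illustration, not the paper's method:

```python
import numpy as np

def mix_style_stats(x_a, x_b, lam=0.5, eps=1e-6):
    """Mix the per-channel style statistics (mean/std) of two images
    from different source domains, keeping the content of x_a.

    Sketch of style-statistic mixing in the spirit of MixStyle;
    CycleMix's exact formulation may differ.
    """
    x_a = x_a.astype(np.float32)
    x_b = x_b.astype(np.float32)
    # Per-channel style statistics over spatial dimensions.
    mu_a, sig_a = x_a.mean((0, 1)), x_a.std((0, 1)) + eps
    mu_b, sig_b = x_b.mean((0, 1)), x_b.std((0, 1)) + eps
    mu_mix = lam * mu_a + (1.0 - lam) * mu_b
    sig_mix = lam * sig_a + (1.0 - lam) * sig_b
    # Normalize the content image, then re-style with mixed statistics.
    return (x_a - mu_a) / sig_a * sig_mix + mu_mix
```

Training on such style-mixed images encourages the feature extractor to rely on content rather than domain-specific style cues.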
arXiv Detail & Related papers (2024-07-18T11:43:26Z)
- Complex Style Image Transformations for Domain Generalization in Medical Images [6.635679521775917]
Domain generalization techniques aim to enable models trained on a single data source to perform well on unknown domains.
In this paper we introduce a novel framework, named CompStyle, which leverages style transfer and adversarial training.
We provide results from experiments on semantic segmentation on prostate data and corruption robustness on cardiac data.
arXiv Detail & Related papers (2024-06-01T04:57:31Z)
- Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy to bolster image classification performance is through augmenting the training set with synthetic images generated by text-to-image (T2I) models.
In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques.
We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
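A minimal sketch of the inter-class mixing step is given below, assuming the class-translated image has already been produced by the fine-tuned diffusion model; the function name, linear interpolation scheme, and soft-label construction are illustrative assumptions:

```python
import numpy as np

def inter_class_mix(image, translated, label_src, label_tgt,
                    num_classes, lam=0.7):
    """Mix an image with its diffusion-translated counterpart from another
    class, and soften the label accordingly (sketch of the Diff-Mix idea).

    `translated` stands in for the output of an image translation between
    classes; producing it is the diffusion model's job, not shown here.
    """
    mixed = (lam * image.astype(np.float32)
             + (1.0 - lam) * translated.astype(np.float32))
    # Soft label splits probability mass between source and target classes.
    label = np.zeros(num_classes, dtype=np.float32)
    label[label_src] = lam
    label[label_tgt] = 1.0 - lam
    return mixed, label
```

Training with such pairs exposes the classifier to plausible intermediate samples between classes rather than purely intra-class variations.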
arXiv Detail & Related papers (2024-03-28T17:23:45Z)
- Fine-grained Recognition with Learnable Semantic Data Augmentation [68.48892326854494]
Fine-grained image recognition is a longstanding computer vision challenge.
We propose diversifying the training data at the feature-level to alleviate the discriminative region loss problem.
Our method significantly improves the generalization performance on several popular classification networks.
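Feature-level diversification of this kind can be sketched as sampling perturbations from a class-conditional Gaussian in feature space, in the spirit of implicit semantic data augmentation; the covariance estimate and the `strength` scaling are assumptions, not the paper's learned scheme:

```python
import numpy as np

def semantic_augment(features, class_cov, strength=0.5, rng=None):
    """Augment deep feature vectors by adding noise drawn along
    class-conditional covariance directions (sketch of feature-level
    semantic augmentation; not the paper's exact learnable procedure).

    features  : (N, D) array of deep features for one class
    class_cov : (D, D) covariance estimated from that class's features
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.multivariate_normal(
        mean=np.zeros(features.shape[1]),
        cov=strength * class_cov,
        size=features.shape[0],
    )
    # Perturbing along class-specific directions changes semantics
    # (pose, color, background) while keeping the class identity.
    return features + noise
```

In practice the covariance would be estimated online from each class's features during training, and `strength` annealed over epochs.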
arXiv Detail & Related papers (2023-09-01T11:15:50Z)
- Consistency Regularization for Generalizable Source-free Domain Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods only assess their adapted models on the target training set, neglecting data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z)
- A Novel Cross-Perturbation for Single Domain Generalization [54.612933105967606]
Single domain generalization aims to enhance the ability of the model to generalize to unknown domains when trained on a single source domain.
The limited diversity in the training data hampers the learning of domain-invariant features, resulting in compromised generalization performance.
We propose CPerb, a simple yet effective cross-perturbation method to enhance the diversity of the training data.
arXiv Detail & Related papers (2023-08-02T03:16:12Z)
- Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
- CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [119.45667331836583]
Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another.
We present a novel pixel-wise adversarial domain adaptation algorithm.
arXiv Detail & Related papers (2020-01-09T19:00:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.