Related papers: DIAGen: Diverse Image Augmentation with Generative Models

DIAGen: Diverse Image Augmentation with Generative Models

URL: http://arxiv.org/abs/2408.14584v1
Date: Mon, 26 Aug 2024 19:09:13 GMT
Title: DIAGen: Diverse Image Augmentation with Generative Models
Authors: Tobias Lingenberg, Markus Reuter, Gopika Sudhakaran, Dominik Gojny, Stefan Roth, Simone Schaub-Meyer,
Abstract summary: We propose DIAGen to enhance semantic diversity in computer vision models. We exploit the general knowledge of a text-to-text generative model to guide the image generation of the diffusion model. Results show that DIAGen not only enhances semantic diversity but also improves the performance of subsequent classifiers.
Score: 9.79392997282545
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Simple data augmentation techniques, such as rotations and flips, are widely used to enhance the generalization power of computer vision models. However, these techniques often fail to modify high-level semantic attributes of a class. To address this limitation, researchers have explored generative augmentation methods like the recently proposed DA-Fusion. Despite some progress, the variations are still largely limited to textural changes, thus falling short on aspects like varied viewpoints, environment, weather conditions, or even class-level semantic attributes (eg, variations in a dog's breed). To overcome this challenge, we propose DIAGen, building upon DA-Fusion. First, we apply Gaussian noise to the embeddings of an object learned with Textual Inversion to diversify generations using a pre-trained diffusion model's knowledge. Second, we exploit the general knowledge of a text-to-text generative model to guide the image generation of the diffusion model with varied class-specific prompts. Finally, we introduce a weighting mechanism to mitigate the impact of poorly generated samples. Experimental results across various datasets show that DIAGen not only enhances semantic diversity but also improves the performance of subsequent classifiers. The advantages of DIAGen over standard augmentations and the DA-Fusion baseline are particularly pronounced with out-of-distribution samples.

Related papers

Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model [80.61157097223058]
A prevalent strategy to bolster image classification performance is through augmenting the training set with synthetic images generated by T2I models. In this study, we scrutinize the shortcomings of both current generative and conventional data augmentation techniques. We introduce an innovative inter-class data augmentation method known as Diff-Mix, which enriches the dataset by performing image translations between classes.
arXiv Detail & Related papers (2024-03-28T17:23:45Z)
Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning. Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation. Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning [50.7702397913573]
The rapid advancement of photorealistic generators has reached a critical juncture where the discrepancy between authentic and manipulated images is increasingly indistinguishable. Although there have been a number of publicly available face forgery datasets, the forgery faces are mostly generated using GAN-based synthesis technology. We propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection.
arXiv Detail & Related papers (2024-02-03T03:13:50Z)
Class-Prototype Conditional Diffusion Model with Gradient Projection for Continual Learning [20.175586324567025]
Mitigating catastrophic forgetting is a key hurdle in continual learning. A major issue is the deterioration in the quality of generated data compared to the original. We propose a GR-based approach for continual learning that enhances image quality in generators.
arXiv Detail & Related papers (2023-12-10T17:39:42Z)
Improving Out-of-Distribution Robustness of Classifiers via Generative Interpolation [56.620403243640396]
Deep neural networks achieve superior performance for learning from independent and identically distributed (i.i.d.) data. However, their performance deteriorates significantly when handling out-of-distribution (OoD) data. We develop a simple yet effective method called Generative Interpolation to fuse generative models trained from multiple domains for synthesizing diverse OoD samples.
arXiv Detail & Related papers (2023-07-23T03:53:53Z)
DomainStudio: Fine-Tuning Diffusion Models for Domain-Driven Image Generation using Limited Data [20.998032566820907]
This paper proposes a novel DomainStudio approach to adapt DDPMs pre-trained on large-scale source datasets to target domains using limited data. It is designed to keep the diversity of subjects provided by source domains and get high-quality and diverse adapted samples in target domains.
arXiv Detail & Related papers (2023-06-25T07:40:39Z)
Analyzing Bias in Diffusion-based Face Generation Models [75.80072686374564]
Diffusion models are increasingly popular in synthetic data generation and image editing applications. We investigate the presence of bias in diffusion-based face generation models with respect to attributes such as gender, race, and age. We examine how dataset size affects the attribute composition and perceptual quality of both diffusion and Generative Adversarial Network (GAN) based face generation models.
arXiv Detail & Related papers (2023-05-10T18:22:31Z)
Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models. Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples. We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z)
GENIE: Large Scale Pre-training for Text Generation with Diffusion Model [86.2022500090247]
GENIE is a sequence-to-sequence text generation model which combines Transformer and diffusion. We propose a novel pre-training method named continuous paragraph denoise based on the characteristics of the diffusion model.
arXiv Detail & Related papers (2022-12-22T13:17:11Z)
A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models. They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space. This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.