Instance-Conditioned GAN Data Augmentation for Representation Learning
- URL: http://arxiv.org/abs/2303.09677v1
- Date: Thu, 16 Mar 2023 22:45:43 GMT
- Title: Instance-Conditioned GAN Data Augmentation for Representation Learning
- Authors: Pietro Astolfi, Arantxa Casanova, Jakob Verbeek, Pascal Vincent,
Adriana Romero-Soriano, Michal Drozdzal
- Abstract summary: We introduce DA_IC-GAN, a learnable data augmentation module that can be used off-the-shelf in conjunction with most state-of-the-art training recipes.
We show that DA_IC-GAN can boost accuracy by 1 to 2 percentage points with the highest-capacity models.
We additionally couple DA_IC-GAN with a self-supervised training recipe and show that we can also achieve an improvement of 1 percentage point in accuracy in some settings.
- Score: 29.36473147430433
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation has become a crucial component to train state-of-the-art
visual representation models. However, handcrafting combinations of
transformations that lead to improved performance is a laborious task, which
can result in visually unrealistic samples. To overcome these limitations,
recent works have explored the use of generative models as learnable data
augmentation tools, showing promising results in narrow application domains,
e.g., few-shot learning and low-data medical imaging. In this paper, we
introduce a data augmentation module, called DA_IC-GAN, which leverages
instance-conditioned GAN generations and can be used off-the-shelf in
conjunction with most state-of-the-art training recipes. We showcase the
benefits of DA_IC-GAN by plugging it out-of-the-box into the supervised
training of ResNets and DeiT models on the ImageNet dataset, and achieving
accuracy boosts of 1 to 2 percentage points with the highest-capacity models.
Moreover, the learnt representations are shown to be more robust than the
baselines when transferred to a handful of out-of-distribution datasets, and
exhibit increased invariance to variations of instance and viewpoints. We
additionally couple DA_IC-GAN with a self-supervised training recipe and show
that we can also achieve an improvement of 1 percentage point in accuracy in
some settings.
With this work, we strengthen the evidence on the potential of learnable data
augmentations to improve visual representation learning, paving the road
towards non-handcrafted augmentations in model training.
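The abstract describes plugging GAN-generated samples into a standard training recipe as a drop-in augmentation. A minimal sketch of that pattern follows; the function names are illustrative, not the paper's API, and `toy_generator` merely stands in for a pretrained instance-conditioned GAN that would synthesize a new image conditioned on each training instance.

```python
import numpy as np

def gan_augment_batch(batch, generator, p=0.5, rng=None):
    """With probability p, replace each sample in the batch with a
    generator output conditioned on that sample (the instance-conditioned
    augmentation pattern). `generator` is any callable mapping a sample
    to a synthetic sample of the same shape."""
    rng = rng or np.random.default_rng(0)
    out = batch.copy()
    mask = rng.random(len(batch)) < p  # which samples get augmented
    for i in np.flatnonzero(mask):
        out[i] = generator(batch[i])
    return out, mask

# Toy stand-in "generator": perturbs the conditioning instance slightly.
def toy_generator(x, rng=np.random.default_rng(1)):
    return x + 0.1 * rng.standard_normal(x.shape)

batch = np.zeros((8, 3, 4, 4), dtype=np.float32)
augmented, mask = gan_augment_batch(batch, toy_generator, p=0.5)
```

In the paper's setting the augmented batch would then feed an otherwise unchanged supervised or self-supervised training loop, which is what makes the module usable off-the-shelf.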
Related papers
- ADLDA: A Method to Reduce the Harm of Data Distribution Shift in Data Augmentation [11.887799310374174]
This study introduces a novel data augmentation technique, ADLDA, aimed at mitigating the negative impact of data distribution shifts.
Experimental results demonstrate that ADLDA significantly enhances model performance across multiple datasets.
arXiv Detail & Related papers (2024-05-11T03:20:35Z)
- DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce perception-aware loss (P.A. loss) through segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing perception-aware attribute (P.A. Attr) during generation.
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
- Data-Centric Long-Tailed Image Recognition [49.90107582624604]
Long-tail models exhibit a strong demand for high-quality data.
Data-centric approaches aim to enhance both the quantity and quality of data to improve model performance.
There is currently a lack of research into the underlying mechanisms explaining the effectiveness of information augmentation.
arXiv Detail & Related papers (2023-11-03T06:34:37Z)
- DualAug: Exploiting Additional Heavy Augmentation with OOD Data Rejection [77.6648187359111]
We propose a novel data augmentation method, named DualAug, to keep the augmentation in distribution as much as possible at a reasonable time and computational cost.
Experiments on supervised image classification benchmarks show that DualAug improves various automated data augmentation methods.
arXiv Detail & Related papers (2023-10-12T08:55:10Z)
- Phased Data Augmentation for Training a Likelihood-Based Generative Model with Limited Data [0.0]
Generative models excel in creating realistic images, yet their dependency on extensive datasets for training presents significant challenges.
Current data-efficient methods largely focus on GAN architectures, leaving a gap in training other types of generative models.
Phased data augmentation is a novel technique that addresses this gap by optimizing training in limited data scenarios without altering the inherent data distribution.
arXiv Detail & Related papers (2023-05-22T03:38:59Z)
- Adversarial Training Helps Transfer Learning via Better Representations [17.497590668804055]
Transfer learning aims to leverage models pre-trained on source data to efficiently adapt to a target setting.
Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.
We show that adversarial training in the source data generates provably better representations, so fine-tuning on top of this representation leads to a more accurate predictor of the target data.
arXiv Detail & Related papers (2021-06-18T15:41:07Z)
- Mean Embeddings with Test-Time Data Augmentation for Ensembling of Representations [8.336315962271396]
We look at the ensembling of representations and propose mean embeddings with test-time augmentation (MeTTA)
MeTTA significantly boosts the quality of linear evaluation on ImageNet for both supervised and self-supervised models.
We believe that extending the success of ensembles to the inference of higher-quality representations is an important step that will open many new applications of ensembling.
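The MeTTA idea summarized above is simple to sketch: embed several augmented views of the same input and average the resulting embeddings. The sketch below assumes a generic `encoder` callable and toy augmentations; the names are illustrative and not the paper's API.

```python
import numpy as np

def mean_embedding(x, encoder, augmentations):
    """Test-time augmentation for representations: encode each augmented
    view of the input and return the mean embedding."""
    views = [aug(x) for aug in augmentations]
    embs = np.stack([encoder(v) for v in views])
    return embs.mean(axis=0)

# Toy stand-ins: a linear "encoder" and identity/flip augmentations.
W = np.arange(12, dtype=float).reshape(3, 4)
encoder = lambda v: W @ v
augs = [lambda v: v, lambda v: v[::-1]]

x = np.array([1.0, 2.0, 3.0, 4.0])
z = mean_embedding(x, encoder, augs)  # average of the two view embeddings
```

In practice the encoder would be a trained network and the averaged embedding would replace the single-view embedding in linear evaluation.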
arXiv Detail & Related papers (2021-06-15T10:49:46Z)
- Regularizing Generative Adversarial Networks under Limited Data [88.57330330305535]
This work proposes a regularization approach for training robust GAN models on limited data.
We show a connection between the regularized loss and an f-divergence called LeCam-divergence, which we find is more robust under limited training data.
arXiv Detail & Related papers (2021-04-07T17:59:06Z)
- Negative Data Augmentation [127.28042046152954]
We show that negative data augmentation samples provide information on the support of the data distribution.
We introduce a new GAN training objective where we use NDA as an additional source of synthetic data for the discriminator.
Empirically, models trained with our method achieve improved conditional/unconditional image generation along with improved anomaly detection capabilities.
arXiv Detail & Related papers (2021-02-09T20:28:35Z)
- Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUGC is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUGC consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUGC produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.