ReSmooth: Detecting and Utilizing OOD Samples when Training with Data
Augmentation
- URL: http://arxiv.org/abs/2205.12606v1
- Date: Wed, 25 May 2022 09:29:27 GMT
- Title: ReSmooth: Detecting and Utilizing OOD Samples when Training with Data
Augmentation
- Authors: Chenyang Wang, Junjun Jiang, Xiong Zhou, Xianming Liu
- Abstract summary: Recent DA techniques typically require high diversity in the augmented training samples.
An augmentation strategy with high diversity usually introduces out-of-distribution (OOD) augmented samples.
We propose ReSmooth, a framework that first detects OOD samples among the augmented samples and then leverages them.
- Score: 57.38418881020046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation (DA) is a widely used technique for enhancing the training
of deep neural networks. Recent DA techniques that achieve state-of-the-art
performance typically require high diversity in the augmented training samples.
However, an augmentation strategy with high diversity usually introduces
out-of-distribution (OOD) augmented samples, and these samples consequently
impair performance. To alleviate this issue, we propose ReSmooth, a
framework that first detects OOD samples among the augmented samples and then
leverages them. To be specific, we first use a Gaussian mixture model to fit
the loss distribution of both the original and augmented samples and
accordingly split these samples into in-distribution (ID) samples and OOD
samples. Then we start a new round of training in which ID and OOD samples are incorporated
with different smooth labels. By treating ID samples and OOD samples unequally,
we can make better use of the diverse augmented data. Further, we incorporate
our ReSmooth framework with negative data augmentation strategies. By properly
handling the OOD samples that these strategies intentionally create, the classification
performance of negative data augmentation is largely improved. Experiments
on several classification benchmarks show that ReSmooth can be easily extended
to existing augmentation strategies (such as RandAugment, rotate, and jigsaw)
and improve on them.
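The two-stage procedure described in the abstract (fit a two-component Gaussian mixture to per-sample losses to split the data into ID and OOD samples, then retrain with different smooth labels for the two groups) can be sketched as follows. This is a minimal illustration assuming PyTorch and scikit-learn, not the authors' released code; helper names such as split_id_ood and resmooth_style_loss, and the specific smoothing strengths, are assumptions made for illustration.

```python
# Sketch of a ReSmooth-style pipeline (illustrative, not the paper's implementation).
import torch
import torch.nn.functional as F
from sklearn.mixture import GaussianMixture


@torch.no_grad()
def split_id_ood(model, loader, device="cpu"):
    """Collect per-sample cross-entropy losses over (original and augmented)
    samples and split them with a two-component Gaussian mixture model."""
    model.eval()
    losses = []
    for x, y in loader:  # loader yields images (possibly augmented) and labels
        logits = model(x.to(device))
        losses.append(F.cross_entropy(logits, y.to(device), reduction="none").cpu())
    losses = torch.cat(losses).unsqueeze(1).numpy()

    gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
    ood_component = gmm.means_.argmax()          # larger-mean-loss mode treated as OOD
    is_ood = gmm.predict(losses) == ood_component
    return torch.from_numpy(is_ood)              # boolean mask, one entry per sample


def smoothed_ce(logits, targets, eps):
    """Cross-entropy against a label-smoothed target distribution."""
    n_classes = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)
    true_dist = torch.full_like(log_probs, eps / (n_classes - 1))
    true_dist.scatter_(1, targets.unsqueeze(1), 1.0 - eps)
    return -(true_dist * log_probs).sum(dim=1).mean()


def resmooth_style_loss(logits, targets, is_ood, eps_id=0.0, eps_ood=0.3):
    """Treat ID and OOD samples unequally: little or no smoothing for ID
    samples, stronger smoothing for OOD samples (values are illustrative)."""
    loss = torch.zeros((), device=logits.device)
    if (~is_ood).any():
        loss = loss + smoothed_ce(logits[~is_ood], targets[~is_ood], eps_id)
    if is_ood.any():
        loss = loss + smoothed_ce(logits[is_ood], targets[is_ood], eps_ood)
    return loss
```

One plausible refinement (an assumption, not stated in the abstract) is to weight the smoothing strength by the GMM posterior probability of each sample rather than using a hard ID/OOD split.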
Related papers
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with
Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- Towards Robust Visual Question Answering: Making the Most of Biased
Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z)
- When Chosen Wisely, More Data Is What You Need: A Universal
Sample-Efficient Strategy For Data Augmentation [19.569164094496955]
We present a universal Data Augmentation (DA) technique, called Glitter, to overcome both issues.
Glitter adaptively selects a subset of worst-case samples with maximal loss, analogous to adversarial DA.
Our experiments on the GLUE benchmark, SQuAD, and HellaSwag in three widely used training setups reveal that Glitter is substantially faster to train and achieves competitive performance.
arXiv Detail & Related papers (2022-03-17T15:33:52Z)
- Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated
Label Mixing [104.630875328668]
The Mixup scheme suggests mixing a pair of samples to create an augmented training sample.
We present a novel, yet simple Mixup-variant that captures the best of both worlds.
arXiv Detail & Related papers (2021-12-16T11:27:48Z)
- SelectAugment: Hierarchical Deterministic Sample Selection for Data
Augmentation [72.58308581812149]
We propose an effective approach, dubbed SelectAugment, to select samples to be augmented in a deterministic and online manner.
Specifically, in each batch, we first determine the augmentation ratio, and then decide whether to augment each training sample under this ratio.
In this way, the negative effects of the randomness in selecting samples to augment can be effectively alleviated and the effectiveness of DA is improved.
arXiv Detail & Related papers (2021-12-06T08:38:38Z)
- On The Consistency Training for Open-Set Semi-Supervised Learning [44.046578996049654]
We study how OOD samples affect training in both low- and high-dimensional spaces.
Our method makes better use of OOD samples and achieves state-of-the-art results.
arXiv Detail & Related papers (2021-01-19T12:38:17Z)
- Bridging In- and Out-of-distribution Samples for Their Better
Discriminability [18.84265231678354]
We consider samples lying in between the two (ID and OOD) and use them to train a network.
We generate such samples using multiple image transformations that corrupt inputs in various ways and with different severity levels.
We estimate where the samples generated by a single image transformation lie between ID and OOD, using a network trained on clean ID samples.
arXiv Detail & Related papers (2021-01-07T11:34:18Z)
- One for More: Selecting Generalizable Samples for Generalizable ReID
Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.