ReSmooth: Detecting and Utilizing OOD Samples when Training with Data
Augmentation
- URL: http://arxiv.org/abs/2205.12606v1
- Date: Wed, 25 May 2022 09:29:27 GMT
- Title: ReSmooth: Detecting and Utilizing OOD Samples when Training with Data
Augmentation
- Authors: Chenyang Wang, Junjun Jiang, Xiong Zhou, Xianming Liu
- Abstract summary: Recent DA techniques typically require high diversity in the augmented training samples.
An augmentation strategy with high diversity usually introduces out-of-distribution (OOD) augmented samples.
We propose ReSmooth, a framework that first detects OOD samples among the augmented samples and then leverages them.
- Score: 57.38418881020046
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation (DA) is a widely used technique for enhancing the training
of deep neural networks. Recent DA techniques that achieve state-of-the-art
performance typically require high diversity in the augmented training samples.
However, an augmentation strategy with high diversity usually introduces
out-of-distribution (OOD) augmented samples, and these samples consequently
impair performance. To alleviate this issue, we propose ReSmooth, a
framework that first detects OOD samples among the augmented samples and then
leverages them. To be specific, we first use a Gaussian mixture model to fit
the loss distribution of both the original and augmented samples and
accordingly split these samples into in-distribution (ID) samples and OOD
samples. Then we start a new round of training in which ID and OOD samples are incorporated
with different smooth labels. By treating ID samples and OOD samples unequally,
we can make better use of the diverse augmented data. Further, we incorporate
our ReSmooth framework with negative data augmentation strategies. By properly
handling the OOD samples that these strategies intentionally create, the classification
performance of negative data augmentation is largely improved. Experiments
on several classification benchmarks show that ReSmooth can be easily extended
to existing augmentation strategies (such as RandAugment, rotate, and jigsaw)
and improve on them.
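The two-stage procedure described in the abstract (fit a two-component Gaussian mixture to per-sample losses to split the data into ID and OOD samples, then retrain with different smooth labels for the two groups) can be sketched as follows. This is a minimal illustration assuming PyTorch and scikit-learn, not the authors' released code; helper names such as split_id_ood and resmooth_style_loss, and the specific smoothing strengths, are assumptions made for illustration.

```python
# Sketch of a ReSmooth-style pipeline (illustrative, not the paper's implementation).
import torch
import torch.nn.functional as F
from sklearn.mixture import GaussianMixture


@torch.no_grad()
def split_id_ood(model, loader, device="cpu"):
    """Collect per-sample cross-entropy losses over (original and augmented)
    samples and split them with a two-component Gaussian mixture model."""
    model.eval()
    losses = []
    for x, y in loader:  # loader yields images (possibly augmented) and labels
        logits = model(x.to(device))
        losses.append(F.cross_entropy(logits, y.to(device), reduction="none").cpu())
    losses = torch.cat(losses).unsqueeze(1).numpy()

    gmm = GaussianMixture(n_components=2, random_state=0).fit(losses)
    ood_component = gmm.means_.argmax()          # larger-mean-loss mode treated as OOD
    is_ood = gmm.predict(losses) == ood_component
    return torch.from_numpy(is_ood)              # boolean mask, one entry per sample


def smoothed_ce(logits, targets, eps):
    """Cross-entropy against a label-smoothed target distribution."""
    n_classes = logits.size(1)
    log_probs = F.log_softmax(logits, dim=1)
    true_dist = torch.full_like(log_probs, eps / (n_classes - 1))
    true_dist.scatter_(1, targets.unsqueeze(1), 1.0 - eps)
    return -(true_dist * log_probs).sum(dim=1).mean()


def resmooth_style_loss(logits, targets, is_ood, eps_id=0.0, eps_ood=0.3):
    """Treat ID and OOD samples unequally: little or no smoothing for ID
    samples, stronger smoothing for OOD samples (values are illustrative)."""
    loss = torch.zeros((), device=logits.device)
    if (~is_ood).any():
        loss = loss + smoothed_ce(logits[~is_ood], targets[~is_ood], eps_id)
    if is_ood.any():
        loss = loss + smoothed_ce(logits[is_ood], targets[is_ood], eps_ood)
    return loss
```

One plausible refinement (an assumption, not stated in the abstract) is to weight the smoothing strength by the GMM posterior probability of each sample rather than using a hard ID/OOD split.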
Related papers
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with
Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- Towards Robust Visual Question Answering: Making the Most of Biased
Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z)
- When Chosen Wisely, More Data Is What You Need: A Universal
Sample-Efficient Strategy For Data Augmentation [19.569164094496955]
We present a universal Data Augmentation (DA) technique, called Glitter, to overcome both issues.
Glitter adaptively selects a subset of worst-case samples with maximal loss, analogous to adversarial DA.
Our experiments on the GLUE benchmark, SQuAD, and HellaSwag in three widely used training setups reveal that Glitter is substantially faster to train and achieves competitive performance.
arXiv Detail & Related papers (2022-03-17T15:33:52Z)
- Saliency Grafting: Innocuous Attribution-Guided Mixup with Calibrated
Label Mixing [104.630875328668]
The Mixup scheme suggests mixing a pair of samples to create an augmented training sample.
We present a novel, yet simple Mixup-variant that captures the best of both worlds.
arXiv Detail & Related papers (2021-12-16T11:27:48Z)
- SelectAugment: Hierarchical Deterministic Sample Selection for Data
Augmentation [72.58308581812149]
We propose an effective approach, dubbed SelectAugment, to select samples to be augmented in a deterministic and online manner.
Specifically, in each batch, we first determine the augmentation ratio, and then decide whether to augment each training sample under this ratio.
In this way, the negative effects of the randomness in selecting samples to augment can be effectively alleviated and the effectiveness of DA is improved.
arXiv Detail & Related papers (2021-12-06T08:38:38Z)
- On The Consistency Training for Open-Set Semi-Supervised Learning [44.046578996049654]
We study how OOD samples affect training in both low- and high-dimensional spaces.
Our method makes better use of OOD samples and achieves state-of-the-art results.
arXiv Detail & Related papers (2021-01-19T12:38:17Z)
- Bridging In- and Out-of-distribution Samples for Their Better
Discriminability [18.84265231678354]
We consider samples lying in between the two (ID and OOD) and use them to train a network.
We generate such samples using multiple image transformations that corrupt inputs in various ways and with different severity levels.
We estimate where the samples generated by a single image transformation lie between ID and OOD, using a network trained on clean ID samples.
arXiv Detail & Related papers (2021-01-07T11:34:18Z)
- One for More: Selecting Generalizable Samples for Generalizable ReID
Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.