Reweighting Augmented Samples by Minimizing the Maximal Expected Loss
- URL: http://arxiv.org/abs/2103.08933v1
- Date: Tue, 16 Mar 2021 09:31:04 GMT
- Title: Reweighting Augmented Samples by Minimizing the Maximal Expected Loss
- Authors: Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
- Abstract summary: We construct the maximal expected loss, which is the supremum over all reweighted losses on the augmented samples.
Inspired by adversarial training, we minimize this maximal expected loss and obtain a simple and interpretable closed-form solution.
The proposed method can generally be applied on top of any data augmentation method.
- Score: 51.2791895511333
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation is an effective technique for improving the
generalization of deep neural networks. However, previous data augmentation
methods usually treat the augmented samples equally, without considering their
individual impacts on the model. To address this, we propose to assign
different weights to the augmented samples generated from the same training
example. We construct the maximal expected loss, which is the supremum over
all reweighted losses on the augmented samples. Inspired by adversarial
training, we minimize this maximal expected loss (MMEL) and obtain a simple
and interpretable closed-form solution: more attention should be paid to
augmented samples with large loss values (i.e., harder examples). Minimizing
this maximal expected loss enables the model to perform well under any
reweighting strategy. The proposed method can generally be applied on top of
any data augmentation method. Experiments are conducted both on natural
language understanding tasks with token-level data augmentation, and on image
classification tasks with commonly used image augmentation techniques such as
random crop and horizontal flip. Empirical results show that the proposed
method improves the generalization performance of the model.
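As a concrete illustration of the closed-form reweighting described above, the sketch below implements one plausible reading of it: with an entropy-style regularizer, the supremum over weight vectors on the simplex reduces to softmax weights over the per-augmentation losses, so harder augmented samples receive larger weights. This is a minimal PyTorch sketch under those assumptions, not the paper's reference implementation; the function name `mmel_loss`, the temperature hyperparameter, and the exact softmax form are illustrative choices, not taken from the paper text here.

```python
import torch
import torch.nn.functional as F

def mmel_loss(per_sample_losses: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Reweight the losses of K augmented copies of one training example.

    per_sample_losses: shape (K,), one loss per augmented sample.
    Harder (higher-loss) augmentations receive larger softmax weights,
    matching the abstract's "more attention to large loss values".
    """
    # Detach before computing the weights so gradients flow through the
    # weighted sum of losses, not through the weighting distribution itself.
    weights = F.softmax(per_sample_losses.detach() / temperature, dim=0)
    return torch.sum(weights * per_sample_losses)

# Example: K = 3 augmented copies of one training example.
losses = torch.tensor([0.2, 1.5, 0.7], requires_grad=True)
loss = mmel_loss(losses, temperature=0.5)
loss.backward()  # the hardest augmentation contributes the largest gradient
```

Note the limiting behavior of this form: as the temperature grows, the weights approach uniform averaging (the standard equal treatment of augmented samples), while as it shrinks, the objective approaches taking only the maximum per-augmentation loss.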
Related papers
- Learning Augmentation Policies from A Model Zoo for Time Series Forecasting [58.66211334969299]
We introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning.
By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance.
arXiv Detail & Related papers (2024-09-10T07:34:19Z)
- Data Pruning via Moving-one-Sample-out [61.45441981346064]
We propose a novel data-pruning approach called moving-one-sample-out (MoSo).
MoSo aims to identify and remove the least informative samples from the training set.
Experimental results demonstrate that MoSo effectively mitigates severe performance degradation at high pruning ratios.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-08-21T13:38:55Z)
- DiscrimLoss: A Universal Loss for Hard Samples and Incorrect Samples Discrimination [28.599571524763785]
Given data with label noise (i.e., incorrectly labeled data), deep neural networks gradually memorize the noisy labels, which impairs model performance.
To alleviate this issue, curriculum learning has been proposed to improve model performance and generalization by ordering training samples in a meaningful sequence.
arXiv Detail & Related papers (2022-05-25T09:29:27Z)
- ReSmooth: Detecting and Utilizing OOD Samples when Training with Data Augmentation [57.38418881020046]
Recent data augmentation (DA) techniques seek diversity in the augmented training samples.
A highly diverse augmentation strategy, however, usually introduces out-of-distribution (OOD) augmented samples.
We propose ReSmooth, a framework that first detects OOD samples among the augmented samples and then leverages them.
arXiv Detail & Related papers (2021-07-31T08:51:46Z)
- Missingness Augmentation: A General Approach for Improving Generative Imputation Models [20.245637164975594]
We propose a novel data augmentation method called Missingness Augmentation (MisA) for generative imputation models.
As a general augmentation technique, MisA can be easily integrated into generative imputation frameworks.
Experimental results demonstrate that MisA significantly improves the performance of many recently proposed generative imputation models.
arXiv Detail & Related papers (2021-05-28T06:32:32Z)
- Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax [7.680863481076596]
MiniMax-kNN is a sample-efficient data augmentation strategy.
We exploit a semi-supervised approach based on knowledge distillation to train a model on augmented data.
arXiv Detail & Related papers (2020-11-02T17:52:26Z)
- SapAugment: Learning A Sample Adaptive Policy for Data Augmentation [21.044266725115577]
We propose a novel method to learn a Sample-Adaptive Policy for Augmentation -- SapAugment.
We show substantial improvement, up to a 21% relative reduction in word error rate on the LibriSpeech dataset, over the state-of-the-art speech augmentation method.
arXiv Detail & Related papers (2020-10-29T09:13:18Z)
- Self-paced Data Augmentation for Training Neural Networks [11.554821454921536]
We propose self-paced augmentation (SPA) to automatically select suitable samples for data augmentation when training a neural network.
The proposed method mitigates the deterioration of generalization performance caused by ineffective data augmentation.
Experimental results demonstrate that the proposed SPA can improve the generalization performance, particularly when the number of training samples is small.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We propose a unified framework and show that it covers a host of variations of this scheme.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.