The Penalty Imposed by Ablated Data Augmentation
- URL: http://arxiv.org/abs/2006.04769v1
- Date: Mon, 8 Jun 2020 17:38:21 GMT
- Title: The Penalty Imposed by Ablated Data Augmentation
- Authors: Frederick Liu, Amir Najmi, Mukund Sundararajan
- Abstract summary: We study a formal model of mean ablated data augmentation and inverted dropout for linear regression.
We prove that ablated data augmentation is equivalent to optimizing the ordinary least squares objective along with a penalty.
- Score: 17.639472693362926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a set of data augmentation techniques that ablate parts of the input
at random. These include input dropout, cutout, and random erasing. We term
these techniques ablated data augmentation. Though these techniques seem
similar in spirit and have shown success in improving model performance in a
variety of domains, we do not yet have a mathematical understanding of the
differences between these techniques like we do for other regularization
techniques like L1 or L2. First, we study a formal model of mean ablated data
augmentation and inverted dropout for linear regression. We prove that ablated
data augmentation is equivalent to optimizing the ordinary least squares
objective along with a penalty that we call the Contribution Covariance Penalty,
and that inverted dropout, a more common implementation than dropout in popular
frameworks, is equivalent to optimizing the ordinary least squares objective
along with a Modified L2 penalty. For deep networks, we demonstrate an empirical
version of this result by replacing contributions with attributions and
coefficients with average gradients: the Contribution Covariance Penalty and the
Modified L2 Penalty decrease as the amount of the corresponding ablated data
augmentation increases, across a variety of networks.
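To make the two augmentation schemes in the abstract concrete, here is a minimal NumPy sketch (illustrative only; the toy data, the probability p, and all function names are assumptions, not code from the paper). It applies mean ablation, which replaces a feature value with its column mean at random, and inverted dropout, which zeroes a feature with probability p and rescales the kept features by 1/(1 - p), then fits ordinary least squares on the augmented data, which is the setting the equivalence results describe.
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (hypothetical example, not from the paper).
n, d = 1000, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def mean_ablate(X, p=0.3, rng=rng):
    """Mean ablated data augmentation: each feature of each example is
    independently replaced by its column mean with probability p."""
    mask = rng.random(X.shape) < p
    col_means = X.mean(axis=0, keepdims=True)
    return np.where(mask, col_means, X)

def inverted_dropout(X, p=0.3, rng=rng):
    """Inverted dropout on the inputs: zero a feature with probability p
    and rescale the kept features by 1 / (1 - p), as popular frameworks do."""
    keep = rng.random(X.shape) >= p
    return X * keep / (1.0 - p)

def ols(X, y):
    """Ordinary least squares fit (no intercept)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Fitting OLS on many augmented copies approximates the expectation over the
# augmentation distribution, i.e., the penalized objectives the paper analyzes.
X_abl = np.vstack([mean_ablate(X) for _ in range(20)])
X_drp = np.vstack([inverted_dropout(X) for _ in range(20)])
y_rep = np.tile(y, 20)

print("OLS:             ", ols(X, y))
print("Mean-ablated fit: ", ols(X_abl, y_rep))
print("Inverted dropout: ", ols(X_drp, y_rep))
```
Stacking 20 augmented copies is only a cheap Monte Carlo stand-in for the expected augmented objective; the paper's equivalences are stated for that expectation, not for any finite sample of augmentations.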
Related papers
- DualAug: Exploiting Additional Heavy Augmentation with OOD Data Rejection [77.6648187359111]
We propose a novel data augmentation method, named DualAug, to keep the augmentation in distribution as much as possible at a reasonable time and computational cost.
Experiments on supervised image classification benchmarks show that DualAug improves various automated data augmentation methods.
arXiv Detail & Related papers (2023-10-12T08:55:10Z) - Implicit Counterfactual Data Augmentation for Robust Learning [24.795542869249154]
This study proposes an Implicit Counterfactual Data Augmentation method to remove spurious correlations and make stable predictions.
Experiments have been conducted across various biased learning scenarios covering both image and text datasets.
arXiv Detail & Related papers (2023-04-26T10:36:40Z) - How Much Data Are Augmentations Worth? An Investigation into Scaling Laws, Invariance, and Implicit Regularization [76.58017437197859]
We find that in out-of-distribution testing scenarios, augmentations which yield samples that are diverse, but inconsistent with the data distribution can be even more valuable than additional training data.
We show that augmentations induce additional stochasticity during training, effectively flattening the loss landscape.
arXiv Detail & Related papers (2022-10-12T17:42:01Z) - The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift [127.21287240963859]
We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data.
For a large class of linear regression instances, transfer learning with $O(N^2)$ source data is as effective as supervised learning with $N$ target data.
arXiv Detail & Related papers (2022-08-03T05:59:49Z) - Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views [22.47152165975219]
A data augmentation module is utilized in contrastive learning to transform the given data example into two views.
This paper proposes a general method to alleviate these two problems by considering where and what to contrast in a general contrastive learning framework.
arXiv Detail & Related papers (2022-06-01T04:30:46Z) - Counterfactual Data Augmentation improves Factuality of Abstractive Summarization [6.745946263790011]
We show that augmenting the training data with our approach improves the factual correctness of summaries without significantly affecting the ROUGE score.
We show that in two commonly used summarization datasets (CNN/Dailymail and XSum), we improve the factual correctness by about 2.5 points on average.
arXiv Detail & Related papers (2022-05-25T00:00:35Z) - Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z) - Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework based on a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z) - Advanced Dropout: A Model-free Methodology for Bayesian Dropout Optimization [62.8384110757689]
Overfitting ubiquitously exists in real-world applications of deep neural networks (DNNs).
The advanced dropout technique applies a model-free and easily implemented distribution with a parametric prior, and adaptively adjusts the dropout rate.
We evaluate the effectiveness of the advanced dropout against nine dropout techniques on seven computer vision datasets.
arXiv Detail & Related papers (2020-10-11T13:19:58Z)