Regularising for invariance to data augmentation improves supervised learning
- URL: http://arxiv.org/abs/2203.03304v1
- Date: Mon, 7 Mar 2022 11:25:45 GMT
- Title: Regularising for invariance to data augmentation improves supervised learning
- Authors: Aleksander Botev, Matthias Bauer, Soham De
- Abstract summary: We show that using multiple augmentations per input can improve generalisation.
We propose an explicit regulariser that encourages this invariance on the level of individual model predictions.
- Score: 82.85692486314949
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation is used in machine learning to make the classifier
invariant to label-preserving transformations. Usually this invariance is only
encouraged implicitly by including a single augmented input during training.
However, several works have recently shown that using multiple augmentations
per input can improve generalisation or can be used to incorporate invariances
more explicitly. In this work, we first empirically compare these recently
proposed objectives that differ in whether they rely on explicit or implicit
regularisation and at what level of the predictor they encode the invariances.
We show that the predictions of the best performing method are also the most
similar when compared on different augmentations of the same input. Inspired by
this observation, we propose an explicit regulariser that encourages this
invariance on the level of individual model predictions. Through extensive
experiments on CIFAR-100 and ImageNet we show that this explicit regulariser
(i) improves generalisation and (ii) equalises performance differences between
all considered objectives. Our results suggest that objectives that encourage
invariance on the level of the neural network itself generalise better than
those that achieve invariance by averaging predictions of non-invariant models.
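The proposed regulariser acts directly on model predictions across augmentations of the same input. Below is a minimal sketch of such a prediction-level invariance penalty, assuming a symmetric KL divergence between two augmented views; the function names and the weighting `lam` are illustrative, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def invariance_penalty(logits_a: torch.Tensor, logits_b: torch.Tensor) -> torch.Tensor:
    """Symmetric KL between predictions on two augmentations of the same inputs."""
    log_p = F.log_softmax(logits_a, dim=-1)
    log_q = F.log_softmax(logits_b, dim=-1)
    # F.kl_div expects log-probabilities as `input` and probabilities as `target`.
    kl_pq = F.kl_div(log_q, log_p.exp(), reduction="batchmean")  # KL(p || q)
    kl_qp = F.kl_div(log_p, log_q.exp(), reduction="batchmean")  # KL(q || p)
    return 0.5 * (kl_pq + kl_qp)

def training_loss(model, x_aug_a, x_aug_b, labels, lam=1.0):
    logits_a, logits_b = model(x_aug_a), model(x_aug_b)
    # Supervised term on both augmented views ...
    ce = 0.5 * (F.cross_entropy(logits_a, labels) + F.cross_entropy(logits_b, labels))
    # ... plus the explicit prediction-level invariance penalty.
    return ce + lam * invariance_penalty(logits_a, logits_b)
```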
Related papers
- The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective [14.229855423083922]
Data augmentation (DA) is a powerful workhorse for bolstering performance in modern machine learning.
In this work, we develop a new theoretical framework to characterize the impact of a general class of DA on generalization.
Our framework highlights the nuanced and sometimes surprising impacts of DA on generalization, and serves as a testbed for novel augmentation design.
arXiv Detail & Related papers (2022-10-10T21:30:46Z)
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a given model's accuracy and invariance are linearly correlated across different test sets.
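The summary above defines invariance as consistency of predictions under transformations. A minimal sketch of one such score follows: the fraction of transformed inputs whose predicted class matches the prediction on the clean input. The paper's exact invariance measure may differ.

```python
import torch

@torch.no_grad()
def prediction_invariance(model, x, transforms) -> float:
    """Fraction of transformed inputs whose argmax prediction matches the clean one.

    `transforms` is a list of callables mapping a batch to a transformed batch.
    This is one simple instantiation of "consistency of model predictions on
    transformations of the data"; the paper's exact measure may differ.
    """
    model.eval()
    clean_pred = model(x).argmax(dim=-1)
    matches = [(model(t(x)).argmax(dim=-1) == clean_pred).float().mean()
               for t in transforms]
    return torch.stack(matches).mean().item()
```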
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
- Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework built on a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
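For context, here is a minimal sketch of the consistency-regularization baseline this line of work builds on: a FixMatch-style pseudo-label loss on unlabeled data. FeatDistLoss additionally acts on the feature level, which is omitted here, and the confidence threshold is an assumption.

```python
import torch
import torch.nn.functional as F

def consistency_loss(model, x_weak, x_strong, threshold=0.95):
    """Pseudo-label consistency between weak and strong augmentations.

    The weakly augmented view provides a pseudo-label; the strongly augmented
    view is trained to match it when the model is confident. FeatDistLoss
    additionally regularizes the feature level; that term is not shown here.
    """
    with torch.no_grad():
        probs = F.softmax(model(x_weak), dim=-1)
        conf, pseudo = probs.max(dim=-1)
        mask = (conf >= threshold).float()  # only confident pseudo-labels count
    ce = F.cross_entropy(model(x_strong), pseudo, reduction="none")
    return (mask * ce).mean()
```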
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
The Variational Autoencoder (VAE) approximates the posterior over latent variables via amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
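The summary does not detail DU-VAE's mechanism, so the sketch below shows only the standard negative-ELBO objective of the baseline VAE it modifies, for reference; the diversity- and uncertainty-aware changes are not shown.

```python
import torch
import torch.nn.functional as F

def vae_loss(x, recon, mu, logvar, beta=1.0):
    """Negative ELBO for a Gaussian-posterior VAE (the baseline DU-VAE builds on).

    Amortized inference: an encoder predicts (mu, logvar) per input.
    DU-VAE's modifications to the latent space are omitted here.
    """
    recon_nll = F.mse_loss(recon, x, reduction="sum") / x.size(0)
    # KL(q(z|x) || N(0, I)) in closed form for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
    return recon_nll + beta * kl
```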
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- More Is More -- Narrowing the Generalization Gap by Adding Classification Heads [8.883733362171032]
We introduce an architecture enhancement for existing neural network models based on input transformations, termed 'TransNet'.
The enhancement can be employed during training only and then pruned for prediction, resulting in an architecture equivalent to the base model.
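A minimal sketch of the multi-head idea as described: a shared backbone with one classification head per input transformation, where the extra heads can be dropped at prediction time. The class and choice of transformations below are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TransNetSketch(nn.Module):
    """Shared backbone with one classification head per input transformation.

    All heads are trained jointly on transformed copies of each input (e.g.
    rotations); at prediction time the extra heads can be pruned, leaving an
    architecture equivalent to the base model. Names here are illustrative.
    """

    def __init__(self, backbone, feat_dim, num_classes, num_transforms):
        super().__init__()
        self.backbone = backbone
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, num_classes) for _ in range(num_transforms)]
        )

    def forward(self, transformed_batches):
        # One transformed copy of the input batch per head.
        return [head(self.backbone(x))
                for head, x in zip(self.heads, transformed_batches)]
```

Training would sum a cross-entropy term over the heads; inference keeps only the head for the identity transformation.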
arXiv Detail & Related papers (2021-02-09T16:30:33Z)
- Squared $\ell_2$ Norm as Consistency Loss for Leveraging Augmented Data to Learn Robust and Invariant Representations [76.85274970052762]
Regularizing distance between embeddings/representations of original samples and augmented counterparts is a popular technique for improving robustness of neural networks.
In this paper, we explore these various regularization choices, seeking to provide a general understanding of how we should regularize the embeddings.
We show that the generic approach we identified (squared $\ell_2$ norm regularized augmentation) outperforms several recent methods, which are each specially designed for one task.
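A minimal sketch of the squared $\ell_2$ consistency loss described above; the `model.features` / `model.classify` split and the weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def l2_consistency_loss(model, x, x_aug, labels, lam=1.0):
    """Task loss plus squared L2 distance between clean and augmented embeddings.

    Assumes `model.features` returns the embedding and `model.classify` maps
    embeddings to logits; both names are illustrative.
    """
    z, z_aug = model.features(x), model.features(x_aug)
    ce = F.cross_entropy(model.classify(z), labels)
    reg = (z - z_aug).pow(2).sum(dim=-1).mean()  # squared l2 per sample
    return ce + lam * reg
```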
arXiv Detail & Related papers (2020-11-25T22:40:09Z)
- What causes the test error? Going beyond bias-variance via ANOVA [21.359033212191218]
Modern machine learning methods are often overparametrized, allowing adaptation to the data at a fine level.
Recent work aimed to understand in greater depth why overparametrization is helpful for generalization.
We propose using the analysis of variance (ANOVA) to decompose the variance in the test error in a symmetric way.
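A minimal sketch of a symmetric two-factor ANOVA-style decomposition of test error, assuming the two factors are a data-sampling seed and an initialization seed; the paper's exact factors and estimator may differ.

```python
import numpy as np

def anova_decompose(errors: np.ndarray) -> dict:
    """Two-factor ANOVA-style variance decomposition of test error.

    `errors[i, j]` is the test error with data-sampling seed i and
    initialization seed j (both factors are illustrative assumptions).
    The decomposition is symmetric in the two factors.
    """
    grand = errors.mean()
    data_effect = errors.mean(axis=1) - grand   # main effect of data seed
    init_effect = errors.mean(axis=0) - grand   # main effect of init seed
    interaction = (errors - grand
                   - data_effect[:, None] - init_effect[None, :])
    return {
        "data": np.var(data_effect),
        "init": np.var(init_effect),
        "interaction": np.var(interaction),
    }
```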
arXiv Detail & Related papers (2020-10-11T05:21:13Z)
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes differences between the predictive distributions of similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that this simple yet powerful method significantly improves generalization.
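A minimal sketch of matching predictive distributions between two samples of the same class, in the spirit of the method described; the temperature, detached teacher, and pairing scheme are assumptions rather than the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def class_wise_self_kd(logits, logits_same_class, T=4.0):
    """Match predictive distributions of two different samples sharing a label.

    `logits_same_class` comes from another sample of the same class; its
    softened distribution is treated as a detached teacher. The temperature
    T and the pairing scheme are illustrative assumptions.
    """
    with torch.no_grad():
        teacher = F.softmax(logits_same_class / T, dim=-1)
    student_log = F.log_softmax(logits / T, dim=-1)
    # Standard T^2 scaling keeps gradient magnitudes comparable across T.
    return F.kl_div(student_log, teacher, reduction="batchmean") * (T * T)
```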
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of the information and accepts no responsibility for any consequences of its use.