In What Ways Are Deep Neural Networks Invariant and How Should We Measure This?
- URL: http://arxiv.org/abs/2210.03773v1
- Date: Fri, 7 Oct 2022 18:43:21 GMT
- Title: In What Ways Are Deep Neural Networks Invariant and How Should We Measure This?
- Authors: Henry Kvinge, Tegan H. Emerson, Grayson Jorgenson, Scott Vasquez, Timothy Doster, Jesse D. Lew
- Abstract summary: We introduce a family of invariance and equivariance metrics that allows us to quantify these properties in a way that disentangles them from other metrics such as loss or accuracy.
We draw a range of conclusions about invariance and equivariance in deep learning models, ranging from whether initializing a model with pretrained weights has an effect on a trained model's invariance, to the extent to which invariance learned via training can generalize to out-of-distribution data.
- Score: 5.757836174655293
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is often said that a deep learning model is "invariant" to some specific
type of transformation. However, what is meant by this statement strongly
depends on the context in which it is made. In this paper we explore the nature
of invariance and equivariance of deep learning models with the goal of better
understanding the ways in which they actually capture these concepts on a
formal level. We introduce a family of invariance and equivariance metrics that
allows us to quantify these properties in a way that disentangles them from
other metrics such as loss or accuracy. We use our metrics to better understand
the two most popular methods used to build invariance into networks: data
augmentation and equivariant layers. We draw a range of conclusions about
invariance and equivariance in deep learning models, ranging from whether
initializing a model with pretrained weights has an effect on a trained model's
invariance, to the extent to which invariance learned via training can
generalize to out-of-distribution data.
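To make this concrete, a representation-level invariance score can be sketched as the similarity between a model's features for an input and for transformed copies of that input. The sketch below is illustrative only: `model`, `transform`, and `n_samples` are hypothetical placeholders, and the paper defines its own family of metrics.

```python
import torch

def invariance_score(model, inputs, transform, n_samples=8):
    """Illustrative sketch: mean cosine similarity between features of
    inputs and of randomly transformed copies. Values near 1 suggest the
    representation is approximately invariant to the transformation.
    This is not the metric family defined in the paper."""
    model.eval()
    with torch.no_grad():
        base = model(inputs)  # (batch, dim) features or logits
        sims = []
        for _ in range(n_samples):
            transformed = transform(inputs)  # e.g. a random rotation
            sims.append(torch.cosine_similarity(base, model(transformed), dim=-1))
    return torch.stack(sims).mean().item()
```

Because such a score is computed on features rather than on the loss, it can move independently of accuracy, which is the disentanglement the abstract refers to.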
Related papers
- On genuine invariance learning without weight-tying [6.308539010172309]
We analyze invariance learning in neural networks without weight-tying constraints.
We show that learned invariance is strongly conditioned on the input data, rendering it unreliable if the input distribution shifts.
arXiv Detail & Related papers (2023-08-07T20:41:19Z)
- What Affects Learned Equivariance in Deep Image Recognition Models? [10.590129221143222]
We find evidence for a correlation between learned translation equivariance and validation accuracy on ImageNet.
Data augmentation, reduced model capacity and inductive bias in the form of convolutions induce higher learned equivariance in neural networks.
arXiv Detail & Related papers (2023-04-05T17:54:25Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training (a crude finite-difference sketch of the idea follows this entry).
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
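A crude finite-difference analogue of this idea for translation invariance might look as follows. This is not the paper's construction, which works with smooth one-parameter flows; the integer pixel shift via `torch.roll` is a coarse discretization, and the names are illustrative.

```python
import torch

def translation_sensitivity(model, x, eps=1):
    """Finite-difference sketch: how fast the model output changes as the
    input images (N, C, H, W) are shifted horizontally. Near-zero values
    suggest approximate translation invariance. Illustrative only."""
    with torch.no_grad():
        shifted = torch.roll(x, shifts=eps, dims=-1)  # shift eps pixels right
        delta = (model(shifted) - model(x)) / float(eps)
    return delta.norm(dim=-1).mean().item()
```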
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a given model's accuracy and invariance are linearly correlated across different test sets (a toy prediction-consistency sketch follows this entry).
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
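Since the invariance here is defined over predictions rather than representations, a minimal proxy is the fraction of inputs whose predicted class survives the transformation. The sketch below uses hypothetical names and is not the paper's exact invariance measure.

```python
import torch

def prediction_consistency(model, inputs, transform):
    """Fraction of inputs whose argmax prediction is unchanged by the
    transformation -- one simple prediction-level invariance proxy.
    Illustrative sketch, not the paper's metric."""
    with torch.no_grad():
        original = model(inputs).argmax(dim=-1)
        transformed = model(transform(inputs)).argmax(dim=-1)
    return (original == transformed).float().mean().item()
```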
- Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
We also theoretically analyze the method for Gaussian data, bounding the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z)
- Regularising for invariance to data augmentation improves supervised learning [82.85692486314949]
We show that using multiple augmentations per input can improve generalisation.
We propose an explicit regulariser that encourages this invariance at the level of individual model predictions; a sketch of such a regulariser follows this entry.
arXiv Detail & Related papers (2022-03-07T11:25:45Z)
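One common way such a regulariser is written is a consistency term between augmented views of the same input. The sketch below assumes a symmetric-KL form; `augment`, `lam`, and the exact objective are illustrative, not the paper's derivation.

```python
import torch.nn.functional as F

def augmentation_consistency_loss(model, x, y, augment, lam=1.0):
    """Cross-entropy on one augmented view plus a penalty pulling the
    predictive distributions of two augmented views of the same input
    together. Sketch only; the paper derives its own regulariser."""
    logits_a = model(augment(x))
    logits_b = model(augment(x))
    ce = F.cross_entropy(logits_a, y)
    log_p_a = F.log_softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    # Symmetric KL between the two views' predictive distributions.
    consistency = 0.5 * (
        F.kl_div(log_p_a, log_p_b, log_target=True, reduction="batchmean")
        + F.kl_div(log_p_b, log_p_a, log_target=True, reduction="batchmean"))
    return ce + lam * consistency
```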
- Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests [87.60900567941428]
A 'spurious correlation' is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter.
In machine learning, these have a know-it-when-you-see-it character.
We study stress testing using the tools of causal inference.
arXiv Detail & Related papers (2021-05-31T14:39:38Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations; a minimal sketch of the idea follows this entry.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
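In that spirit, here is a minimal sketch with a single learnable rotation range as the augmentation distribution. The class name, the reparameterised sampling, and the restriction to rotations are illustrative simplifications, not the method itself.

```python
import torch
import torch.nn.functional as F

class LearnedRotationRange(torch.nn.Module):
    """Wraps a network with a learnable rotation-augmentation width.
    Angles are sampled as angle = width * u with u ~ U(-1, 1), so the
    gradient flows into `width`; predictions are averaged over samples.
    Sketch only -- the method covers a much broader augmentation space."""

    def __init__(self, net, n_samples=4):
        super().__init__()
        self.net = net
        self.width = torch.nn.Parameter(torch.tensor(0.1))  # radians
        self.n_samples = n_samples

    def rotate(self, x, angle):
        # Differentiable rotation of (N, C, H, W) via an affine grid.
        cos, sin = torch.cos(angle), torch.sin(angle)
        zero = torch.zeros_like(cos)
        theta = torch.stack([cos, -sin, zero, sin, cos, zero]).view(1, 2, 3)
        grid = F.affine_grid(theta.expand(x.size(0), 2, 3), x.shape,
                             align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

    def forward(self, x):
        logits = 0
        for _ in range(self.n_samples):
            u = 2 * torch.rand(()) - 1  # reparameterised uniform sample
            logits = logits + self.net(self.rotate(x, self.width * u))
        return logits / self.n_samples
```

Training would typically add a small penalty such as `-lam * width` to the loss so that the widest invariance consistent with the data is preferred.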
- What causes the test error? Going beyond bias-variance via ANOVA [21.359033212191218]
Modern machine learning methods are often overparametrized, allowing adaptation to the data at a fine level.
Recent work aimed to understand in greater depth why overparametrization is helpful for generalization.
We propose using analysis of variance (ANOVA) to decompose the variance in the test error in a symmetric way; a toy two-factor decomposition follows this entry.
arXiv Detail & Related papers (2020-10-11T05:21:13Z)
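As a toy version of such a decomposition, consider test errors measured over a grid of data-sampling seeds and initialisation seeds: with one measurement per cell, the total variance splits exactly into two main effects and a residual term. The array and factor names are hypothetical, and the paper's decomposition is symmetrised over more sources of randomness.

```python
import numpy as np

# errors[i, j]: test error with data-sampling seed i and init seed j.
errors = np.random.rand(10, 10)              # placeholder measurements
grand = errors.mean()
data_effect = errors.mean(axis=1) - grand    # main effect of data sampling
init_effect = errors.mean(axis=0) - grand    # main effect of initialisation
residual = errors - grand - data_effect[:, None] - init_effect[None, :]

var_data = (data_effect ** 2).mean()
var_init = (init_effect ** 2).mean()
var_res = (residual ** 2).mean()
# The decomposition is exact: total variance = sum of the three parts.
assert np.isclose(errors.var(), var_data + var_init + var_res)
```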