Learning Invariances in Neural Networks
- URL: http://arxiv.org/abs/2010.11882v2
- Date: Tue, 1 Dec 2020 17:38:11 GMT
- Title: Learning Invariances in Neural Networks
- Authors: Gregory Benton, Marc Finzi, Pavel Izmailov, Andrew Gordon Wilson
- Abstract summary: We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
- Score: 51.20867785006147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Invariances to translations have imbued convolutional neural networks with
powerful generalization properties. However, we often do not know a priori what
invariances are present in the data, or to what extent a model should be
invariant to a given symmetry group. We show how to \emph{learn} invariances
and equivariances by parameterizing a distribution over augmentations and
optimizing the training loss simultaneously with respect to the network
parameters and augmentation parameters. With this simple procedure we can
recover the correct set and extent of invariances on image classification,
regression, segmentation, and molecular property prediction from a large space
of augmentations, on training data alone.
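The core idea of the abstract can be illustrated in a minimal numpy sketch. Here the augmentation distribution is Uniform(-theta, theta) over 2-D rotation angles, sampled via the reparameterization angle = theta * u with u ~ Uniform(-1, 1), so that theta can be optimized jointly with the model weights; a small regularizer rewards broader (more invariant) distributions. All names, the toy task, and the finite-difference gradients are illustrative assumptions, not the paper's implementation (the paper uses backpropagation through the reparameterized samples).

```python
import numpy as np

rng = np.random.default_rng(0)

def rotate(x, angle):
    """Rotate a batch of 2-D points by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return x @ np.array([[c, s], [-s, c]])  # equals x @ R.T for R = [[c,-s],[s,c]]

# Toy data whose labels depend only on the radius, hence are rotation-invariant.
X = rng.normal(size=(256, 2))
y = (X ** 2).sum(axis=1)

def model(x, w):
    # A tiny model on quadratic features; `w` plays the role of network weights.
    feats = np.stack([x[:, 0] ** 2, x[:, 1] ** 2, x[:, 0] * x[:, 1]], axis=1)
    return feats @ w

def objective(w, theta, us, reg=0.05):
    # Average the training loss over rotations sampled via angle = theta * u,
    # then subtract a regularizer that rewards wider augmentation distributions.
    mse = np.mean([np.mean((model(rotate(X, theta * u), w) - y) ** 2) for u in us])
    return mse - reg * theta

w, theta, lr, eps = 0.1 * rng.normal(size=3), 0.1, 0.05, 1e-4
for _ in range(200):
    us = rng.uniform(-1.0, 1.0, size=8)  # shared samples keep gradients consistent
    # Finite-difference gradients keep this sketch dependency-free.
    gw = np.array([(objective(w + eps * e, theta, us)
                    - objective(w - eps * e, theta, us)) / (2 * eps)
                   for e in np.eye(3)])
    gt = (objective(w, theta + eps, us) - objective(w, theta - eps, us)) / (2 * eps)
    w, theta = w - lr * gw, theta - lr * gt
```

Because the labels are rotation-invariant, the fit stays good while the learned width theta grows from its initial 0.1, i.e. the procedure discovers that a broad rotation invariance is consistent with the training data.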
Related papers
- Symmetry Discovery for Different Data Types [52.2614860099811]
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance.
We propose LieSD, a method for discovering symmetries via trained neural networks that approximate the input-output mappings of the tasks.
We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z) - Invariance Measures for Neural Networks [1.2845309023495566]
We propose measures to quantify the invariance of neural networks in terms of their internal representation.
The measures are efficient and interpretable, and can be applied to any neural network model.
arXiv Detail & Related papers (2023-10-26T13:59:39Z) - Learning Layer-wise Equivariances Automatically using Gradients [66.81218780702125]
Convolutions encode equivariance symmetries into neural networks leading to better generalisation performance.
However, such symmetries provide fixed hard constraints on the functions a network can represent: they need to be specified in advance and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients.
arXiv Detail & Related papers (2023-10-09T20:22:43Z) - On genuine invariance learning without weight-tying [6.308539010172309]
We analyze invariance learning in neural networks without weight-tying constraints.
We show that learned invariance is strongly conditioned on the input data, rendering it unreliable if the input distribution shifts.
arXiv Detail & Related papers (2023-08-07T20:41:19Z) - What Affects Learned Equivariance in Deep Image Recognition Models? [10.590129221143222]
We find evidence for a correlation between learned translation equivariance and validation accuracy on ImageNet.
Data augmentation, reduced model capacity and inductive bias in the form of convolutions induce higher learned equivariance in neural networks.
arXiv Detail & Related papers (2023-04-05T17:54:25Z) - The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
Surprisingly, transformers can be more equivariant than convolutional neural networks after training.
arXiv Detail & Related papers (2022-10-06T15:20:55Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - Learning Invariant Weights in Neural Networks [16.127299898156203]
Many commonly used models in machine learning are constrained to respect certain symmetries in the data.
We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks.
arXiv Detail & Related papers (2022-02-25T00:17:09Z) - On the Benefits of Invariance in Neural Networks [56.362579457990094]
We show that training with data augmentation leads to better estimates of the risk and of its gradients, and we provide a PAC-Bayes generalization bound for models trained with data augmentation.
We also show that compared to data augmentation, feature averaging reduces generalization error when used with convex losses, and tightens PAC-Bayes bounds.
arXiv Detail & Related papers (2020-05-01T02:08:58Z)
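The feature-averaging idea from the last entry above can be sketched in a few lines: averaging a predictor over a finite symmetry group yields a predictor that is exactly invariant to that group, and with a convex loss its risk is no worse than the average risk of the individual predictions (Jensen's inequality). The base predictor and the choice of the cyclic rotation group here are illustrative assumptions.

```python
import numpy as np

def rotate(x, angle):
    """Rotate a batch of 2-D points by `angle` radians."""
    c, s = np.cos(angle), np.sin(angle)
    return x @ np.array([[c, s], [-s, c]])

def averaged_predict(predict, x, k=4):
    # Average the predictor over the cyclic group of k rotations; because the
    # group elements are merely permuted when the input is pre-rotated by a
    # group element, the averaged predictor is exactly invariant to them.
    angles = 2 * np.pi * np.arange(k) / k
    return np.mean([predict(rotate(x, a)) for a in angles], axis=0)

# A deliberately non-invariant base predictor.
base = lambda x: x[:, 0] ** 2 + 0.5 * x[:, 0] * x[:, 1]

x = np.array([[1.0, 0.0], [0.0, 2.0]])
inv = averaged_predict(base, x)
```

Rotating the input by any multiple of 90 degrees leaves `inv` unchanged, even though `base` itself changes.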
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.