HyperInvariances: Amortizing Invariance Learning
- URL: http://arxiv.org/abs/2207.08304v1
- Date: Sun, 17 Jul 2022 21:40:37 GMT
- Title: HyperInvariances: Amortizing Invariance Learning
- Authors: Ruchika Chavhan, Henry Gouk, Jan Stühmer, Timothy Hospedales
- Abstract summary: Invariance learning is expensive and data-intensive for popular neural architectures.
We introduce the notion of amortizing invariance learning.
This framework can identify appropriate invariances in different downstream tasks and lead to comparable or better test performance.
- Score: 10.189246340672245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Providing invariances in a given learning task conveys a key inductive bias
that can lead to sample-efficient learning and good generalisation, if
correctly specified. However, the ideal invariances for many problems of
interest are often not known, which has led both to a body of engineering lore
and to attempts to provide frameworks for invariance learning. Unfortunately,
invariance learning is expensive and data-intensive for popular neural
architectures. We introduce the notion of amortizing invariance learning. In an
up-front learning phase, we learn a low-dimensional manifold of feature
extractors spanning invariance to different transformations using a
hyper-network. Then, for any problem of interest, both model and invariance
learning are rapid and efficient: one fits only a low-dimensional invariance
descriptor and an output head. Empirically, this framework can identify appropriate
invariances in different downstream tasks and lead to comparable or better test
performance than conventional approaches. Our HyperInvariance framework is also
theoretically appealing, as it enables generalisation bounds that provide an
interesting new operating point in the trade-off between model fit and
complexity.
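To make the amortization recipe concrete, below is a minimal PyTorch sketch under simplifying assumptions: the hypernetwork emits the weights of a single linear feature extractor rather than a deep backbone, and all class names, dimensions, and the training loop are illustrative, not the authors' released code.

```python
import torch
import torch.nn as nn

class HyperFeatureExtractor(nn.Module):
    """Hypernetwork mapping a k-dim invariance descriptor to the weights
    of a (toy, linear) feature extractor."""
    def __init__(self, k_invariances: int, in_dim: int, feat_dim: int):
        super().__init__()
        self.in_dim, self.feat_dim = in_dim, feat_dim
        # Emits a (feat_dim x in_dim) weight matrix plus a bias vector.
        self.hyper = nn.Sequential(
            nn.Linear(k_invariances, 64), nn.ReLU(),
            nn.Linear(64, feat_dim * in_dim + feat_dim),
        )

    def forward(self, x: torch.Tensor, descriptor: torch.Tensor) -> torch.Tensor:
        params = self.hyper(descriptor)
        W = params[: self.feat_dim * self.in_dim].view(self.feat_dim, self.in_dim)
        b = params[self.feat_dim * self.in_dim:]
        return torch.relu(x @ W.t() + b)

# Up-front phase (not shown): train `backbone` across transformations/tasks.
backbone = HyperFeatureExtractor(k_invariances=3, in_dim=32, feat_dim=16)

# Downstream phase: freeze the hypernetwork; fit only the low-dimensional
# invariance descriptor and a task-specific output head -- cheap and fast.
for p in backbone.parameters():
    p.requires_grad_(False)
descriptor = nn.Parameter(torch.zeros(3))
head = nn.Linear(16, 10)
opt = torch.optim.Adam([descriptor, *head.parameters()], lr=1e-2)

x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(head(backbone(x, descriptor)), y)
loss.backward()
opt.step()
```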
Related papers
- Relaxed Equivariance via Multitask Learning [7.905957228045955]
We introduce REMUL, a training procedure for approximating equivariance with multitask learning.
We show that unconstrained models can learn approximate symmetries by minimizing an additional simple equivariance loss (a sketch follows this entry).
Our method achieves competitive performance compared to equivariant baselines while being $10\times$ faster at inference and $2.5\times$ faster at training.
arXiv Detail & Related papers (2024-10-23T13:50:27Z)
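A plausible form of such an equivariance loss, assuming the group of 90-degree rotations acting on image-shaped tensors; the weight `lam` and the sampling of group elements are illustrative, not REMUL's exact recipe.

```python
import torch
import torch.nn as nn

# An unconstrained model: a plain conv layer with no built-in symmetry.
f = nn.Conv2d(3, 8, kernel_size=3, padding=1)

def equivariance_loss(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Penalise ||model(g.x) - g.model(x)||^2 for a randomly sampled group
    element g; here g is one of the three non-trivial 90-degree rotations."""
    k = int(torch.randint(1, 4, (1,)))
    fx = model(x)
    fgx = model(torch.rot90(x, k, dims=(2, 3)))
    return ((fgx - torch.rot90(fx, k, dims=(2, 3))) ** 2).mean()

x = torch.randn(4, 3, 16, 16)
task_loss = f(x).mean()                  # stand-in for the real task loss
lam = 0.1                                # multitask trade-off weight (assumed)
(task_loss + lam * equivariance_loss(f, x)).backward()
```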
- Amortised Invariance Learning for Contrastive Self-Supervision [11.042648980854485]
We introduce the notion of amortised invariance learning for contrastive self-supervision.
We show that our amortised features provide a reliable way to learn diverse downstream tasks with different invariance requirements.
This provides an exciting perspective that opens up new horizons in the field of general purpose representation learning.
arXiv Detail & Related papers (2023-02-24T16:15:11Z)
- Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics (a sketch follows this entry).
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z)
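A minimal sketch of equivariance via a learned canonicalization function, assuming a discrete group of four right-angle rotations; the per-sample argmax selection is shown without the differentiable relaxation the paper would need for end-to-end training, and all module shapes are illustrative.

```python
import torch
import torch.nn as nn

# A small network scores the 4 right-angle rotations; the main network only
# ever sees inputs mapped to their predicted canonical pose.
canon = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 4))
main = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))

def canonicalized_forward(x: torch.Tensor) -> torch.Tensor:
    ks = canon(x).argmax(dim=1)          # one rotation index per sample
    x_canon = torch.stack(
        [torch.rot90(xi, int(k), dims=(1, 2)) for xi, k in zip(x, ks)]
    )
    return main(x_canon)

out = canonicalized_forward(torch.randn(4, 3, 8, 8))
print(out.shape)  # torch.Size([4, 10])
```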
- In What Ways Are Deep Neural Networks Invariant and How Should We Measure This? [5.757836174655293]
We introduce a family of invariance and equivariance metrics that allows us to quantify these properties in a way that disentangles them from other quantities such as loss or accuracy (a toy instance is sketched after this entry).
We draw a range of conclusions about invariance and equivariance in deep learning models, ranging from whether initializing a model with pretrained weights has an effect on a trained model's invariance, to the extent to which invariance learned via training can generalize to out-of-distribution data.
arXiv Detail & Related papers (2022-10-07T18:43:21Z)
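One toy instance of such a metric, not necessarily the paper's exact definition: mean cosine similarity between features of an input and its transformed version, computed without gradients so it stays decoupled from the training loss.

```python
import torch
import torch.nn as nn

def invariance_score(model, x, transform):
    """Mean cosine similarity between features of x and transform(x);
    1.0 means the representation is fully invariant to the transform."""
    with torch.no_grad():
        z = model(x).flatten(1)
        zt = model(transform(x)).flatten(1)
    return nn.functional.cosine_similarity(z, zt, dim=1).mean().item()

# Example: how invariant is a random linear feature map to horizontal flips?
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 16))
x = torch.randn(32, 3, 8, 8)
print(invariance_score(model, x, lambda t: torch.flip(t, dims=(3,))))
```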
- Equivariance and Invariance Inductive Bias for Learning from Insufficient Data [65.42329520528223]
We show why insufficient data makes a model more easily biased toward the limited training environments, which usually differ from the test conditions.
We propose a class-wise invariant risk minimization (IRM) that efficiently tackles the challenge of missing environmental annotation in conventional IRM (the underlying IRM penalty is sketched after this entry).
arXiv Detail & Related papers (2022-07-25T15:26:19Z)
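For reference, the standard IRMv1 penalty that class-wise IRM builds on can be sketched as follows; the class-wise grouping itself is the paper's contribution and is not reproduced here.

```python
import torch
import torch.nn.functional as F

def irm_penalty(logits: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """IRMv1 penalty: squared gradient of the risk with respect to a dummy
    classifier scale w = 1; it vanishes when the classifier is simultaneously
    optimal across groups."""
    w = torch.ones(1, requires_grad=True)
    risk = F.cross_entropy(logits * w, y)
    (grad,) = torch.autograd.grad(risk, w, create_graph=True)
    return (grad ** 2).sum()

logits = torch.randn(16, 5, requires_grad=True)
y = torch.randint(0, 5, (16,))
loss = F.cross_entropy(logits, y) + 1.0 * irm_penalty(logits, y)
loss.backward()
```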
- Regularising for invariance to data augmentation improves supervised learning [82.85692486314949]
We show that using multiple augmentations per input can improve generalisation.
We propose an explicit regulariser that encourages this invariance at the level of individual model predictions (one plausible form is sketched after this entry).
arXiv Detail & Related papers (2022-03-07T11:25:45Z)
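One plausible form of such a prediction-level invariance regulariser, assuming the KL divergence from each augmented view's predictive distribution to their average; the paper's exact regulariser and augmentation pipeline may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def invariance_regulariser(model, x, augment, n_views: int = 2):
    """Mean KL divergence from each augmented view's predictive distribution
    to the average over views; zero iff all views predict identically."""
    probs = torch.stack(
        [F.softmax(model(augment(x)), dim=1) for _ in range(n_views)]
    )
    mean_p = probs.mean(dim=0, keepdim=True)
    return (probs * (probs.log() - mean_p.log())).sum(dim=2).mean()

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
x = torch.randn(4, 3, 8, 8)
augment = lambda t: t + 0.1 * torch.randn_like(t)   # stand-in augmentation
reg = invariance_regulariser(model, x, augment)      # add to the task loss
```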
- Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Existing methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z)
- Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks [79.13089902898848]
Self-supervised learning is a powerful paradigm for representation learning on unlabelled images.
We show that different tasks in computer vision require features to encode different (in)variances.
arXiv Detail & Related papers (2021-11-22T18:16:35Z)
- What causes the test error? Going beyond bias-variance via ANOVA [21.359033212191218]
Modern machine learning methods are often overparametrized, allowing adaptation to the data at a fine level.
Recent work aimed to understand in greater depth why overparametrization is helpful for generalization.
We propose using the analysis of variance (ANOVA) to decompose the variance of the test error in a symmetric way (the decomposition is written out after this entry).
arXiv Detail & Related papers (2020-10-11T05:21:13Z)
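For context, the classical functional ANOVA decomposition underlying this approach: with three symmetric sources of randomness, say data sampling $S$, initialization $I$, and label noise $N$ (the choice of sources here is illustrative), the variance of the test error splits into main effects and interactions:

```latex
\operatorname{Var}(\mathrm{Err})
  = \underbrace{V_S + V_I + V_N}_{\text{main effects}}
  + \underbrace{V_{SI} + V_{SN} + V_{IN}}_{\text{pairwise interactions}}
  + V_{SIN},
\qquad
V_S = \operatorname{Var}_S\!\big(\operatorname{E}[\mathrm{Err} \mid S]\big),\ \ldots
```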
- What Should Not Be Contrastive in Contrastive Learning [110.14159883496859]
We introduce a contrastive learning framework which does not require prior knowledge of specific, task-dependent invariances.
Our model learns to capture varying and invariant factors for visual representations by constructing separate embedding spaces.
We use a multi-head network with a shared backbone that captures information across each augmentation and, on its own, outperforms all baselines on downstream tasks (a sketch follows this entry).
arXiv Detail & Related papers (2020-08-13T03:02:32Z)
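A minimal sketch of the shared-backbone, multi-head architecture described above; the contrastive losses that make each embedding space sensitive to exactly one augmentation are omitted, and all shapes are illustrative.

```python
import torch
import torch.nn as nn

# Shared backbone with one projection head per augmentation type; each head
# defines an embedding space meant to stay sensitive to "its" augmentation
# while being invariant to the others.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 64), nn.ReLU())
heads = nn.ModuleList(nn.Linear(64, 32) for _ in range(3))  # e.g. colour, rotation, texture

def embed_all(x: torch.Tensor) -> list[torch.Tensor]:
    h = backbone(x)                      # shared representation
    return [head(h) for head in heads]   # one embedding per space

zs = embed_all(torch.randn(4, 3, 8, 8))
print([z.shape for z in zs])             # three (4, 32) embeddings
```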