Related papers: Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks

Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks

URL: http://arxiv.org/abs/2111.11398v1
Date: Mon, 22 Nov 2021 18:16:35 GMT
Title: Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks
Authors: Linus Ericsson and Henry Gouk and Timothy M. Hospedales
Abstract summary: Self-supervised learning is a powerful paradigm for representation learning on unlabelled images. We show that different tasks in computer vision require features to encode different (in)variances.
Score: 79.13089902898848
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-supervised learning is a powerful paradigm for representation learning on unlabelled images. A wealth of effective new methods based on instance matching rely on data augmentation to drive learning, and these have reached a rough agreement on an augmentation scheme that optimises popular recognition benchmarks. However, there is strong reason to suspect that different tasks in computer vision require features to encode different (in)variances, and therefore likely require different augmentation strategies. In this paper, we measure the invariances learned by contrastive methods and confirm that they do learn invariance to the augmentations used and further show that this invariance largely transfers to related real-world changes in pose and lighting. We show that learned invariances strongly affect downstream task performance and confirm that different downstream tasks benefit from polar opposite (in)variances, leading to performance loss when the standard augmentation strategy is used. Finally, we demonstrate that a simple fusion of representations with complementary invariances ensures wide transferability to all the diverse downstream tasks considered.

Related papers

Understanding the Role of Invariance in Transfer Learning [9.220104991339104]
Transfer learning is a powerful technique for knowledge-sharing between different tasks. Recent work has found that the representations of models with certain invariances, such as to adversarial input perturbations, achieve higher performance on downstream tasks.
arXiv Detail & Related papers (2024-07-05T07:53:52Z)
Amortised Invariance Learning for Contrastive Self-Supervision [11.042648980854485]
We introduce the notion of amortised invariance learning for contrastive self supervision. We show that our amortised features provide a reliable way to learn diverse downstream tasks with different invariance requirements. This provides an exciting perspective that opens up new horizons in the field of general purpose representation learning.
arXiv Detail & Related papers (2023-02-24T16:15:11Z)
EquiMod: An Equivariance Module to Improve Self-Supervised Learning [77.34726150561087]
Self-supervised visual representation methods are closing the gap with supervised learning performance. These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations. We introduce EquiMod a generic equivariance module that structures the learned latent space.
arXiv Detail & Related papers (2022-11-02T16:25:54Z)
R\'enyiCL: Contrastive Representation Learning with Skew R\'enyi Divergence [78.15455360335925]
We present a new robust contrastive learning scheme, coined R'enyiCL, which can effectively manage harder augmentations. Our method is built upon the variational lower bound of R'enyi divergence. We show that R'enyi contrastive learning objectives perform innate hard negative sampling and easy positive sampling simultaneously.
arXiv Detail & Related papers (2022-08-12T13:37:05Z)
Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views [22.47152165975219]
A data augmentation module is utilized in contrastive learning to transform the given data example into two views. This paper proposes a general method to alleviate these two problems by considering where and what to contrast in a general contrastive learning framework.
arXiv Detail & Related papers (2022-06-01T04:30:46Z)
Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems. We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z)
Improving Transferability of Representations via Augmentation-Aware Self-Supervision [117.15012005163322]
AugSelf is an auxiliary self-supervised loss that learns the difference of augmentation parameters between two randomly augmented samples. Our intuition is that AugSelf encourages to preserve augmentation-aware information in learned representations, which could be beneficial for their transferability. AugSelf can easily be incorporated into recent state-of-the-art representation learning methods with a negligible additional training cost.
arXiv Detail & Related papers (2021-11-18T10:43:50Z)
Improving Transformation Invariance in Contrastive Representation Learning [31.223892428863238]
We introduce a training objective for contrastive learning that uses a novel regularizer to control how the representation changes under transformation. Second, we propose a change to how test time representations are generated by introducing a feature averaging approach that combines encodings from multiple transformations of the original input. Third, we introduce the novel Spirograph dataset to explore our ideas in the context of a differentiable generative process with multiple downstream tasks.
arXiv Detail & Related papers (2020-10-19T13:49:29Z)
What Should Not Be Contrastive in Contrastive Learning [110.14159883496859]
We introduce a contrastive learning framework which does not require prior knowledge of specific, task-dependent invariances. Our model learns to capture varying and invariant factors for visual representations by constructing separate embedding spaces. We use a multi-head network with a shared backbone which captures information across each augmentation and alone outperforms all baselines on downstream tasks.
arXiv Detail & Related papers (2020-08-13T03:02:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.