EquiMod: An Equivariance Module to Improve Self-Supervised Learning
- URL: http://arxiv.org/abs/2211.01244v2
- Date: Thu, 8 Jun 2023 14:31:31 GMT
- Title: EquiMod: An Equivariance Module to Improve Self-Supervised Learning
- Authors: Alexandre Devillers and Mathieu Lefort
- Abstract summary: Self-supervised visual representation methods are closing the gap with supervised learning performance.
These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations.
We introduce EquiMod, a generic equivariance module that structures the learned latent space.
- Score: 77.34726150561087
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised visual representation methods are closing the gap with
supervised learning performance. These methods rely on maximizing the
similarity between embeddings of related synthetic inputs created through data
augmentations. This can be seen as a task that encourages embeddings to leave
out factors modified by these augmentations, i.e. to be invariant to them.
However, this considers only one side of the trade-off in the choice of the
augmentations: they need to strongly modify the images to avoid shortcut
learning of simple solutions (e.g. relying only on color histograms), but on the
other hand, augmentation-related information may then be missing from the
representations for some downstream tasks (e.g. color is important for bird and
flower classification). A few recent works have proposed to mitigate the problem
of using only an invariance task by exploring some form of equivariance to
augmentations. This has been done by learning additional embedding space(s),
where some augmentation(s) cause embeddings to differ, yet in an uncontrolled
way. In this work, we introduce EquiMod, a generic equivariance module that
structures the learned latent space, in the sense that our module learns to
predict the displacement in the embedding space caused by the augmentations. We
show that applying this module to state-of-the-art invariance models, such as
SimCLR and BYOL, improves performance on the CIFAR10 and ImageNet datasets.
Moreover, while our model could collapse to a trivial equivariance, i.e.
invariance, we observe that it instead automatically learns to keep some
augmentation-related information that is beneficial to the representations.
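The abstract describes EquiMod as a module that learns to predict the displacement in the embedding space caused by the augmentations. The sketch below illustrates one plausible reading of that idea in PyTorch: a small predictor conditioned on the augmentation parameters maps the embedding of the source view to the embedding of the augmented view, and a similarity loss on that prediction is added to the usual invariance objective. The class and function names, dimensions, and the choice of a BYOL-style negative cosine-similarity loss are assumptions made for illustration, not the authors' reference implementation.

```python
# Illustrative sketch of an EquiMod-style equivariance head (assumed reading of
# the abstract, not the authors' released code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EquivariancePredictor(nn.Module):
    """Predicts the embedding of the augmented view from the embedding of the
    source view and a vector describing the applied augmentation parameters."""
    def __init__(self, embed_dim=128, aug_dim=8, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + aug_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, z, aug_params):
        # z: (B, embed_dim) embedding of the source view
        # aug_params: (B, aug_dim), e.g. crop box, flip flag, jitter strengths
        return self.net(torch.cat([z, aug_params], dim=1))

def equivariance_loss(z_pred, z_target):
    """Negative cosine similarity between the predicted and the actual embedding
    of the augmented view (stop-gradient on the target, as in BYOL)."""
    z_pred = F.normalize(z_pred, dim=1)
    z_target = F.normalize(z_target.detach(), dim=1)
    return -(z_pred * z_target).sum(dim=1).mean()

if __name__ == "__main__":
    # Toy usage: pretend the embeddings come from a SimCLR/BYOL-style encoder.
    B, D, A = 32, 128, 8
    z_src, z_aug = torch.randn(B, D), torch.randn(B, D)
    aug_params = torch.rand(B, A)
    head = EquivariancePredictor(embed_dim=D, aug_dim=A)
    loss_equi = equivariance_loss(head(z_src, aug_params), z_aug)
    # Total objective would be: invariance loss (SimCLR/BYOL) + lambda * loss_equi
    print(float(loss_equi))
```

In this reading, the equivariance head never forces the backbone to discard augmentation information; it only asks that whatever change an augmentation induces in the embedding be predictable from the augmentation parameters, which is consistent with the abstract's claim that the model keeps augmentation-related information rather than collapsing to invariance.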
Related papers
- Exploring Data Augmentations on Self-/Semi-/Fully- Supervised Pre-trained Models [24.376036129920948]
We investigate how data augmentation affects the performance of vision pre-trained models.
We apply four types of data augmentation: Random Erasing, CutOut, CutMix, and MixUp.
We report their performance on vision tasks such as image classification, object detection, instance segmentation, and semantic segmentation.
arXiv Detail & Related papers (2023-10-28T23:46:31Z)
- Rethinking the Augmentation Module in Contrastive Learning: Learning Hierarchical Augmentation Invariance with Expanded Views [22.47152165975219]
A data augmentation module is used in contrastive learning to transform the given data example into two views.
This paper proposes a general method to alleviate these two problems by considering where and what to contrast in a general contrastive learning framework.
arXiv Detail & Related papers (2022-06-01T04:30:46Z)
- Learning Instance-Specific Augmentations by Capturing Local Invariances [62.70897571389785]
InstaAug is a method for automatically learning input-specific augmentations from data.
We empirically demonstrate that InstaAug learns meaningful input-dependent augmentations for a wide range of transformation classes.
arXiv Detail & Related papers (2022-05-31T18:38:06Z)
- Regularising for invariance to data augmentation improves supervised learning [82.85692486314949]
We show that using multiple augmentations per input can improve generalisation.
We propose an explicit regulariser that encourages this invariance on the level of individual model predictions.
arXiv Detail & Related papers (2022-03-07T11:25:45Z)
- Why Do Self-Supervised Models Transfer? Investigating the Impact of Invariance on Downstream Tasks [79.13089902898848]
Self-supervised learning is a powerful paradigm for representation learning on unlabelled images.
We show that different tasks in computer vision require features to encode different (in)variances.
arXiv Detail & Related papers (2021-11-22T18:16:35Z)
- Improving Transferability of Representations via Augmentation-Aware Self-Supervision [117.15012005163322]
AugSelf is an auxiliary self-supervised loss that learns the difference of augmentation parameters between two randomly augmented samples (a minimal sketch of this idea appears after this list).
Our intuition is that AugSelf encourages preserving augmentation-aware information in learned representations, which could be beneficial to their transferability.
AugSelf can easily be incorporated into recent state-of-the-art representation learning methods with a negligible additional training cost.
arXiv Detail & Related papers (2021-11-18T10:43:50Z)
- What Should Not Be Contrastive in Contrastive Learning [110.14159883496859]
We introduce a contrastive learning framework which does not require prior knowledge of specific, task-dependent invariances.
Our model learns to capture varying and invariant factors for visual representations by constructing separate embedding spaces.
We use a multi-head network with a shared backbone which captures information across each augmentation and alone outperforms all baselines on downstream tasks.
arXiv Detail & Related papers (2020-08-13T03:02:32Z)
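As a point of comparison with EquiMod's prediction of embedding displacements, the AugSelf entry above describes an auxiliary loss that regresses the difference of augmentation parameters between two augmented views. Below is a minimal sketch of that idea; the class name AugParamPredictor, the MSE objective, and the dimensions are assumptions for illustration rather than the authors' released code.

```python
# Illustrative AugSelf-style auxiliary objective (assumed interpretation of the
# summary above): an MLP takes the embeddings of two augmented views and
# regresses the difference of their augmentation parameters, which pushes the
# encoder to keep augmentation-aware information in its features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AugParamPredictor(nn.Module):
    def __init__(self, embed_dim=128, aug_dim=8, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * embed_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, aug_dim),
        )

    def forward(self, z1, z2):
        return self.net(torch.cat([z1, z2], dim=1))

def augself_loss(z1, z2, params1, params2, predictor):
    # Regress the per-sample difference of augmentation parameters
    # (e.g. crop offsets, color-jitter strengths) from the two embeddings.
    return F.mse_loss(predictor(z1, z2), params2 - params1)
```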
This list is automatically generated from the titles and abstracts of the papers in this site.