Rethinking the Augmentation Module in Contrastive Learning: Learning
Hierarchical Augmentation Invariance with Expanded Views
- URL: http://arxiv.org/abs/2206.00227v1
- Date: Wed, 1 Jun 2022 04:30:46 GMT
- Title: Rethinking the Augmentation Module in Contrastive Learning: Learning
Hierarchical Augmentation Invariance with Expanded Views
- Authors: Junbo Zhang, Kaisheng Ma
- Abstract summary: A data augmentation module is utilized in contrastive learning to transform the given data example into two views.
This paper proposes a general method to alleviate two problems caused by the predetermined composition of augmentations by considering where and what to contrast in a general contrastive learning framework.
- Score: 22.47152165975219
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A data augmentation module is utilized in contrastive learning to transform
the given data example into two views, which is considered essential and
irreplaceable. However, the predetermined composition of multiple data
augmentations brings two drawbacks. First, the artificial choice of
augmentation types brings specific representational invariances to the model,
which have different degrees of positive and negative effects on different
downstream tasks. Treating each type of augmentation equally during training
makes the model learn non-optimal representations for various downstream tasks
and limits the flexibility to choose augmentation types beforehand. Second, the
strong data augmentations used in classic contrastive learning methods may
bring too much invariance in some cases, and fine-grained information that is
essential to some downstream tasks may be lost. This paper proposes a general
method to alleviate these two problems by considering where and what to
contrast in a general contrastive learning framework. We first propose to learn
different augmentation invariances at different depths of the model according
to the importance of each data augmentation instead of learning
representational invariances evenly in the backbone. We then propose to expand
the contrast content with augmentation embeddings to reduce the misleading
effects of strong data augmentations. Experiments based on several baseline
methods demonstrate that we learn better representations for various benchmarks
on classification, detection, and segmentation downstream tasks.
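To make the two proposals concrete, here is a minimal PyTorch sketch of how they could be instantiated. It is an illustration under stated assumptions, not the authors' released implementation: the ResNet-18 stage split, the head sizes, the 4-dimensional augmentation-parameter vector, the SimCLR-style loss, and the per-depth view pairs are all assumptions introduced for the example.
```python
# Minimal sketch (not the authors' code): hierarchical augmentation invariance
# plus augmentation-embedding-expanded views. The stage split, head sizes,
# augmentation-parameter dimensionality, and loss are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


def nt_xent(z1, z2, temperature=0.5):
    """Standard normalized-temperature cross-entropy (SimCLR-style) loss."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)          # (2N, d)
    sim = z @ z.t() / temperature                                # pairwise similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))                   # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)


class HierarchicalContrastiveNet(nn.Module):
    """ResNet-18 split into stages, with a projection head after each stage.

    Shallower heads are asked to be invariant only to the augmentations judged
    most important; the deepest head also receives an embedding of the
    augmentation parameters, so fine-grained information altered by strong
    augmentations is not forced out of the representation.
    """

    def __init__(self, feat_dim=128, aug_param_dim=4):
        super().__init__()
        resnet = torchvision.models.resnet18(weights=None)
        self.stage1 = nn.Sequential(resnet.conv1, resnet.bn1, resnet.relu,
                                    resnet.maxpool, resnet.layer1, resnet.layer2)
        self.stage2 = resnet.layer3
        self.stage3 = nn.Sequential(resnet.layer4, nn.AdaptiveAvgPool2d(1),
                                    nn.Flatten())
        self.pool = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.head1 = nn.Linear(128, feat_dim)   # after stage1 (128 channels)
        self.head2 = nn.Linear(256, feat_dim)   # after stage2 (256 channels)
        # Embedding of the augmentation parameters (e.g. crop box, flip flag,
        # jitter strength), concatenated to the deepest feature: the "expanded
        # view" contrasted by the last head.
        self.aug_embed = nn.Sequential(nn.Linear(aug_param_dim, 64), nn.ReLU(),
                                       nn.Linear(64, 64))
        self.head3 = nn.Linear(512 + 64, feat_dim)

    def forward(self, x, aug_params):
        f1 = self.stage1(x)
        f2 = self.stage2(f1)
        f3 = self.stage3(f2)
        return (self.head1(self.pool(f1)),
                self.head2(self.pool(f2)),
                self.head3(torch.cat([f3, self.aug_embed(aug_params)], dim=1)))


def training_step(model, view_pairs):
    """view_pairs[k] = (xa, xb, pa, pb): two views of the batch (plus their
    augmentation parameters) produced with only the k+1 most important
    augmentation types, so each depth learns a different augmentation
    invariance instead of all invariances being imposed evenly."""
    loss = 0.0
    for k, (xa, xb, pa, pb) in enumerate(view_pairs):
        loss = loss + nt_xent(model(xa, pa)[k], model(xb, pb)[k])
    return loss
```
Which augmentations count as "important", how many depths receive heads, and exactly how the augmentation embedding is combined with the feature are choices made in the paper itself; the sketch only mirrors the two abstract-level ideas.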
Related papers
- Exploring Data Augmentations on Self-/Semi-/Fully- Supervised
Pre-trained Models [24.376036129920948]
We investigate how data augmentation affects the performance of vision pre-trained models.
We apply four types of data augmentation, namely Random Erasing, CutOut, CutMix, and MixUp.
We report their performance on vision tasks such as image classification, object detection, instance segmentation, and semantic segmentation.
arXiv Detail & Related papers (2023-10-28T23:46:31Z) - Time Series Contrastive Learning with Information-Aware Augmentations [57.45139904366001]
A key component of contrastive learning is selecting appropriate augmentations that impose suitable priors to construct feasible positive samples.
How to find the desired augmentations of time series data that are meaningful for given contrastive learning tasks and datasets remains an open question.
We propose a new contrastive learning approach with information-aware augmentations, InfoTS, that adaptively selects optimal augmentations for time series representation learning.
arXiv Detail & Related papers (2023-03-21T15:02:50Z) - Amortised Invariance Learning for Contrastive Self-Supervision [11.042648980854485]
We introduce the notion of amortised invariance learning for contrastive self-supervision.
We show that our amortised features provide a reliable way to learn diverse downstream tasks with different invariance requirements.
This provides an exciting perspective that opens up new horizons in the field of general purpose representation learning.
arXiv Detail & Related papers (2023-02-24T16:15:11Z) - Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks, and on a real-world weed recognition task, and observe an improvement in accuracy in tested domains.
arXiv Detail & Related papers (2023-02-07T20:42:28Z) - Feature Dropout: Revisiting the Role of Augmentations in Contrastive
Learning [7.6834562879925885]
Recent work suggests that good augmentations are label-preserving with respect to a specific downstream task.
We show that label-destroying augmentations can be useful in the foundation model setting.
arXiv Detail & Related papers (2022-12-16T10:08:38Z) - EquiMod: An Equivariance Module to Improve Self-Supervised Learning [77.34726150561087]
Self-supervised visual representation methods are closing the gap with supervised learning performance.
These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations.
We introduce EquiMod, a generic equivariance module that structures the learned latent space.
arXiv Detail & Related papers (2022-11-02T16:25:54Z) - Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z) - Why Do Self-Supervised Models Transfer? Investigating the Impact of
Invariance on Downstream Tasks [79.13089902898848]
Self-supervised learning is a powerful paradigm for representation learning on unlabelled images.
We show that different tasks in computer vision require features to encode different (in)variances.
arXiv Detail & Related papers (2021-11-22T18:16:35Z) - What Should Not Be Contrastive in Contrastive Learning [110.14159883496859]
We introduce a contrastive learning framework which does not require prior knowledge of specific, task-dependent invariances.
Our model learns to capture varying and invariant factors for visual representations by constructing separate embedding spaces.
We use a multi-head network with a shared backbone which captures information across each augmentation and, on its own, outperforms all baselines on downstream tasks.
arXiv Detail & Related papers (2020-08-13T03:02:32Z)
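The last related paper above describes a multi-head network with a shared backbone that keeps a separate embedding space per augmentation. As a rough illustration only (the choice of ResNet-18, the head names, and the dimensions are assumptions, not details from that paper), such a design might look like:
```python
# Illustrative sketch: a shared backbone with one projection head per
# augmentation type, giving a separate embedding space per factor of variation.
# All names and sizes are assumptions, not the cited paper's code.
import torch.nn as nn
import torchvision


class MultiHeadContrastiveNet(nn.Module):
    def __init__(self, aug_names=("crop", "color", "rotation"), feat_dim=128):
        super().__init__()
        backbone = torchvision.models.resnet18(weights=None)
        backbone.fc = nn.Identity()        # shared 512-d backbone features
        self.backbone = backbone
        # One head per augmentation type -> one embedding space per factor.
        self.heads = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(512, 512), nn.ReLU(),
                                nn.Linear(512, feat_dim))
            for name in aug_names
        })

    def forward(self, x):
        h = self.backbone(x)               # shared representation
        return {name: head(h) for name, head in self.heads.items()}
```
How positive pairs are constructed for each head (for example, from views that share or differ in exactly one augmentation) is what lets each space keep or discard the corresponding factor; that pairing scheme is specific to the cited paper and is not reproduced here.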
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.