MSR: Making Self-supervised learning Robust to Aggressive Augmentations
- URL: http://arxiv.org/abs/2206.01999v1
- Date: Sat, 4 Jun 2022 14:27:29 GMT
- Title: MSR: Making Self-supervised learning Robust to Aggressive Augmentations
- Authors: Yingbin Bai, Erkun Yang, Zhaoqing Wang, Yuxuan Du, Bo Han, Cheng Deng,
Dadong Wang, Tongliang Liu
- Abstract summary: We propose a new SSL paradigm, which counteracts the impact of semantic shift by balancing the role of weak and aggressively augmented pairs.
We show that our model achieves 73.1% top-1 accuracy on ImageNet-1K with ResNet-50 for 200 epochs, which is a 2.5% improvement over BYOL.
- Score: 98.6457801252358
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most recent self-supervised learning methods learn visual representation by
contrasting different augmented views of images. Compared with supervised
learning, more aggressive augmentations have been introduced to further improve
the diversity of training pairs. However, aggressive augmentations may distort
images' structures, leading to a severe semantic shift problem: augmented
views of the same image may not share the same semantics, which degrades
transfer performance. To address this problem, we propose a new SSL paradigm,
which counteracts the impact of semantic shift by balancing the role of weak
and aggressively augmented pairs. Specifically, semantically inconsistent pairs
are in the minority, and we treat them as noisy pairs. Note that deep neural
networks (DNNs) exhibit a crucial memorization effect: they tend to first
memorize clean (majority) examples before overfitting to noisy (minority)
examples. Therefore, we set a relatively large weight for aggressively
augmented data pairs at the early learning stage. As training progresses,
the model begins to overfit noisy pairs. Accordingly, we gradually reduce the
weights of aggressively augmented pairs. In doing so, our method can better
embrace the aggressive augmentations and neutralize the semantic shift problem.
Experiments show that our model achieves 73.1% top-1 accuracy on ImageNet-1K
with ResNet-50 for 200 epochs, which is a 2.5% improvement over BYOL. Moreover,
experiments also demonstrate that the learned representations can transfer well
for various downstream tasks.
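The weighting idea described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the abstract does not give the schedule or the loss form, so the cosine decay, the convex combination of the two losses, and the names `aggressive_pair_weight`, `combined_loss`, `w_start`, and `w_end` are all assumptions made here.

```python
import math


def aggressive_pair_weight(step, total_steps, w_start=1.0, w_end=0.0):
    """Weight for aggressively augmented pairs at a given training step.

    Assumption: a cosine decay from w_start to w_end. The abstract only says
    the weight starts relatively large and is gradually reduced.
    """
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return w_end + 0.5 * (w_start - w_end) * (1.0 + math.cos(math.pi * progress))


def combined_loss(weak_loss, aggressive_loss, step, total_steps):
    """Blend the two pair losses (assumed convex combination).

    Early in training the aggressive-pair loss dominates; later the
    weak-pair loss takes over as the aggressive weight decays.
    """
    w = aggressive_pair_weight(step, total_steps)
    return (1.0 - w) * weak_loss + w * aggressive_loss


if __name__ == "__main__":
    total = 100
    for step in (0, 25, 50, 75, 100):
        w = aggressive_pair_weight(step, total)
        print(f"step {step:3d}: aggressive weight = {w:.3f}")
```

In a real training loop, `weak_loss` and `aggressive_loss` would be the SSL objective (for example a BYOL-style loss) evaluated on weakly and aggressively augmented view pairs, respectively.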
Related papers
- Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning [12.5354658533836]
Humans possess a remarkable ability to accurately classify new, unseen images after being exposed to only a few examples.
For artificial neural network models, determining the most relevant features for distinguishing between two images with limited samples presents a challenge.
We propose an intra-task mutual attention method for few-shot learning that involves splitting the support and query samples into patches.
arXiv Detail & Related papers (2024-05-06T02:02:57Z)
- Feature Dropout: Revisiting the Role of Augmentations in Contrastive Learning [7.6834562879925885]
Recent work suggests that good augmentations are label-preserving with respect to a specific downstream task.
We show that label-destroying augmentations can be useful in the foundation model setting.
arXiv Detail & Related papers (2022-12-16T10:08:38Z)
- Soft Augmentation for Image Classification [68.71067594724663]
We propose generalizing augmentation with invariant transforms to soft augmentation.
We show that soft targets allow for more aggressive data augmentation.
We also show that soft augmentations generalize to self-supervised classification tasks.
arXiv Detail & Related papers (2022-11-09T01:04:06Z)
- Improving Transferability of Representations via Augmentation-Aware Self-Supervision [117.15012005163322]
AugSelf is an auxiliary self-supervised loss that learns the difference of augmentation parameters between two randomly augmented samples.
Our intuition is that AugSelf encourages learned representations to preserve augmentation-aware information, which could be beneficial for their transferability.
AugSelf can easily be incorporated into recent state-of-the-art representation learning methods with a negligible additional training cost.
arXiv Detail & Related papers (2021-11-18T10:43:50Z)
- Augmentation Pathways Network for Visual Recognition [61.33084317147437]
This paper introduces Augmentation Pathways (AP) to stabilize training on a much wider range of augmentation policies.
AP tames heavy data augmentations and stably boosts performance without a careful selection among augmentation policies.
Experimental results on ImageNet benchmarks demonstrate the compatibility and effectiveness on a much wider range of augmentations.
arXiv Detail & Related papers (2021-07-26T06:54:53Z)
- Contrastive Learning with Stronger Augmentations [63.42057690741711]
We propose a general framework called Contrastive Learning with Stronger Augmentations (CLSA) to complement current contrastive learning approaches.
Here, the distribution divergence between the weakly and strongly augmented images over the representation bank is adopted to supervise the retrieval of strongly augmented queries.
Experiments showed the information from the strongly augmented images can significantly boost the performance.
arXiv Detail & Related papers (2021-04-15T18:40:04Z)
- Leveraging background augmentations to encourage semantic focus in self-supervised contrastive learning [16.93045612956149]
"Background augmentations" encourage models to focus on semantically-relevant content by discouraging them from focusing on image backgrounds.
Background augmentations lead to substantial improvements (+1-2% on ImageNet-1k) in performance across a spectrum of state-of-the-art self-supervised methods.
arXiv Detail & Related papers (2021-03-23T17:39:16Z)
- Understanding self-supervised Learning Dynamics without Contrastive Pairs [72.1743263777693]
Contrastive approaches to self-supervised learning (SSL) learn representations by minimizing the distance between two augmented views of the same data point.
Non-contrastive methods such as BYOL and SimSiam show remarkable performance without negative pairs.
We study the nonlinear learning dynamics of non-contrastive SSL in simple linear networks.
arXiv Detail & Related papers (2021-02-12T22:57:28Z)
- Hard Negative Mixing for Contrastive Learning [29.91220669060252]
We argue that an important aspect of contrastive learning, i.e., the effect of hard negatives, has so far been neglected.
We propose hard negative mixing strategies at the feature level that can be computed on-the-fly with minimal computational overhead.
arXiv Detail & Related papers (2020-10-02T14:34:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.