Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning
- URL: http://arxiv.org/abs/2003.05438v5
- Date: Thu, 17 Feb 2022 14:56:37 GMT
- Title: Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning
- Authors: Zhiqiang Shen and Zechun Liu and Zhuang Liu and Marios Savvides and Trevor Darrell and Eric Xing
- Abstract summary: Recent advanced unsupervised learning approaches use a siamese-like framework to compare two "views" of the same image for learning representations.
This work introduces the concept of distance in label space into unsupervised learning, letting the model be aware of the soft degree of similarity between positive and negative pairs.
Despite its conceptual simplicity, we show empirically that with this solution, Unsupervised image mixtures (Un-Mix), we can learn subtler, more robust and generalized representations from the transformed input and the corresponding new label space.
- Score: 108.999497144296
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advanced unsupervised learning approaches use a siamese-like
framework to compare two "views" of the same image for learning
representations. Making the two views distinct is core to guaranteeing that
unsupervised methods can learn meaningful information. However, such frameworks
can be fragile to overfitting if the augmentations used to generate the
two views are not strong enough, causing over-confidence on the
training data. This drawback hinders the model from learning subtle variance
and fine-grained information. To address this, in this work we introduce
the concept of distance in label space into unsupervised learning and let the
model be aware of the soft degree of similarity between positive and negative
pairs by mixing the input data space, so that the input and loss spaces work
collaboratively. Despite its conceptual simplicity, we show
empirically that with this solution, Unsupervised image mixtures (Un-Mix), we
can learn subtler, more robust and generalized representations from the
transformed input and the corresponding new label space. Extensive experiments are
conducted on CIFAR-10, CIFAR-100, STL-10, Tiny ImageNet and standard ImageNet
with the popular unsupervised methods SimCLR, BYOL, MoCo V1&V2, SwAV, etc. Our
proposed image mixture and label assignment strategy obtains consistent
improvements of 1~3% while following exactly the same hyperparameters and training
procedures as the base methods. Code is publicly available at
https://github.com/szq0214/Un-Mix.
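
To make the abstract's idea concrete, here is a minimal PyTorch-style sketch of one Un-Mix-like training step with a SimCLR-style loss. The function names, the Beta prior, and the single global-mixture variant are simplifying assumptions for illustration; the official repository above is the authoritative implementation (the paper also uses region-level, CutMix-style mixtures and plugs into the base method's own loss).

```python
import torch
import torch.nn.functional as F

def unmix_batch(x, alpha=1.0):
    """Mix each image with its reverse-ordered counterpart in the batch.

    Returns the mixed batch and the mixture coefficient lambda, which
    defines the soft similarity of the mixed image to both sources.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_flipped = torch.flip(x, dims=[0])          # pair image i with image B-1-i
    x_mixed = lam * x + (1.0 - lam) * x_flipped  # global mixture in input space
    return x_mixed, lam

def soft_contrastive_loss(z_mixed, z_view, lam, temperature=0.2):
    """Soft label assignment in loss space, weighted by the mixing coefficient.

    z_mixed: embeddings of the mixed images; z_view: embeddings of the clean
    second view. The mixed image is a lam-weighted positive of its own view
    and a (1 - lam)-weighted positive of the flipped counterpart's view.
    """
    z_mixed = F.normalize(z_mixed, dim=1)
    z_view = F.normalize(z_view, dim=1)
    logits = z_mixed @ z_view.t() / temperature
    targets = torch.arange(z_mixed.size(0), device=z_mixed.device)
    loss_self = F.cross_entropy(logits, targets)
    loss_flip = F.cross_entropy(logits, torch.flip(targets, dims=[0]))
    return lam * loss_self + (1.0 - lam) * loss_flip
```

In effect the distance introduced in input space (the mixture) is mirrored in the loss: instead of a hard one-hot positive, each mixed sample carries a soft degree of similarity to two sources.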
Related papers
- $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs [62.565573316667276]
We develop an objective that encodes how a sample relates to others.
We train vision models based on similarities in class or text caption descriptions.
Our objective appears to work particularly well in lower-data regimes, with gains over CLIP of 16.8% on ImageNet and 18.1% on ImageNet Real.
arXiv Detail & Related papers (2024-07-25T15:38:16Z)
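
A hedged sketch of the general idea in the entry above: replacing one-hot contrastive targets with a row-normalized sample-similarity graph, so the loss encodes how each sample relates to others. The exact objective in the paper may differ; `sample_similarity` and all names here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def graph_contrastive_loss(z_a, z_b, sample_similarity, temperature=0.1):
    """Contrastive loss with graph-based soft targets (illustrative sketch).

    sample_similarity: (B, B) matrix encoding how related each pair of
    samples is (e.g. derived from class or caption similarity); each row
    is normalized into a target distribution instead of a one-hot positive.
    """
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature
    targets = sample_similarity / sample_similarity.sum(dim=1, keepdim=True)
    # Cross-entropy between predicted and target similarity distributions.
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```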
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used to mitigate the data-hungry needs of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results in low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z)
- Inter-Instance Similarity Modeling for Contrastive Learning [22.56316444504397]
We propose a novel image mix method, PatchMix, for contrastive learning in Vision Transformers (ViT).
Compared to the existing sample mix methods, our PatchMix can flexibly and efficiently mix more than two images.
Our proposed method significantly outperforms the previous state-of-the-art on both ImageNet-1K and CIFAR datasets.
arXiv Detail & Related papers (2023-06-21T13:03:47Z)
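
One way to mix more than two images, as the PatchMix summary above describes, is to shuffle groups of patch tokens across the batch. This sketch is an assumption about the mechanism based only on the summary, not the paper's actual code; `num_sources` and the contiguous grouping are illustrative choices.

```python
import torch

def patch_mix(patch_tokens, num_sources=4):
    """Mix patch tokens from several images in the batch (illustrative).

    patch_tokens: (B, N, D) tensor of ViT patch embeddings. Each output
    sequence keeps contiguous groups of patches but draws each group from
    a different image, so one mixed sequence can contain content from more
    than two images, unlike pixel-level mixup.
    """
    B, N, D = patch_tokens.shape
    group = N // num_sources
    mixed = patch_tokens.clone()
    source_ids = torch.empty(B, num_sources, dtype=torch.long)
    for g in range(num_sources):
        perm = torch.randperm(B)                # pick a source image per group
        sl = slice(g * group, (g + 1) * group)
        mixed[:, sl] = patch_tokens[perm, sl]
        source_ids[:, g] = perm
    return mixed, source_ids                    # ids define the soft positives
```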
- Mix-up Self-Supervised Learning for Contrast-agnostic Applications [33.807005669824136]
We present the first mix-up self-supervised learning framework for contrast-agnostic applications.
We address the low variance across images with cross-domain mix-up and build the pretext task on image reconstruction and transparency prediction.
arXiv Detail & Related papers (2022-04-02T16:58:36Z)
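
The transparency-prediction pretext task in the entry above can be sketched as follows: the encoder sees a blended image and must recover the blending coefficient. This is a minimal sketch under assumptions drawn only from the summary; the reconstruction branch is omitted and all names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransparencyPretext(nn.Module):
    """Illustrative pretext head for mix-up SSL without contrastive pairs."""

    def __init__(self, encoder, feat_dim):
        super().__init__()
        self.encoder = encoder                    # any feature extractor
        self.alpha_head = nn.Linear(feat_dim, 1)  # transparency prediction

    def forward(self, x1, x2, lam):
        mixed = lam * x1 + (1.0 - lam) * x2       # cross-image (or cross-domain) mix-up
        feats = self.encoder(mixed)               # (B, feat_dim)
        alpha_pred = torch.sigmoid(self.alpha_head(feats)).squeeze(1)
        # Supervise with the known mixing coefficient.
        return F.mse_loss(alpha_pred, torch.full_like(alpha_pred, lam))
```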
- Crafting Better Contrastive Views for Siamese Representation Learning [20.552194081238248]
We propose ContrastiveCrop, which effectively generates better crops for Siamese representation learning.
A semantic-aware object localization strategy is proposed within the training process in a fully unsupervised manner.
As a plug-and-play and framework-agnostic module, ContrastiveCrop consistently improves SimCLR, MoCo, BYOL, SimSiam by 0.4%~2.0% classification accuracy.
arXiv Detail & Related papers (2022-02-07T15:09:00Z)
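
A possible reading of the semantic-aware localization in the ContrastiveCrop entry above: derive an object bounding box from the encoder's own activation map and sample crops relative to it. The thresholding and index scaling here are illustrative assumptions, not the paper's procedure.

```python
import torch

def semantic_aware_crop_box(heatmap, img_h, img_w, threshold=0.1):
    """Derive a salient-region box from a feature heatmap (illustrative).

    heatmap: (h, w) activation map from the encoder, mapped to image
    coordinates by simple index scaling. Crops for the two views would
    then be sampled within (or around) the returned region.
    """
    hm = (heatmap - heatmap.min()) / (heatmap.max() - heatmap.min() + 1e-8)
    ys, xs = torch.nonzero(hm > threshold, as_tuple=True)
    if ys.numel() == 0:                           # fall back to the full image
        return 0, 0, img_h, img_w
    scale_y, scale_x = img_h / hm.shape[0], img_w / hm.shape[1]
    y0, y1 = int(ys.min() * scale_y), int((ys.max() + 1) * scale_y)
    x0, x1 = int(xs.min() * scale_x), int((xs.max() + 1) * scale_x)
    return y0, x0, y1 - y0, x1 - x0               # y, x, height, width
```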
- Weakly Supervised Contrastive Learning [68.47096022526927]
We introduce a weakly supervised contrastive learning framework (WCL).
WCL achieves 65% and 72% ImageNet Top-1 accuracy using ResNet50, which is even higher than SimCLRv2 with ResNet101.
arXiv Detail & Related papers (2021-10-10T12:03:52Z)
- AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation [3.6790362352712873]
We propose AugNet, a new deep learning training paradigm to learn image features from a collection of unlabeled pictures.
Our experiments demonstrate that the method is able to represent the image in a low-dimensional space.
Unlike many deep-learning-based image retrieval algorithms, our approach does not require access to external annotated datasets.
arXiv Detail & Related papers (2021-06-11T09:02:30Z)
- Delving into Inter-Image Invariance for Unsupervised Visual Representations [108.33534231219464]
We present a study to better understand the role of inter-image invariance learning.
Online labels converge faster than offline labels.
Semi-hard negative samples are more reliable and unbiased than hard negative samples.
arXiv Detail & Related papers (2020-08-26T17:44:23Z)
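
The semi-hard-negative finding in the entry above can be illustrated with a simple mask over a similarity matrix: negatives whose similarity to the anchor is very high are often false negatives (same semantic class), so only mid-range similarities are kept. The thresholds below are illustrative assumptions, not values from the paper.

```python
import torch

def semi_hard_negative_mask(sim, pos_mask, low=0.3, high=0.7):
    """Keep only semi-hard negatives from a (B, B) cosine-similarity matrix.

    sim: pairwise similarities between anchors and candidates;
    pos_mask: boolean mask marking the positive pairs to exclude.
    Returns a boolean mask of negatives to include in the loss.
    """
    neg_mask = ~pos_mask
    return neg_mask & (sim > low) & (sim < high)
```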
- Unsupervised Learning of Visual Features by Contrasting Cluster Assignments [57.33699905852397]
We propose an online algorithm, SwAV, that takes advantage of contrastive methods without requiring pairwise comparisons to be computed.
Our method simultaneously clusters the data while enforcing consistency between cluster assignments.
Our method can be trained with large and small batches and can scale to unlimited amounts of data.
arXiv Detail & Related papers (2020-06-17T14:00:42Z)
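
To show why SwAV needs no pairwise feature comparisons, here is a simplified, single-process sketch of its published swapped-prediction objective: each view predicts the Sinkhorn-normalized cluster assignment of the other view. The real method additionally uses multi-crop, a feature queue, and a distributed Sinkhorn; shapes and names here are simplifications.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sinkhorn(scores, n_iters=3, eps=0.05):
    """Turn (B, K) prototype scores into approximately equipartitioned
    soft cluster assignments, as in SwAV's online clustering."""
    q = torch.exp(scores / eps).t()               # (K, B)
    q /= q.sum()
    K, B = q.shape
    for _ in range(n_iters):
        q /= q.sum(dim=1, keepdim=True); q /= K   # normalize prototype rows
        q /= q.sum(dim=0, keepdim=True); q /= B   # normalize sample columns
    return (q * B).t()                            # (B, K), rows sum to 1

def swav_loss(z1, z2, prototypes, temperature=0.1):
    """Swapped prediction: each view predicts the cluster assignment of
    the other view, so samples are compared to prototypes, not to each other."""
    p = F.normalize(prototypes, dim=1)            # (K, D) cluster prototypes
    s1 = F.normalize(z1, dim=1) @ p.t()           # (B, K) scores for view 1
    s2 = F.normalize(z2, dim=1) @ p.t()
    q1, q2 = sinkhorn(s1), sinkhorn(s2)
    loss1 = -(q2 * F.log_softmax(s1 / temperature, dim=1)).sum(dim=1).mean()
    loss2 = -(q1 * F.log_softmax(s2 / temperature, dim=1)).sum(dim=1).mean()
    return 0.5 * (loss1 + loss2)
```

Because the loss only involves sample-to-prototype scores, batch size and dataset size decouple from the number of comparisons, which is what lets SwAV scale to large and small batches alike.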