Delving into Inter-Image Invariance for Unsupervised Visual Representations
- URL: http://arxiv.org/abs/2008.11702v3
- Date: Thu, 15 Sep 2022 17:28:35 GMT
- Title: Delving into Inter-Image Invariance for Unsupervised Visual Representations
- Authors: Jiahao Xie, Xiaohang Zhan, Ziwei Liu, Yew Soon Ong, Chen Change Loy
- Abstract summary: We present a study to better understand the role of inter-image invariance learning.
Online labels converge faster than offline labels.
Semi-hard negative samples are more reliable and unbiased than hard negative samples.
- Score: 108.33534231219464
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning has recently shown immense potential in unsupervised
visual representation learning. Existing studies in this track mainly focus on
intra-image invariance learning. The learning typically uses rich intra-image
transformations to construct positive pairs and then maximizes agreement using
a contrastive loss. The merits of inter-image invariance, by contrast, remain
much less explored. One major obstacle to exploiting inter-image invariance is
that it is unclear how to reliably construct inter-image positive pairs, and
further how to derive effective supervision from them, since no pair
annotations are available. In this work, we present a comprehensive empirical
study to better
understand the role of inter-image invariance learning from three main
constituting components: pseudo-label maintenance, sampling strategy, and
decision boundary design. To facilitate the study, we introduce a unified and
generic framework that supports the integration of unsupervised intra- and
inter-image invariance learning. Through carefully designed comparisons and
analysis, several valuable observations emerge: 1) online labels
converge faster and perform better than offline labels; 2) semi-hard negative
samples are more reliable and unbiased than hard negative samples; 3) a less
stringent decision boundary is more favorable for inter-image invariance
learning. With all the obtained recipes, our final model, namely InterCLR,
shows consistent improvements over state-of-the-art intra-image invariance
learning methods on multiple standard benchmarks. We hope this work will
provide useful guidance for devising effective unsupervised inter-image
invariance learning. Code: https://github.com/open-mmlab/mmselfsup.
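For readers who want a concrete picture of recipe 2), the following is a minimal PyTorch sketch of semi-hard negative selection for an InfoNCE-style loss. It is not the authors' InterCLR implementation (see the mmselfsup repository linked above for that); the function name `semi_hard_infonce`, the band fractions, the temperature `tau`, and `k` are illustrative assumptions.

```python
# Minimal sketch of semi-hard negative selection for an InfoNCE-style
# loss. NOT the official InterCLR code; band/tau/k are illustrative.
import torch
import torch.nn.functional as F

def semi_hard_infonce(query, positive, bank, tau=0.2, band=(0.3, 0.7), k=256):
    """query, positive: (B, D) L2-normalized embeddings of a pair.
    bank: (N, D) L2-normalized memory bank of negative candidates.

    Instead of taking the hardest (most similar) negatives, keep a
    middle similarity band, following the paper's observation that
    semi-hard negatives are more reliable and less biased than hard ones.
    """
    pos_sim = (query * positive).sum(dim=1, keepdim=True)    # (B, 1)
    neg_sim = query @ bank.t()                               # (B, N)

    # Rank candidates by similarity (hardest first), then keep the
    # middle "semi-hard" band and cap it at k negatives per sample.
    sorted_sim, _ = neg_sim.sort(dim=1, descending=True)
    n = sorted_sim.size(1)
    lo, hi = int(band[0] * n), int(band[1] * n)
    semi_hard = sorted_sim[:, lo:hi][:, :k]

    # Standard InfoNCE: the positive sits at index 0 of the logits.
    logits = torch.cat([pos_sim, semi_hard], dim=1) / tau
    labels = torch.zeros(query.size(0), dtype=torch.long, device=query.device)
    return F.cross_entropy(logits, labels)
```

In use, `query` and `positive` would be L2-normalized encoder features of two views (or, for the inter-image case, a sample and a pseudo-label cluster-mate drawn from the memory bank), and `bank` the stored candidate features.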
Related papers
- Regularized Contrastive Partial Multi-view Outlier Detection [76.77036536484114]
We propose a novel method named Regularized Contrastive Partial Multi-view Outlier Detection (RCPMOD).
In this framework, we utilize contrastive learning to learn view-consistent information and distinguish outliers by their degree of consistency.
Experimental results on four benchmark datasets demonstrate that our proposed approach outperforms state-of-the-art competitors.
arXiv Detail & Related papers (2024-08-02T14:34:27Z)
- Cross-Modal Contrastive Learning for Robust Reasoning in VQA [76.1596796687494]
Multi-modal reasoning in visual question answering (VQA) has witnessed rapid progress recently.
However, most reasoning models rely heavily on shortcuts learned from training data.
We propose a simple but effective cross-modal contrastive learning strategy to eliminate such shortcut reasoning.
arXiv Detail & Related papers (2022-11-21T05:32:24Z)
- Non-contrastive representation learning for intervals from well logs [58.70164460091879]
The representation learning problem in the oil & gas industry aims to construct a model that provides a representation of a well interval based on its logging data.
One possible approach is self-supervised learning (SSL).
We are the first to introduce non-contrastive SSL for well-logging data.
arXiv Detail & Related papers (2022-09-28T13:27:10Z)
- Exploring Negatives in Contrastive Learning for Unpaired Image-to-Image Translation [12.754320302262533]
We introduce a negative Pruning technique for Unpaired image-to-image Translation (PUT) that sparsifies and ranks the patches.
The proposed algorithm is efficient and flexible, and enables the model to stably learn essential information between corresponding patches.
arXiv Detail & Related papers (2022-04-23T08:31:18Z)
- Mix-up Self-Supervised Learning for Contrast-agnostic Applications [33.807005669824136]
We present the first mix-up self-supervised learning framework for contrast-agnostic applications.
We address the low variance across images with cross-domain mix-up and build the pretext task on image reconstruction and transparency prediction.
arXiv Detail & Related papers (2022-04-02T16:58:36Z)
- ISD: Self-Supervised Learning by Iterative Similarity Distillation [39.60300771234578]
We introduce a self-supervised learning algorithm that uses a soft similarity for the negative images rather than a binary distinction between positive and negative pairs.
Our method achieves better results than state-of-the-art models such as BYOL and MoCo in transfer learning settings.
arXiv Detail & Related papers (2020-12-16T20:50:17Z)
- Hard Negative Mixing for Contrastive Learning [29.91220669060252]
We argue that an important aspect of contrastive learning, namely the effect of hard negatives, has so far been neglected.
We propose hard negative mixing strategies at the feature level that can be computed on the fly with minimal computational overhead.
arXiv Detail & Related papers (2020-10-02T14:34:58Z)
- Contrastive Learning for Unpaired Image-to-Image Translation [64.47477071705866]
In image-to-image translation, each patch in the output should reflect the content of the corresponding patch in the input, independent of domain.
We propose a framework based on contrastive learning to maximize mutual information between the two.
We demonstrate that our framework enables one-sided translation in the unpaired image-to-image translation setting, while improving quality and reducing training time.
arXiv Detail & Related papers (2020-07-30T17:59:58Z)
- Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning [108.999497144296]
Recently advanced unsupervised learning approaches use a Siamese-like framework to compare two "views" of the same image for learning representations.
This work aims to bring a notion of distance on the label space into unsupervised learning, letting the model be aware of the soft degree of similarity between positive or negative pairs (see the sketch after this list).
Despite its conceptual simplicity, we show empirically that with our solution, Unsupervised image mixtures (Un-Mix), we can learn subtler, more robust and generalized representations from the transformed input and the corresponding new label space.
arXiv Detail & Related papers (2020-03-11T17:59:04Z)
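The Un-Mix entry above describes mixing images so that pairs carry a soft degree of similarity. As a rough, hedged illustration only (not the authors' code; the function name `unmix_batch` and the Beta parameter are assumptions), mixing a batch with its reversed order might look like:

```python
# Rough illustration of the image-mixture idea from the Un-Mix entry;
# not the authors' implementation. `alpha` is an illustrative choice.
import torch

def unmix_batch(views: torch.Tensor, alpha: float = 1.0):
    """Mix each image in the batch with its reversed-order partner.

    The returned coefficient softens the contrastive target: a mixed
    image is treated as lam-similar to its own pair and (1 - lam)-similar
    to the partner's pair, putting a notion of distance on the label space.
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    mixed = lam * views + (1.0 - lam) * views.flip(0)
    return mixed, lam
```

The soft coefficient would then weight the positive terms of whatever contrastive loss the backbone otherwise uses.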