CO2: Consistent Contrast for Unsupervised Visual Representation Learning
- URL: http://arxiv.org/abs/2010.02217v1
- Date: Mon, 5 Oct 2020 18:00:01 GMT
- Title: CO2: Consistent Contrast for Unsupervised Visual Representation Learning
- Authors: Chen Wei, Huiyu Wang, Wei Shen, Alan Yuille
- Abstract summary: We propose Consistent Contrast (CO2), which introduces a consistency regularization term into the current contrastive learning framework.
Regarding the similarity of the query crop to each crop from other images as "unlabeled", the consistency term takes the corresponding similarity of a positive crop as a pseudo label, and encourages consistency between these two similarities.
Empirically, CO2 improves Momentum Contrast (MoCo) by 2.9% top-1 accuracy on the ImageNet linear protocol, and by 3.8% and 1.1% top-5 accuracy in the 1% and 10% labeled semi-supervised settings, respectively.
- Score: 15.18275537384316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning has been adopted as a core method for unsupervised
visual representation learning. Without human annotation, the common practice
is to perform an instance discrimination task: Given a query image crop, this
task labels crops from the same image as positives, and crops from other
randomly sampled images as negatives. An important limitation of this label
assignment strategy is that it cannot reflect the heterogeneous similarity
between the query crop and each crop from other images, taking them as equally
negative, while some of them may even belong to the same semantic class as the
query. To address this issue, inspired by consistency regularization in
semi-supervised learning on unlabeled data, we propose Consistent Contrast
(CO2), which introduces a consistency regularization term into the current
contrastive learning framework. Regarding the similarity of the query crop to
each crop from other images as "unlabeled", the consistency term takes the
corresponding similarity of a positive crop as a pseudo label, and encourages
consistency between these two similarities. Empirically, CO2 improves Momentum
Contrast (MoCo) by 2.9% top-1 accuracy on the ImageNet linear protocol, and by
3.8% and 1.1% top-5 accuracy in the 1% and 10% labeled semi-supervised
settings, respectively. It also
transfers to image classification, object detection, and semantic segmentation
on PASCAL VOC. This shows that CO2 learns better visual representations for
these downstream tasks.
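The abstract's consistency term can be illustrated with a minimal NumPy sketch. Everything below is an illustrative assumption rather than the paper's implementation: the function names, the `alpha` weight, the temperature value, and the use of a symmetric KL divergence are placeholders, and the paper's exact divergence and gradient-stopping scheme may differ.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D array of logits
    e = np.exp(x - x.max())
    return e / e.sum()

def co2_loss(q, k_pos, negatives, tau=0.2, alpha=1.0):
    """Hypothetical sketch of a CO2-style objective.

    q, k_pos:   L2-normalized embeddings of the query crop and its
                positive crop from the same image, shape (d,)
    negatives:  L2-normalized embeddings of crops from other images,
                shape (K, d)
    tau, alpha: illustrative temperature and consistency weight
    """
    # similarity of the query / positive crop to each "negative" crop
    sim_q = negatives @ q / tau        # (K,)
    sim_pos = negatives @ k_pos / tau  # (K,)

    # standard InfoNCE term: positive logit against negative logits
    logits = np.concatenate([[q @ k_pos / tau], sim_q])
    infonce = -np.log(softmax(logits)[0])

    # consistency term: treat the query's similarities to other-image
    # crops as "unlabeled" and use the positive crop's similarity
    # distribution as the pseudo label (symmetric KL is an assumption)
    p, p_hat = softmax(sim_q), softmax(sim_pos)
    kl = lambda a, b: np.sum(a * (np.log(a) - np.log(b)))
    consistency = 0.5 * (kl(p, p_hat) + kl(p_hat, p))

    return infonce + alpha * consistency
```

Note that when the query and positive embeddings coincide, the two similarity distributions match and the consistency term vanishes, leaving only the InfoNCE loss, which is the intended behavior of a regularizer that only penalizes disagreement.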
Related papers
- Learning to Rank Patches for Unbiased Image Redundancy Reduction [80.93989115541966]
Images suffer from heavy spatial redundancy because pixels in neighboring regions are spatially correlated.
Existing approaches strive to overcome this limitation by reducing less meaningful image regions.
We propose a self-supervised framework for image redundancy reduction called Learning to Rank Patches.
arXiv Detail & Related papers (2024-03-31T13:12:41Z)
- Multi-View Correlation Consistency for Semi-Supervised Semantic Segmentation [59.34619548026885]
Semi-supervised semantic segmentation needs rich and robust supervision on unlabeled data.
We propose a view-coherent data augmentation strategy that guarantees pixel-pixel correspondence between different views.
In a series of semi-supervised settings on two datasets, we report competitive accuracy compared with the state-of-the-art methods.
arXiv Detail & Related papers (2022-08-17T17:59:11Z)
- Bootstrapping Semi-supervised Medical Image Segmentation with Anatomical-aware Contrastive Distillation [10.877450596327407]
We present ACTION, an Anatomical-aware ConTrastive dIstillatiON framework, for semi-supervised medical image segmentation.
We first develop an iterative contrastive distillation algorithm that softly labels the negatives rather than imposing binary supervision between positive and negative pairs.
We also capture more semantically similar features from the randomly chosen negative set compared to the positives to enforce the diversity of the sampled data.
arXiv Detail & Related papers (2022-06-06T01:30:03Z)
- Mix-up Self-Supervised Learning for Contrast-agnostic Applications [33.807005669824136]
We present the first mix-up self-supervised learning framework for contrast-agnostic applications.
We address the low variance across images using cross-domain mix-up, and build the pretext task on image reconstruction and transparency prediction.
arXiv Detail & Related papers (2022-04-02T16:58:36Z)
- A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning [111.05365744744437]
Unsupervised contrastive learning labels crops of the same image as positives, and other image crops as negatives.
In this work, we first prove that for contrastive learning, inaccurate label assignment heavily impairs its generalization for semantic instance discrimination.
Inspired by this theory, we propose a novel self-labeling refinement approach for contrastive learning.
arXiv Detail & Related papers (2021-06-28T14:24:52Z)
- Seed the Views: Hierarchical Semantic Alignment for Contrastive Representation Learning [116.91819311885166]
We propose a hierarchical semantic alignment strategy that expands the views generated by a single image to cross-samples and multi-level representations.
Our method, termed CsMl, has the ability to integrate multi-level visual representations across samples in a robust way.
arXiv Detail & Related papers (2020-12-04T17:26:24Z)
- Debiased Contrastive Learning [64.98602526764599]
We develop a debiased contrastive objective that corrects for the sampling of same-label datapoints.
Empirically, the proposed objective consistently outperforms the state-of-the-art for representation learning in vision, language, and reinforcement learning benchmarks.
arXiv Detail & Related papers (2020-07-01T04:25:24Z)
- Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning [108.999497144296]
Recently advanced unsupervised learning approaches use the siamese-like framework to compare two "views" from the same image for learning representations.
This work aims to bring the concept of distance on the label space into unsupervised learning, making the model aware of the soft degree of similarity between positive and negative pairs.
Despite its conceptual simplicity, we show empirically that with our solution, Unsupervised image mixtures (Un-Mix), we can learn subtler, more robust, and more generalized representations from the transformed input and the corresponding new label space.
arXiv Detail & Related papers (2020-03-11T17:59:04Z)
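Among the related papers above, the Debiased Contrastive Learning entry also touches the false-negative problem that motivates CO2, but by correcting the loss itself. A minimal NumPy sketch of that style of correction follows; the function name, the default `tau_plus` (assumed class prior) and temperature `t` are illustrative assumptions, not the authors' code.

```python
import numpy as np

def debiased_infonce(pos_sim, neg_sims, tau_plus=0.1, t=0.5):
    """Illustrative sketch of a debiased contrastive objective.

    pos_sim:  similarity of the query to its positive crop (scalar)
    neg_sims: similarities of the query to sampled "negatives", shape (N,)
    tau_plus: assumed probability that a sampled negative actually shares
              the query's class; t: temperature
    """
    N = len(neg_sims)
    pos = np.exp(pos_sim / t)
    neg = np.exp(neg_sims / t)
    # subtract the estimated contribution of false negatives, then clamp
    # the corrected term at its theoretical minimum e^{-1/t}
    g = max((neg.mean() - tau_plus * pos) / (1.0 - tau_plus),
            np.exp(-1.0 / t))
    return -np.log(pos / (pos + N * g))
```

With `tau_plus=0` the correction disappears and the expression reduces to a standard InfoNCE-style loss over the sampled negatives; a positive `tau_plus` shrinks the negative term, so the corrected loss is never larger than the uncorrected one.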
This list is automatically generated from the titles and abstracts of the papers in this site.