Solving Inefficiency of Self-supervised Representation Learning
- URL: http://arxiv.org/abs/2104.08760v1
- Date: Sun, 18 Apr 2021 07:47:10 GMT
- Title: Solving Inefficiency of Self-supervised Representation Learning
- Authors: Guangrun Wang, Keze Wang, Guangcong Wang, Phillip H.S. Torr, Liang Lin
- Abstract summary: Existing contrastive learning methods suffer from very low learning efficiency.
Under-clustering and over-clustering problems are major obstacles to learning efficiency.
We propose a novel self-supervised learning framework using a median triplet loss.
- Score: 87.30876679780532
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning has attracted great interest due to its tremendous
potential for learning discriminative representations in an unsupervised
manner. Along this direction, contrastive learning achieves current
state-of-the-art performance. Despite the acknowledged successes, existing
contrastive learning methods suffer from very low learning efficiency, e.g.,
taking about ten times more training epochs than supervised learning for
comparable recognition accuracy. In this paper, we discover two contradictory
phenomena in contrastive learning that we call under-clustering and
over-clustering problems, which are major obstacles to learning efficiency.
Under-clustering means that the model cannot efficiently learn to discover the
dissimilarity between inter-class samples when the negative sample pairs for
contrastive learning are insufficient to differentiate all the actual object
categories. Over-clustering implies that the model cannot efficiently learn the
feature representation from excessive negative sample pairs, which include many
outliers and thus force the model to over-cluster samples of the same actual
categories into different clusters. To simultaneously overcome these two
problems, we propose a novel self-supervised learning framework using a median
triplet loss. Specifically, we employ a triplet loss that tends to maximize the
relative distance between the positive pair and the negative pairs to address the
under-clustering problem; and we construct the negative pair by selecting the
negative sample with a median similarity score from all negative samples to avoid
the over-clustering problem, which is guaranteed by a Bernoulli distribution model. We
extensively evaluate our proposed framework on several large-scale benchmarks
(e.g., ImageNet, SYSU-30k, and COCO). The results demonstrate the superior
performance of our model over the latest state-of-the-art methods by a clear
margin.
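The abstract only sketches the method, but the core idea of selecting the median-similarity negative for a triplet objective can be illustrated with a short, hypothetical PyTorch snippet. This is a minimal sketch under our own assumptions: the function name, the cosine-similarity formulation, the `margin` value, and the per-sample batching are illustrative and not taken from the paper.

```python
import torch
import torch.nn.functional as F

def median_triplet_loss(anchor, positive, negatives, margin=0.5):
    """Hypothetical sketch of a median-based triplet loss.

    anchor:    (D,)   L2-normalized embedding of the query view.
    positive:  (D,)   L2-normalized embedding of another view of the same image.
    negatives: (N, D) L2-normalized embeddings of other images.
    """
    # Cosine similarities between the anchor and every candidate negative.
    neg_sims = negatives @ anchor                     # shape (N,)

    # Select the negative with the *median* similarity: easy negatives
    # (low similarity) contribute little and lead to under-clustering,
    # while the "hardest" negatives (high similarity) are often outliers
    # or false negatives and lead to over-clustering.
    median_idx = torch.argsort(neg_sims)[neg_sims.numel() // 2]
    neg_sim = neg_sims[median_idx]

    pos_sim = torch.dot(anchor, positive)

    # Standard triplet objective: the positive similarity should exceed
    # the selected negative similarity by at least `margin`.
    return F.relu(neg_sim - pos_sim + margin)
```

In practice the loss would presumably be averaged over a mini-batch, with negatives drawn from the rest of the batch or a memory bank; those details are not specified in the abstract.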
Related papers
- Semantic Positive Pairs for Enhancing Visual Representation Learning of Instance Discrimination methods [4.680881326162484]
Self-supervised learning (SSL) algorithms based on instance discrimination have shown promising results.
We propose an approach to identify those images with similar semantic content and treat them as positive instances.
We run experiments on three benchmark datasets: ImageNet, STL-10 and CIFAR-10 with different instance discrimination SSL approaches.
arXiv Detail & Related papers (2023-06-28T11:47:08Z)
- Cluster-aware Contrastive Learning for Unsupervised Out-of-distribution Detection [0.0]
Unsupervised out-of-distribution (OOD) detection aims to separate samples falling outside the distribution of the training data without label information.
We propose a Cluster-aware Contrastive Learning (CCL) framework for unsupervised OOD detection, which considers both instance-level and semantic-level information.
arXiv Detail & Related papers (2023-02-06T07:21:03Z)
- Rethinking Clustering-Based Pseudo-Labeling for Unsupervised Meta-Learning [146.11600461034746]
CACTUs, a method for unsupervised meta-learning, is a clustering-based approach with pseudo-labeling.
This approach is model-agnostic and can be combined with supervised algorithms to learn from unlabeled data.
We prove that the core reason for this is the lack of a clustering-friendly property in the embedding space.
arXiv Detail & Related papers (2022-09-27T19:04:36Z)
- Joint Debiased Representation and Image Clustering Learning with Self-Supervision [3.1806743741013657]
We develop a novel joint clustering and contrastive learning framework.
We adapt the debiased contrastive loss to avoid under-clustering minority classes of imbalanced datasets.
arXiv Detail & Related papers (2022-09-14T21:23:41Z)
- Neighborhood Contrastive Learning for Novel Class Discovery [79.14767688903028]
We build a new framework, named Neighborhood Contrastive Learning, to learn discriminative representations that are important to clustering performance.
We experimentally demonstrate that these two ingredients significantly contribute to clustering performance and lead our model to outperform state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2021-06-20T17:34:55Z)
- Incremental False Negative Detection for Contrastive Learning [95.68120675114878]
We introduce a novel incremental false negative detection for self-supervised contrastive learning.
During contrastive learning, we discuss two strategies to explicitly remove the detected false negatives.
Our proposed method outperforms other self-supervised contrastive learning frameworks on multiple benchmarks with limited compute.
arXiv Detail & Related papers (2021-06-07T15:29:14Z)
- Scalable Personalised Item Ranking through Parametric Density Estimation [53.44830012414444]
Learning from implicit feedback is challenging because of the difficult nature of the one-class problem.
Most conventional methods use a pairwise ranking approach and negative samplers to cope with the one-class problem.
We propose a learning-to-rank approach, which achieves convergence speed comparable to the pointwise counterpart.
arXiv Detail & Related papers (2021-05-11T03:38:16Z)
- Doubly Contrastive Deep Clustering [135.7001508427597]
We present a novel Doubly Contrastive Deep Clustering (DCDC) framework, which constructs contrastive loss over both sample and class views.
Specifically, for the sample view, we set the class distribution of the original sample and its augmented version as positive sample pairs.
For the class view, we build the positive and negative pairs from the sample distribution of the class.
In this way, the two contrastive losses jointly constrain the clustering results of mini-batch samples at both the sample and class levels (a rough sketch of this idea follows the list).
arXiv Detail & Related papers (2021-03-09T15:15:32Z)
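The DCDC summary above describes contrastive losses over a sample view and a class view of the cluster-assignment matrix. The following is a rough, hypothetical sketch of that idea, not the authors' implementation: the probability matrices `p_a` and `p_b`, the InfoNCE formulation, and the temperature `tau` are our own assumptions.

```python
import torch
import torch.nn.functional as F

def doubly_contrastive_loss(p_a, p_b, tau=0.5):
    """Hypothetical sketch of a sample-view + class-view contrastive loss.

    p_a, p_b: (B, K) softmax cluster-assignment probabilities for two
              augmented views of the same mini-batch (B samples, K clusters).
    """
    def info_nce(x, y):
        # Rows of x and y with the same index form positive pairs; all
        # other rows in the batch act as negatives.
        x = F.normalize(x, dim=1)
        y = F.normalize(y, dim=1)
        logits = x @ y.t() / tau                    # (n, n) similarity matrix
        targets = torch.arange(x.size(0), device=x.device)
        return F.cross_entropy(logits, targets)

    # Sample view: each row is one sample's class distribution; the two
    # augmented views of the same image should agree.
    sample_loss = info_nce(p_a, p_b)

    # Class view: each column describes how one cluster is distributed
    # over the batch; the same cluster across views forms a positive pair.
    class_loss = info_nce(p_a.t(), p_b.t())

    return sample_loss + class_loss
```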
This list is automatically generated from the titles and abstracts of the papers in this site.