Contrasting the landscape of contrastive and non-contrastive learning
- URL: http://arxiv.org/abs/2203.15702v1
- Date: Tue, 29 Mar 2022 16:08:31 GMT
- Title: Contrasting the landscape of contrastive and non-contrastive learning
- Authors: Ashwini Pokle, Jinjin Tian, Yuchen Li, Andrej Risteski
- Abstract summary: We show that even on simple data models, non-contrastive losses have a preponderance of non-collapsed bad minima.
We show that the training process does not avoid these minima.
- Score: 25.76544128487728
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A lot of recent advances in unsupervised feature learning are based on
designing features which are invariant under semantic data augmentations. A
common way to do this is contrastive learning, which uses positive and negative
samples. Some recent works however have shown promising results for
non-contrastive learning, which does not require negative samples. However, the
non-contrastive losses have obvious "collapsed" minima, in which the encoders
output a constant feature embedding, independent of the input. A folk
conjecture is that so long as these collapsed solutions are avoided, the
produced feature representations should be good. In our paper, we cast doubt on
this story: we show through theoretical results and controlled experiments that
even on simple data models, non-contrastive losses have a preponderance of
non-collapsed bad minima. Moreover, we show that the training process does not
avoid these minima.
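To make the collapse phenomenon concrete, here is a minimal numeric sketch (not the paper's data model or loss): a linear encoder trained with a simple non-contrastive objective that pulls embeddings of two augmented views together, with no negative samples. A constant-output (collapsed) encoder attains the global minimum of this objective while ignoring the input entirely.

```python
import numpy as np

# Toy illustration (hypothetical setup, not the paper's): a linear "encoder"
# W with a non-contrastive objective -- minimize the distance between the
# embeddings of two augmented views of the same input, with no negatives.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))               # synthetic data
aug1 = X + 0.01 * rng.normal(size=X.shape)  # two weak augmentations
aug2 = X + 0.01 * rng.normal(size=X.shape)

def noncontrastive_loss(W):
    """Mean squared distance between embeddings of the two views."""
    z1, z2 = aug1 @ W, aug2 @ W
    return np.mean(np.sum((z1 - z2) ** 2, axis=1))

W_informative = rng.normal(size=(5, 3))  # a random, non-collapsed encoder
W_collapsed = np.zeros((5, 3))           # maps every input to the same point

# The collapsed encoder achieves the global minimum (loss exactly 0.0)
# despite producing a constant, input-independent embedding -- the trivial
# minimum the abstract refers to.
print(noncontrastive_loss(W_collapsed))   # 0.0
print(noncontrastive_loss(W_informative)) # > 0
```

The paper's point is that avoiding this obvious zero-loss solution is not enough: there are also non-collapsed minima with bad representations.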
Related papers
- Understanding Collapse in Non-Contrastive Learning [122.2499276246997]
We show that SimSiam representations undergo partial dimensional collapse if the model is too small relative to the dataset size.
We propose a metric to measure the degree of this collapse and show that it can be used to forecast the downstream task performance without any fine-tuning or labels.
arXiv Detail & Related papers (2022-09-29T17:59:55Z) - Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding [14.295787044482136]
We present a momentum contrastive learning model with negative sample queue for sentence embedding, namely MoCoSE.
We define a maximum traceable distance metric, through which we learn to what extent the text contrastive learning benefits from the historical information of negative samples.
Our experiments find that the best results are obtained when the maximum traceable distance is at a certain range, demonstrating that there is an optimal range of historical information for a negative sample queue.
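The negative-sample queue central to this line of work can be sketched as a fixed-size FIFO buffer of past embeddings; this is a hedged illustration in the spirit of momentum contrastive methods such as MoCo/MoCoSE (class name and sizes are illustrative, not from the paper):

```python
import collections
import numpy as np

class NegativeQueue:
    """Fixed-size FIFO queue of embeddings used as negative samples."""

    def __init__(self, maxlen):
        # deque with maxlen silently evicts the oldest entries on overflow
        self.queue = collections.deque(maxlen=maxlen)

    def enqueue(self, embeddings):
        """Push a batch of embeddings (one row each) into the queue."""
        for e in embeddings:
            self.queue.append(e)

    def negatives(self):
        """Return all currently stored negatives as a single array."""
        return np.stack(list(self.queue)) if self.queue else np.empty((0,))

q = NegativeQueue(maxlen=4)
q.enqueue(np.ones((3, 2)))   # batch of 3 embeddings
q.enqueue(np.zeros((2, 2)))  # batch of 2; the oldest entry is evicted
print(q.negatives().shape)   # (4, 2)
```

The "maximum traceable distance" studied above then corresponds, roughly, to how far back in this queue an embedding can originate and still act as a useful negative.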
arXiv Detail & Related papers (2022-02-26T08:29:25Z) - Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which trains a set of models that agree on the training data while disagreeing on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z) - Dense Out-of-Distribution Detection by Robust Learning on Synthetic Negative Data [1.7474352892977458]
We show how to detect out-of-distribution anomalies in road-driving scenes and remote sensing imagery.
We leverage a jointly trained normalizing flow, owing to its coverage-oriented learning objective and its capability to generate samples at different resolutions.
The resulting models set the new state of the art on benchmarks for out-of-distribution detection in road-driving scenes and remote sensing imagery.
arXiv Detail & Related papers (2021-12-23T20:35:10Z) - Investigating the Role of Negatives in Contrastive Representation Learning [59.30700308648194]
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one of these parameters: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even being insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z) - Incremental False Negative Detection for Contrastive Learning [95.68120675114878]
We introduce a novel incremental false negative detection for self-supervised contrastive learning.
During contrastive learning, we discuss two strategies to explicitly remove the detected false negatives.
Our proposed method outperforms other self-supervised contrastive learning frameworks on multiple benchmarks within a limited compute budget.
arXiv Detail & Related papers (2021-06-07T15:29:14Z) - A Sober Look at the Unsupervised Learning of Disentangled Representations and their Evaluation [63.042651834453544]
We show that the unsupervised learning of disentangled representations is impossible without inductive biases on both the models and the data.
We observe that while the different methods successfully enforce properties "encouraged" by the corresponding losses, well-disentangled models seemingly cannot be identified without supervision.
Our results suggest that future work on disentanglement learning should be explicit about the role of inductive biases and (implicit) supervision.
arXiv Detail & Related papers (2020-10-27T10:17:15Z) - Contrastive Learning with Hard Negative Samples [80.12117639845678]
We develop a new family of unsupervised sampling methods for selecting hard negative samples.
A limiting case of this sampling results in a representation that tightly clusters each class, and pushes different classes as far apart as possible.
The proposed method improves downstream performance across multiple modalities, requires only a few additional lines of code to implement, and introduces no computational overhead.
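A simple way to see what "hard" negatives are: rank candidate negatives by their similarity to the anchor and keep the most similar ones, since these are the hardest to distinguish from positives. The sketch below is a hedged illustration of this idea (not the paper's exact sampler, whose selection is probabilistic):

```python
import numpy as np

def hardest_negatives(anchor, candidates, k):
    """Return the k candidates most cosine-similar to the anchor."""
    a = anchor / np.linalg.norm(anchor)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = c @ a                      # cosine similarity to the anchor
    idx = np.argsort(sims)[::-1][:k]  # indices of the top-k most similar
    return candidates[idx]

anchor = np.array([1.0, 0.0])
cands = np.array([[0.9, 0.1],   # nearly parallel to the anchor: hard
                  [0.0, 1.0],   # orthogonal: easy
                  [-1.0, 0.0]]) # opposite: easiest
print(hardest_negatives(anchor, cands, 1))  # [[0.9 0.1]]
```

Pushing apart embeddings of such hard pairs is what yields the tightly clustered, well-separated classes described in the limiting case above.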
arXiv Detail & Related papers (2020-10-09T14:18:53Z) - Collective Loss Function for Positive and Unlabeled Learning [19.058269616452545]
We propose a collective loss function (cPU) to learn from only positive and unlabeled data.
Results show that cPU consistently outperforms the current state-of-the-art PU learning methods.
arXiv Detail & Related papers (2020-05-06T03:30:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.