Understanding self-supervised Learning Dynamics without Contrastive
Pairs
- URL: http://arxiv.org/abs/2102.06810v1
- Date: Fri, 12 Feb 2021 22:57:28 GMT
- Title: Understanding self-supervised Learning Dynamics without Contrastive
Pairs
- Authors: Yuandong Tian and Xinlei Chen and Surya Ganguli
- Abstract summary: Contrastive approaches to self-supervised learning (SSL) learn representations by minimizing the distance between two augmented views of the same data point.
Recent approaches like BYOL and SimSiam show remarkable performance without negative pairs.
We study the nonlinear learning dynamics of non-contrastive SSL in simple linear networks.
- Score: 72.1743263777693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contrastive approaches to self-supervised learning (SSL) learn
representations by minimizing the distance between two augmented views of the
same data point (positive pairs) and maximizing the same from different data
points (negative pairs). However, recent approaches like BYOL and SimSiam show
remarkable performance without negative pairs, raising a fundamental
theoretical question: how can SSL with only positive pairs avoid
representational collapse? We study the nonlinear learning dynamics of
non-contrastive SSL in simple linear networks. Our analysis yields conceptual
insights into how non-contrastive SSL methods learn, how they avoid
representational collapse, and how multiple factors, like predictor networks,
stop-gradients, exponential moving averages, and weight decay all come into
play. Our simple theory recapitulates the results of real-world ablation
studies in both STL-10 and ImageNet. Furthermore, motivated by our theory we
propose a novel approach that directly sets the predictor based on the
statistics of its inputs. In the case of linear predictors, our approach
outperforms gradient training of the predictor by 5% and on ImageNet it
performs comparably with more complex two-layer non-linear predictors that
employ BatchNorm. Code is released at
https://github.com/facebookresearch/luckmatters/tree/master/ssl.
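For intuition, here is a minimal sketch of the "directly set the predictor from the statistics of its inputs" idea for a linear predictor. It is not the released implementation (see the repository above); the class name, EMA rate `rho`, eigenvalue regularizer `eps`, and normalization are illustrative assumptions, not the paper's exact recipe.

```python
# Sketch only: set a linear predictor from an eigendecomposition of a running
# correlation matrix of its inputs, instead of training it by gradient descent.
import numpy as np

class DirectPredictorSketch:
    def __init__(self, dim, rho=0.3, eps=0.1):
        self.F = np.eye(dim)      # running correlation matrix of predictor inputs
        self.rho = rho            # EMA rate for the correlation estimate (assumed)
        self.eps = eps            # regularizer added to small eigenvalues (assumed)
        self.W_p = np.eye(dim)    # linear predictor weights

    def update(self, feats):
        """feats: (batch, dim) online-network outputs fed into the predictor."""
        batch_corr = feats.T @ feats / feats.shape[0]
        self.F = (1 - self.rho) * self.F + self.rho * batch_corr
        s, U = np.linalg.eigh(self.F)            # F = U diag(s) U^T
        s = np.clip(s, 0.0, None)
        p = np.sqrt(s / s.max()) + self.eps      # square-root spectrum, regularized
        self.W_p = U @ np.diag(p) @ U.T          # symmetric linear predictor
        return self.W_p

    def __call__(self, feats):
        return feats @ self.W_p.T
```

In a BYOL/SimSiam-style training loop, one would call `update()` on the online features at each step and use the resulting predictor in the usual positive-pair loss, rather than backpropagating through predictor weights.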
Related papers
- Understanding Representation Learnability of Nonlinear Self-Supervised
Learning [13.965135660149212]
Self-supervised learning (SSL) has empirically shown its data representation learnability in many downstream tasks.
Our paper is the first to accurately analyze the learning results of the nonlinear SSL model.
arXiv Detail & Related papers (2024-01-06T13:23:26Z)
- Non-contrastive representation learning for intervals from well logs [58.70164460091879]
The representation learning problem in the oil & gas industry aims to construct a model that provides a representation based on logging data for a well interval.
One possible approach is self-supervised learning (SSL).
We are the first to introduce non-contrastive SSL for well-logging data.
arXiv Detail & Related papers (2022-09-28T13:27:10Z)
- Siamese Prototypical Contrastive Learning [24.794022951873156]
Contrastive Self-supervised Learning (CSL) is a practical solution that learns meaningful visual representations from massive data in an unsupervised manner.
In this paper, we tackle this problem by introducing a simple but effective contrastive learning framework.
The key insight is to employ siamese-style metric loss to match intra-prototype features, while increasing the distance between inter-prototype features.
arXiv Detail & Related papers (2022-08-18T13:25:30Z)
- Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap [64.60460828425502]
We propose a new guarantee on the downstream performance of contrastive learning.
Our new theory hinges on the insight that the support of different intra-class samples will become more overlapped under aggressive data augmentations.
We propose an unsupervised model selection metric ARC that aligns well with downstream accuracy.
arXiv Detail & Related papers (2022-03-25T05:36:26Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
- Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
- Whitening for Self-Supervised Representation Learning [129.57407186848917]
We propose a new loss function for self-supervised representation learning (SSL) based on the whitening of latent-space features.
Our solution does not require asymmetric networks and is conceptually simple (a minimal sketch of the whitening idea follows this list).
arXiv Detail & Related papers (2020-07-13T12:33:25Z)
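As referenced in the whitening entry above, here is an illustrative sketch of whitening latent features before a pairwise MSE between two augmented views. It is a reconstruction under assumptions (ZCA whitening, unit-norm projection, plain MSE); the paper's exact whitening operator and loss details may differ.

```python
# Sketch only: whiten a batch of embeddings to identity covariance, then pull
# the two augmented views together with an MSE loss (no negative pairs).
import numpy as np

def whiten(z, eps=1e-5):
    """ZCA-whiten a (batch, dim) matrix: zero mean, (approximately) identity covariance."""
    z = z - z.mean(axis=0, keepdims=True)
    cov = z.T @ z / (z.shape[0] - 1)
    s, U = np.linalg.eigh(cov)
    W = U @ np.diag(1.0 / np.sqrt(s + eps)) @ U.T   # ZCA whitening matrix
    return z @ W

def whitened_mse_loss(z1, z2):
    """MSE between whitened, unit-normalized embeddings of two views of the same batch."""
    w1, w2 = whiten(z1), whiten(z2)
    w1 = w1 / np.linalg.norm(w1, axis=1, keepdims=True)
    w2 = w2 / np.linalg.norm(w2, axis=1, keepdims=True)
    return np.mean(np.sum((w1 - w2) ** 2, axis=1))
```

The whitening step plays the role that negative pairs or asymmetric predictor/stop-gradient tricks play elsewhere: it prevents all embeddings from collapsing to a single point while the MSE aligns positive pairs.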