Provable Guarantees for Self-Supervised Deep Learning with Spectral
Contrastive Loss
- URL: http://arxiv.org/abs/2106.04156v1
- Date: Tue, 8 Jun 2021 07:41:02 GMT
- Title: Provable Guarantees for Self-Supervised Deep Learning with Spectral
Contrastive Loss
- Authors: Jeff Z. HaoChen, Colin Wei, Adrien Gaidon, Tengyu Ma
- Abstract summary: Recent works in self-supervised learning have advanced the state-of-the-art by relying on the contrastive learning paradigm.
Our work analyzes contrastive learning without assuming conditional independence of positive pairs.
We propose a loss that performs spectral decomposition on the population augmentation graph and can be succinctly written as a contrastive learning objective.
- Score: 72.62029620566925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works in self-supervised learning have advanced the state-of-the-art
by relying on the contrastive learning paradigm, which learns representations
by pushing positive pairs, or similar examples from the same class, closer
together while keeping negative pairs far apart. Despite the empirical
successes, theoretical foundations are limited -- prior analyses assume
conditional independence of the positive pairs given the same class label, but
recent empirical applications use heavily correlated positive pairs (i.e., data
augmentations of the same image). Our work analyzes contrastive learning
without assuming conditional independence of positive pairs using a novel
concept of the augmentation graph on data. Edges in this graph connect
augmentations of the same data, and ground-truth classes naturally form
connected sub-graphs. We propose a loss that performs spectral decomposition on
the population augmentation graph and can be succinctly written as a
contrastive learning objective on neural net representations. Minimizing this
objective leads to features with provable accuracy guarantees under linear
probe evaluation. By standard generalization bounds, these accuracy guarantees
also hold when minimizing the training contrastive loss. Empirically, the
features learned by our objective can match or outperform several strong
baselines on benchmark vision datasets. In all, this work provides the first
provable analysis for contrastive learning where guarantees for linear probe
evaluation can apply to realistic empirical settings.
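The loss analyzed in this paper, the spectral contrastive loss, can be written for an encoder f as L(f) = -2 E_{(x, x+)}[f(x)^T f(x+)] + E_{x, x'}[(f(x)^T f(x'))^2], where (x, x+) is a positive pair of augmentations and x, x' are independently drawn augmentations. Below is a minimal PyTorch-style sketch of a minibatch estimate of this objective; the function name, the two-view batching, and the choice to exclude positive pairs from the second term are illustrative assumptions, not the authors' reference implementation.

import torch

def spectral_contrastive_loss(z1, z2):
    # z1, z2: [batch, dim] embeddings of two augmentations of the same images
    # (hypothetical minibatch estimate of
    #  L(f) = -2 E[f(x)^T f(x+)] + E[(f(x)^T f(x'))^2]).
    n = z1.size(0)
    # Positive-pair term: inner products between the two views of each image.
    pos = (z1 * z2).sum(dim=1).mean()
    # Negative-pair term: squared inner products between views of different
    # images, approximating the expectation over independent augmentations.
    sim = z1 @ z2.t()  # [batch, batch]
    off_diag = sim.pow(2).sum() - sim.diagonal().pow(2).sum()
    neg = off_diag / (n * (n - 1))
    return -2.0 * pos + neg

As the abstract states, minimizing this objective over the population augmentation graph amounts to a spectral decomposition of that graph, which is what yields the linear-probe guarantees.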
Related papers
- Rethinking Positive Pairs in Contrastive Learning [19.149235307036324]
We present Hydra, a universal contrastive learning framework for visual representations that extends conventional contrastive learning to accommodate arbitrary pairs.
Our approach is validated using IN1K, where 1K diverse classes compose 500,500 pairs, most of them being distinct.
Our work highlights the value of learning common features of arbitrary pairs and potentially broadens the applicability of contrastive learning techniques on the sample pairs with weak relationships.
arXiv Detail & Related papers (2024-10-23T18:07:18Z)
- Bootstrap Latents of Nodes and Neighbors for Graph Self-Supervised Learning [27.278097015083343]
Contrastive learning requires negative samples to prevent model collapse and learn discriminative representations.
We introduce a cross-attention module to predict the supportiveness score of a neighbor with respect to the anchor node.
Our method mitigates class collision from negative and noisy positive samples, concurrently enhancing intra-class compactness.
arXiv Detail & Related papers (2024-08-09T14:17:52Z)
- Pixel is All You Need: Adversarial Trajectory-Ensemble Active Learning for Salient Object Detection [40.97103355628434]
It is unclear whether a saliency model trained with weakly-supervised data can achieve performance equivalent to its fully-supervised version.
We propose a novel yet effective adversarial trajectory-ensemble active learning (ATAL) approach.
Experimental results show that our ATAL can find such a point-labeled dataset, where a saliency model trained on it achieves 97%-99% of the performance of its fully-supervised version with only ten annotated points per image.
arXiv Detail & Related papers (2022-12-13T11:18:08Z)
- Understanding Contrastive Learning Requires Incorporating Inductive Biases [64.56006519908213]
Recent attempts to theoretically explain the success of contrastive learning on downstream tasks prove guarantees depending on properties of augmentations and the value of contrastive loss of representations.
We demonstrate that such analyses ignore inductive biases of the function class and training algorithm, even provably leading to vacuous guarantees in some settings.
arXiv Detail & Related papers (2022-02-28T18:59:20Z)
- Masked prediction tasks: a parameter identifiability view [49.533046139235466]
We focus on the widely used self-supervised learning method of predicting masked tokens.
We show that there is a rich landscape of possibilities, out of which some prediction tasks yield identifiability, while others do not.
arXiv Detail & Related papers (2022-02-18T17:09:32Z)
- A Low Rank Promoting Prior for Unsupervised Contrastive Learning [108.91406719395417]
We construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning.
Our hypothesis explicitly requires that all the samples belonging to the same instance class lie on the same subspace with small dimension.
Empirical evidence shows that the proposed algorithm clearly surpasses the state-of-the-art approaches on multiple benchmarks.
arXiv Detail & Related papers (2021-08-05T15:58:25Z)
- From Canonical Correlation Analysis to Self-supervised Graph Neural Networks [99.44881722969046]
We introduce a conceptually simple yet effective model for self-supervised representation learning with graph data.
We optimize an innovative feature-level objective inspired by classical Canonical Correlation Analysis.
Our method performs competitively on seven public graph datasets.
arXiv Detail & Related papers (2021-06-23T15:55:47Z)
- Incremental False Negative Detection for Contrastive Learning [95.68120675114878]
We introduce a novel incremental false negative detection for self-supervised contrastive learning.
During contrastive learning, we discuss two strategies to explicitly remove the detected false negatives.
Our proposed method outperforms other self-supervised contrastive learning frameworks on multiple benchmarks within a limited compute budget.
arXiv Detail & Related papers (2021-06-07T15:29:14Z)
- SimCSE: Simple Contrastive Learning of Sentence Embeddings [10.33373737281907]
This paper presents SimCSE, a simple contrastive learning framework for sentence embeddings.
We first describe an unsupervised approach, which takes an input sentence and predicts itself in a contrastive objective.
We then incorporate annotated pairs from NLI datasets into contrastive learning by using "entailment" pairs as positives and "contradiction" pairs as hard negatives (see the sketch after this list).
arXiv Detail & Related papers (2021-04-18T11:27:08Z)
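The supervised SimCSE variant above describes an InfoNCE-style objective with entailment positives and contradiction hard negatives. The following minimal sketch illustrates that kind of objective; the function name, the batch layout of anchors, positives, and negatives, and the temperature value are illustrative assumptions rather than SimCSE's reference implementation.

import torch
import torch.nn.functional as F

def supervised_simcse_style_loss(h, h_pos, h_neg, temperature=0.05):
    # h, h_pos, h_neg: [batch, dim] embeddings of anchor sentences, their
    # "entailment" positives, and their "contradiction" hard negatives
    # (hypothetical batch layout for illustration).
    h = F.normalize(h, dim=1)
    h_pos = F.normalize(h_pos, dim=1)
    h_neg = F.normalize(h_neg, dim=1)
    # Cosine similarities of each anchor to all positives and all hard negatives.
    logits = torch.cat([h @ h_pos.t(), h @ h_neg.t()], dim=1) / temperature
    # The matching positive for anchor i sits at column i of the first block.
    targets = torch.arange(h.size(0), device=h.device)
    return F.cross_entropy(logits, targets)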
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.