Understanding Contrastive Representation Learning through Alignment and
Uniformity on the Hypersphere
- URL: http://arxiv.org/abs/2005.10242v10
- Date: Tue, 10 Nov 2020 07:05:17 GMT
- Title: Understanding Contrastive Representation Learning through Alignment and
Uniformity on the Hypersphere
- Authors: Tongzhou Wang, Phillip Isola
- Abstract summary: We identify two key properties related to the contrastive loss.
We prove that the contrastive loss optimizes these properties, and analyze their positive effects on downstream tasks.
Remarkably, directly optimizing for these two metrics leads to representations with comparable or better performance at downstream tasks than contrastive learning.
- Score: 32.0469500831667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive representation learning has been outstandingly successful in
practice. In this work, we identify two key properties related to the
contrastive loss: (1) alignment (closeness) of features from positive pairs,
and (2) uniformity of the induced distribution of the (normalized) features on
the hypersphere. We prove that, asymptotically, the contrastive loss optimizes
these properties, and analyze their positive effects on downstream tasks.
Empirically, we introduce an optimizable metric to quantify each property.
Extensive experiments on standard vision and language datasets confirm the
strong agreement between both metrics and downstream task performance.
Remarkably, directly optimizing for these two metrics leads to representations
with comparable or better performance at downstream tasks than contrastive
learning.
Project Page: https://tongzhouwang.info/hypersphere
Code: https://github.com/SsnL/align_uniform ,
https://github.com/SsnL/moco_align_uniform
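The two properties above have simple closed forms: alignment is the expected distance between positive-pair features, and uniformity is the log of the average pairwise Gaussian potential over all features. Below is a minimal PyTorch sketch consistent with the definitions in the abstract; the function names and the default parameters alpha = 2 and t = 2 are illustrative assumptions, not an excerpt from the official repository.

```python
import torch
import torch.nn.functional as F

def align_loss(x, y, alpha=2):
    # x, y: (N, D) L2-normalized features of N positive pairs.
    # Alignment: E[ ||f(x) - f(y)||_2^alpha ] over positive pairs.
    return (x - y).norm(p=2, dim=1).pow(alpha).mean()

def uniform_loss(x, t=2):
    # x: (N, D) L2-normalized features.
    # Uniformity: log E[ exp(-t * ||f(x) - f(y)||_2^2) ] over all pairs.
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()

# Toy usage: directly optimizing the two metrics instead of a contrastive loss.
f_x = F.normalize(torch.randn(128, 32), dim=1)
f_y = F.normalize(f_x + 0.1 * torch.randn(128, 32), dim=1)
loss = align_loss(f_x, f_y) + (uniform_loss(f_x) + uniform_loss(f_y)) / 2
```

Directly minimizing a weighted sum of these two terms is what the abstract means by "directly optimizing for these two metrics".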
Related papers
- Decoupled Contrastive Learning for Long-Tailed Recognition [58.255966442426484]
Supervised Contrastive Loss (SCL) is popular in visual representation learning.
In the scenario of long-tailed recognition, where the number of samples in each class is imbalanced, treating the two types of positive samples equally leads to biased optimization of the intra-category distance.
We propose a patch-based self-distillation method to transfer knowledge from head to tail classes and relieve the under-representation of tail classes.
arXiv Detail & Related papers (2024-03-10T09:46:28Z)
- Perfect Alignment May be Poisonous to Graph Contrastive Learning [15.668610380413682]
Graph Contrastive Learning (GCL) aims to learn node representations by aligning positive pairs and separating negative ones.
This paper seeks to establish a connection between augmentation and downstream performance.
arXiv Detail & Related papers (2023-10-06T02:22:49Z)
- Hodge-Aware Contrastive Learning [101.56637264703058]
Simplicial complexes prove effective in modeling data with multiway dependencies.
We develop a contrastive self-supervised learning approach for processing simplicial data.
arXiv Detail & Related papers (2023-09-14T00:40:07Z)
- Single-Pass Contrastive Learning Can Work for Both Homophilic and Heterophilic Graph [60.28340453547902]
Graph contrastive learning (GCL) techniques typically require two forward passes for a single instance to construct the contrastive loss.
Existing GCL approaches fail to provide strong performance guarantees.
We introduce the Single-Pass Graph Contrastive Learning method (SP-GCL).
Empirically, the features learned by SP-GCL can match or outperform existing strong baselines with significantly less computational overhead.
arXiv Detail & Related papers (2022-11-20T07:18:56Z)
- Generate, Discriminate and Contrast: A Semi-Supervised Sentence Representation Learning Framework [68.04940365847543]
We propose a semi-supervised sentence embedding framework, GenSE, that effectively leverages large-scale unlabeled data.
Our method consists of three parts: 1) Generate: a generator/discriminator model is jointly trained to synthesize sentence pairs from an open-domain unlabeled corpus; 2) Discriminate: noisy sentence pairs are filtered out by the discriminator to acquire high-quality positive and negative sentence pairs; 3) Contrast: a prompt-based contrastive approach is presented for sentence representation learning with both annotated and synthesized data.
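As a rough illustration of that three-stage flow, the sketch below wires the stages together; `generator`, `discriminator`, and `train_contrastive` are hypothetical placeholders standing in for GenSE's components, not its actual API.

```python
# Hypothetical generate -> discriminate -> contrast pipeline sketch.
def gense_style_pipeline(unlabeled_sentences, annotated_pairs,
                         generator, discriminator, train_contrastive,
                         threshold=0.5):
    # 1) Generate: synthesize candidate sentence pairs from unlabeled text.
    candidates = [generator(s) for s in unlabeled_sentences]
    # 2) Discriminate: keep only pairs the discriminator deems reliable.
    synthesized = [p for p in candidates if discriminator(p) >= threshold]
    # 3) Contrast: train the sentence encoder on annotated + synthesized pairs.
    return train_contrastive(annotated_pairs + synthesized)
```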
arXiv Detail & Related papers (2022-10-30T10:15:21Z)
- Correlation between Alignment-Uniformity and Performance of Dense Contrastive Representations [11.266613717084788]
We analyze the theoretical ideas of dense contrastive learning using a standard CNN and a straightforward feature-matching scheme.
We discover the core principle for constructing a positive pair of dense features and empirically prove its validity.
We also introduce a new scalar metric that summarizes the correlation between alignment-uniformity and downstream performance.
arXiv Detail & Related papers (2022-10-17T08:08:37Z)
- Non-contrastive representation learning for intervals from well logs [58.70164460091879]
The representation learning problem in the oil & gas industry aims to construct a model that provides a representation of a well interval based on logging data.
One possible approach is self-supervised learning (SSL).
We are the first to introduce non-contrastive SSL for well-logging data.
arXiv Detail & Related papers (2022-09-28T13:27:10Z)
- Positive-Negative Equal Contrastive Loss for Semantic Segmentation [8.664491798389662]
Previous works commonly design plug-and-play modules and structural losses to effectively extract and aggregate the global context.
We propose Positive-Negative Equal contrastive loss (PNE loss), which increases the latent impact of positive embedding on the anchor and treats the positive as well as negative sample pairs equally.
We conduct comprehensive experiments and achieve state-of-the-art performance on two benchmark datasets.
arXiv Detail & Related papers (2022-07-04T13:51:29Z)
- Improving Contrastive Learning by Visualizing Feature Transformation [37.548120912055595]
In this paper, we devise a feature-level data manipulation, distinct from data augmentation, to enhance generic contrastive self-supervised learning.
We first design a visualization scheme for the distribution of pos/neg scores (the similarity of positive/negative pairs), which enables us to analyze, interpret and understand the learning process.
Experimental results show that the proposed Feature Transformation improves accuracy by at least 6.0% on ImageNet-100 over the MoCo baseline, and by about 2.0% on ImageNet-1K over the MoCoV2 baseline.
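Since the pos/neg score is just the similarity of a positive/negative pair, collecting the two score distributions for inspection takes only a few lines; the sketch below is one plausible way to gather them, and the exact visualization scheme in that paper may differ.

```python
import torch
import torch.nn.functional as F

def pos_neg_scores(z1, z2):
    # z1, z2: (N, D) features of two augmented views of the same N instances.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t()                                   # (N, N) cosine similarities
    pos = sim.diag()                                    # same-instance (positive) pairs
    neg = sim[~torch.eye(len(sim), dtype=torch.bool)]   # cross-instance (negative) pairs
    return pos, neg  # e.g. plot histograms of both to inspect the distributions
```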
arXiv Detail & Related papers (2021-08-06T07:26:08Z)
- Provable Guarantees for Self-Supervised Deep Learning with Spectral Contrastive Loss [72.62029620566925]
Recent works in self-supervised learning have advanced the state-of-the-art by relying on the contrastive learning paradigm.
Our work analyzes contrastive learning without assuming conditional independence of positive pairs.
We propose a loss that performs spectral decomposition on the population augmentation graph and can be succinctly written as a contrastive learning objective.
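For reference, the population form of that objective is usually written as L(f) = -2 E_{x,x+}[f(x)^T f(x+)] + E_{x,x'}[(f(x)^T f(x'))^2]; the minibatch sketch below approximates it using cross-view pairs and is an illustration rather than the authors' implementation.

```python
import torch

def spectral_contrastive_loss(z1, z2):
    # z1, z2: (N, D) features of two augmented views (positive pairs).
    # Minibatch estimate of -2 E[f(x)^T f(x+)] + E[(f(x)^T f(x'))^2].
    pos = (z1 * z2).sum(dim=1).mean()      # positive-pair inner products
    neg = (z1 @ z2.t()).pow(2).mean()      # squared inner products over all pairs
    return -2 * pos + neg
```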
arXiv Detail & Related papers (2021-06-08T07:41:02Z)
- Dissecting Supervised Contrastive Learning [24.984074794337157]
Minimizing cross-entropy over the softmax scores of a linear map composed with a high-capacity encoder is arguably the most popular choice for training neural networks on supervised learning tasks.
We show that one can directly optimize the encoder instead, to obtain equally (or even more) discriminative representations via a supervised variant of a contrastive objective.
arXiv Detail & Related papers (2021-02-17T15:22:38Z)
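The supervised variant of a contrastive objective mentioned in that last entry typically treats every same-class sample in the batch as a positive for the anchor; the sketch below shows one standard SupCon-style formulation and is not necessarily the exact objective analyzed in that paper.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(z, labels, temperature=0.1):
    # z: (N, D) encoder outputs, labels: (N,) integer class labels.
    z = F.normalize(z, dim=1)
    logits = z @ z.t() / temperature                    # (N, N) scaled similarities
    self_mask = torch.eye(len(z), dtype=torch.bool)
    # Exclude each anchor from its own softmax denominator.
    denom = logits.exp().masked_fill(self_mask, 0.0).sum(dim=1, keepdim=True)
    log_prob = logits - denom.log()
    pos_mask = ((labels[:, None] == labels[None, :]) & ~self_mask).float()
    # Average log-probability over each anchor's positives, then over anchors.
    per_anchor = (log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return -per_anchor.mean()
```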
This list is automatically generated from the titles and abstracts of the papers on this site.