Which Features are Learnt by Contrastive Learning? On the Role of
Simplicity Bias in Class Collapse and Feature Suppression
- URL: http://arxiv.org/abs/2305.16536v2
- Date: Mon, 29 May 2023 00:40:01 GMT
- Title: Which Features are Learnt by Contrastive Learning? On the Role of
Simplicity Bias in Class Collapse and Feature Suppression
- Authors: Yihao Xue, Siddharth Joshi, Eric Gan, Pin-Yu Chen, Baharan
Mirzasoleiman
- Abstract summary: Contrastive learning (CL) has emerged as a powerful technique for representation learning, with or without label supervision.
We provide the first unified theoretically rigorous framework to determine which features are learnt by CL.
We present increasing embedding dimensionality and improving the quality of data augmentations as two theoretically motivated solutions.
- Score: 59.97965005675144
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Contrastive learning (CL) has emerged as a powerful technique for
representation learning, with or without label supervision. However, supervised
CL is prone to collapsing representations of subclasses within a class by not
capturing all their features, and unsupervised CL may suppress harder
class-relevant features by focusing on learning easy class-irrelevant features;
both significantly compromise representation quality. Yet, there is no
theoretical understanding of \textit{class collapse} or \textit{feature
suppression} at \textit{test} time. We provide the first unified theoretically
rigorous framework to determine \textit{which} features are learnt by CL. Our
analysis indicates that, perhaps surprisingly, the bias of (stochastic) gradient
descent towards finding simpler solutions is a key factor in collapsing
subclass representations and suppressing harder class-relevant features.
Moreover, we present increasing embedding dimensionality and improving the
quality of data augmentations as two theoretically motivated solutions to
\textit{feature suppression}. We also provide the first theoretical explanation for
why employing supervised and unsupervised CL together yields higher-quality
representations, even when using commonly-used stochastic gradient methods.
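The last claim, that supervised and unsupervised CL combine into higher-quality representations, is easy to picture as a weighted sum of the two standard objectives. Below is a minimal, illustrative PyTorch sketch, not the authors' code: `nt_xent` is the usual unsupervised InfoNCE/NT-Xent loss over two augmented views, `sup_con` is a SupCon-style loss treating same-label samples as positives, and the weight `lam` is a hypothetical knob. Per the paper's proposed fixes for feature suppression, one would additionally raise the embedding dimension of the encoder producing `z1`, `z2` and strengthen the augmentations behind the two views.

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, tau=0.5):
    """Unsupervised NT-Xent loss over two augmented views (SimCLR-style)."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)        # (2N, d) unit vectors
    sim = z @ z.t() / tau                              # scaled cosine similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool), float('-inf'))
    # positives: view i pairs with i+n (and i+n with i)
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)])
    return F.cross_entropy(sim, targets)

def sup_con(z, labels, tau=0.5):
    """SupCon-style loss: all same-label samples are positives."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau
    self_mask = torch.eye(z.size(0), dtype=torch.bool)
    sim.masked_fill_(self_mask, float('-inf'))         # exclude self-pairs
    log_prob = sim - sim.logsumexp(dim=1, keepdim=True)
    pos = (labels[:, None] == labels[None, :]) & ~self_mask
    # mean log-likelihood over each anchor's positives
    return -((log_prob * pos).sum(1) / pos.sum(1).clamp_min(1)).mean()

def joint_cl_loss(z1, z2, labels, lam=0.5):
    """Hypothetical joint objective: lam trades off the two losses."""
    return lam * sup_con(torch.cat([z1, z2]), labels.repeat(2)) \
        + (1 - lam) * nt_xent(z1, z2)
```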
Related papers
- Fine-Grained Representation Learning via Multi-Level Contrastive Learning without Class Priors [3.050634053489509]
Contrastive Disentangling (CD) is a framework designed to learn representations without relying on class priors.
CD integrates instance-level and feature-level contrastive losses with a normalized entropy loss to capture semantically rich and fine-grained representations (the entropy term is sketched after this entry).
arXiv Detail & Related papers (2024-09-07T16:39:14Z)
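The normalized entropy loss is the one ingredient here concrete enough to sketch. A minimal, assumed PyTorch version (illustrative only, not the paper's implementation): given soft assignment probabilities `probs`, it penalizes low entropy of the batch-averaged assignment, pushing usage of the `k` slots toward uniform.

```python
import math
import torch

def normalized_entropy_loss(probs: torch.Tensor) -> torch.Tensor:
    """probs: (batch, k) soft assignments summing to 1 along dim 1.

    Returns 1 - H(mean assignment) / log(k), so minimizing this loss
    maximizes the normalized entropy (uniform usage of the k slots).
    """
    p_mean = probs.mean(dim=0)                          # average over the batch
    h = -(p_mean * p_mean.clamp_min(1e-8).log()).sum()  # entropy in nats
    return 1.0 - h / math.log(probs.size(1))            # normalized to [0, 1]
```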
- Subclass-balancing Contrastive Learning for Long-tailed Recognition [38.31221755013738]
Long-tailed recognition with imbalanced class distribution naturally emerges in practical machine learning applications.
We propose a novel "subclass-balancing contrastive learning" (SBCL) approach that clusters each head class into multiple subclasses comparable in size to the tail classes (see the clustering sketch after this entry).
We evaluate SBCL on a suite of long-tailed benchmark datasets, where it achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-06-28T05:08:43Z)
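A minimal sketch of the subclass-splitting idea described above (assumed, not the authors' code): cluster each head class's embeddings into roughly tail-sized subclasses with k-means, then use the subclass ids as balanced pseudo-labels for a supervised contrastive loss.

```python
import numpy as np
from sklearn.cluster import KMeans

def split_into_subclasses(embeddings, labels, tail_size):
    """Re-label head classes as subclasses of roughly tail_size samples.

    embeddings: (n, d) array; labels: (n,) int array.
    Returns subclass ids usable as balanced pseudo-labels.
    """
    subclass = np.zeros(len(labels), dtype=int)
    next_id = 0
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        k = max(1, round(len(idx) / tail_size))  # number of subclasses
        if k == 1:
            subclass[idx] = next_id              # tail class: keep as one group
        else:
            km = KMeans(n_clusters=k, n_init=10).fit(embeddings[idx])
            subclass[idx] = next_id + km.labels_
        next_id += k
    return subclass
```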
- Triplet Contrastive Learning for Unsupervised Vehicle Re-identification [55.445358749042384]
Part feature learning is a critical technology for fine semantic understanding in vehicle re-identification.
We propose a novel Triplet Contrastive Learning framework (TCL) which leverages cluster features to bridge the part features and global features.
arXiv Detail & Related papers (2023-01-23T15:52:12Z)
- Alignment-Uniformity aware Representation Learning for Zero-shot Video Classification [3.6954802719347413]
This paper presents an end-to-end framework that preserves alignment and uniformity properties for representations of both seen and unseen classes (the standard alignment and uniformity losses are sketched after this entry).
Experiments show that our method significantly outperforms SoTA by relative improvements of 28.1% on UCF101 and 27.0% on HMDB51.
arXiv Detail & Related papers (2022-03-29T09:21:22Z)
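For context, the alignment and uniformity objectives referenced in the title are usually written as in Wang & Isola (2020); a short PyTorch rendering, assuming L2-normalized embeddings (this paper's exact variants may differ):

```python
import torch

def align_loss(x, y, alpha=2):
    """Alignment: positive pairs (x_i, y_i) should map close together."""
    return (x - y).norm(dim=1).pow(alpha).mean()

def uniform_loss(x, t=2):
    """Uniformity: embeddings should spread over the unit hypersphere."""
    return torch.pdist(x, p=2).pow(2).mul(-t).exp().mean().log()
```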
- Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework that is sensitive to data labels.
When updated on new-class data, they suffer from catastrophic forgetting: the model can no longer clearly distinguish old-class data from new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)
- Learning Debiased and Disentangled Representations for Semantic Segmentation [52.35766945827972]
We propose a model-agnostic training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network (a minimal nearest-prototype sketch follows this entry).
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10-LT, CIFAR-100-LT and WebVision datasets, observing that Prototypical obtains substantial improvements over the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
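A minimal sketch of nearest-prototype classification as described above (assumptions: embeddings are precomputed and Euclidean distance is used; the paper's calibration details may differ):

```python
import torch

def class_prototypes(emb, labels, num_classes):
    """Prototype per class = mean embedding of that class's training samples."""
    protos = torch.zeros(num_classes, emb.size(1))
    for c in range(num_classes):
        protos[c] = emb[labels == c].mean(dim=0)
    return protos

def predict(emb, protos):
    """Assign each sample to the class with the nearest prototype."""
    dists = torch.cdist(emb, protos)   # (n, num_classes) Euclidean distances
    return dists.argmin(dim=1)
```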
- Exploiting a Joint Embedding Space for Generalized Zero-Shot Semantic Segmentation [25.070027668717422]
Generalized zero-shot semantic segmentation (GZS3) predicts pixel-wise semantic labels for seen and unseen classes.
Most GZS3 methods adopt a generative approach that synthesizes visual features of unseen classes from corresponding semantic ones.
We propose a discriminative approach that addresses these limitations in a unified framework.
arXiv Detail & Related papers (2021-08-14T13:33:58Z)
- Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification [94.55805516167369]
We propose a new approach for binary classification from $m$ U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z)