Towards the Sparseness of Projection Head in Self-Supervised Learning
- URL: http://arxiv.org/abs/2307.08913v2
- Date: Wed, 19 Jul 2023 14:18:00 GMT
- Title: Towards the Sparseness of Projection Head in Self-Supervised Learning
- Authors: Zeen Song, Xingzhe Su, Jingyao Wang, Wenwen Qiang, Changwen Zheng,
Fuchun Sun
- Abstract summary: We provide insights into the internal mechanisms of the projection head and its relationship with the phenomenon of dimensional collapse.
We introduce SparseHead - a regularization term that effectively constrains the sparsity of the projection head and can be seamlessly integrated with any self-supervised learning (SSL) approach.
- Score: 13.308675583018756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, self-supervised learning (SSL) has emerged as a promising
approach for extracting valuable representations from unlabeled data. One
successful SSL method is contrastive learning, which aims to bring positive
examples closer while pushing negative examples apart. Many current contrastive
learning approaches utilize a parameterized projection head. Through a
combination of empirical analysis and theoretical investigation, we provide
insights into the internal mechanisms of the projection head and its
relationship with the phenomenon of dimensional collapse. Our findings
demonstrate that the projection head enhances the quality of representations by
applying the contrastive loss in a projected subspace. Therefore, we hypothesize
that only a subset of features is necessary to minimize the contrastive loss of
a mini-batch of data. Theoretical analysis further suggests
that a sparse projection head can enhance generalization, leading us to
introduce SparseHead - a regularization term that effectively constrains the
sparsity of the projection head and can be seamlessly integrated with any
SSL approach. Our experimental results validate
the effectiveness of SparseHead, demonstrating its ability to improve the
performance of existing contrastive methods.
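As an illustration of the idea in the abstract, the sketch below pairs a SimCLR-style projection head and InfoNCE loss with an L1 penalty on the head's weight matrices. The penalty form, the `sparse_penalty` helper, and the coefficient `lambda_sparse` are illustrative assumptions, not the paper's exact SparseHead regularizer.

```python
# Minimal sketch (assumption: an L1 penalty on the projection-head weights
# stands in for the paper's SparseHead regularizer, whose exact form may differ).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProjectionHead(nn.Module):
    """Two-layer MLP head of the kind appended to a backbone in SimCLR-style SSL."""
    def __init__(self, dim_in=2048, dim_hidden=2048, dim_out=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, dim_hidden),
            nn.ReLU(inplace=True),
            nn.Linear(dim_hidden, dim_out),
        )

    def forward(self, h):
        return self.net(h)

def info_nce(z1, z2, temperature=0.5):
    """Contrastive (NT-Xent) loss between two augmented views of the same batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                       # (2N, d)
    sim = z @ z.t() / temperature                        # pairwise cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))                # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)                 # positive = the other view

def sparse_penalty(head: nn.Module):
    """L1 norm of the head's weight matrices (biases excluded); illustrative sparsity term."""
    return sum(p.abs().sum() for p in head.parameters() if p.dim() > 1)

# One training step, assuming a backbone `f` and two augmented views `x1`, `x2`:
#   h1, h2 = f(x1), f(x2)
#   z1, z2 = head(h1), head(h2)
#   loss = info_nce(z1, z2) + lambda_sparse * sparse_penalty(head)
```

Because the penalty is added to the loss rather than to the architecture, a term of this kind can be dropped into any SSL pipeline that already uses a projection head, which mirrors the plug-in claim made for SparseHead.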
Related papers
- Preventing Collapse in Contrastive Learning with Orthonormal Prototypes (CLOP) [0.0]
CLOP is a novel semi-supervised loss function designed to prevent neural collapse by promoting the formation of linear subspaces among class embeddings.
We show that CLOP enhances performance, providing greater stability across different learning rates and batch sizes.
arXiv Detail & Related papers (2024-03-27T15:48:16Z) - The Common Stability Mechanism behind most Self-Supervised Learning
Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z) - A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z) - Understanding and Improving the Role of Projection Head in
Self-Supervised Learning [77.59320917894043]
Self-supervised learning (SSL) aims to produce useful feature representations without access to human-labeled data annotations.
Current contrastive learning approaches append a parametrized projection head to the end of some backbone network to optimize the InfoNCE objective.
This raises a fundamental question: Why is a learnable projection head required if we are to discard it after training? (A minimal sketch of this discard-and-probe protocol appears after the related-papers list below.)
arXiv Detail & Related papers (2022-12-22T05:42:54Z) - Rethinking Prototypical Contrastive Learning through Alignment,
Uniformity and Correlation [24.794022951873156]
We propose to learn Prototypical representations through Alignment, Uniformity and Correlation (PAUC).
Specifically, the ordinary ProtoNCE loss is revised with: (1) an alignment loss that pulls embeddings from positive prototypes together; (2) a uniformity loss that distributes the prototypical-level features uniformly; (3) a correlation loss that increases the diversity and discriminability between prototypical-level features.
arXiv Detail & Related papers (2022-10-18T22:33:12Z) - A Low Rank Promoting Prior for Unsupervised Contrastive Learning [108.91406719395417]
We construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning.
Our hypothesis explicitly requires that all the samples belonging to the same instance class lie in the same low-dimensional subspace.
Empirical evidence shows that the proposed algorithm clearly surpasses state-of-the-art approaches on multiple benchmarks.
arXiv Detail & Related papers (2021-08-05T15:58:25Z) - Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z) - Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
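The "Understanding and Improving the Role of Projection Head in Self-Supervised Learning" entry above refers to the standard practice of appending a projection head for pre-training and discarding it afterwards. The sketch below shows that discard-and-probe evaluation under common assumptions (a frozen backbone that outputs flat features, a single linear probe); the names `backbone` and `train_loader` are placeholders, and the cited paper's own protocol may differ.

```python
# Minimal sketch of the usual evaluation protocol: the projection head used
# during contrastive pre-training is discarded, and a linear probe is trained
# on frozen backbone features. `backbone` and `train_loader` are placeholders.
import torch
import torch.nn as nn

def linear_probe(backbone: nn.Module, train_loader, num_classes: int,
                 feat_dim: int = 2048, epochs: int = 10, lr: float = 1e-3):
    backbone.eval()                                  # freeze the encoder
    for p in backbone.parameters():
        p.requires_grad_(False)

    probe = nn.Linear(feat_dim, num_classes)         # the only trainable module
    opt = torch.optim.Adam(probe.parameters(), lr=lr)

    for _ in range(epochs):
        for x, y in train_loader:
            with torch.no_grad():
                h = backbone(x)                      # backbone features only, no head
            loss = nn.functional.cross_entropy(probe(h), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return probe
```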
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.