Understanding and Improving the Role of Projection Head in
Self-Supervised Learning
- URL: http://arxiv.org/abs/2212.11491v1
- Date: Thu, 22 Dec 2022 05:42:54 GMT
- Title: Understanding and Improving the Role of Projection Head in
Self-Supervised Learning
- Authors: Kartik Gupta, Thalaiyasingam Ajanthan, Anton van den Hengel, Stephen
Gould
- Abstract summary: Self-supervised learning (SSL) aims to produce useful feature representations without access to human-labeled data annotations.
Current contrastive learning approaches append a parametrized projection head to the end of some backbone network to optimize the InfoNCE objective.
This raises a fundamental question: Why is a learnable projection head required if we are to discard it after training?
- Score: 77.59320917894043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning (SSL) aims to produce useful feature representations
without access to any human-labeled data annotations. Due to the success of
recent SSL methods based on contrastive learning, such as SimCLR, this problem
has gained popularity. Most current contrastive learning approaches append a
parametrized projection head to the end of some backbone network to optimize
the InfoNCE objective and then discard the learned projection head after
training. This raises a fundamental question: Why is a learnable projection
head required if we are to discard it after training? In this work, we first
perform a systematic study on the behavior of SSL training focusing on the role
of the projection head layers. By formulating the projection head as a
parametric component for the InfoNCE objective rather than a part of the
network, we present an alternative optimization scheme for training contrastive
learning based SSL frameworks. Our experimental study on multiple image
classification datasets demonstrates the effectiveness of the proposed approach
over alternatives in the SSL literature.
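To make the setup described above concrete, here is a minimal sketch of the standard SimCLR-style recipe the abstract refers to: a backbone with an appended MLP projection head, trained with the InfoNCE (NT-Xent) objective, and the head discarded after training. It assumes PyTorch and torchvision; names such as ProjectionHead and nt_xent_loss are illustrative, and the sketch does not attempt to reproduce the paper's proposed alternative optimization scheme.

```python
# Minimal SimCLR-style sketch (assumed setup, not the paper's code):
# backbone + parametrized projection head optimized with InfoNCE, head discarded afterwards.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


class ProjectionHead(nn.Module):
    """MLP head appended to the backbone for SSL training and thrown away afterwards."""
    def __init__(self, in_dim=512, hidden_dim=512, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, h):
        return self.net(h)


def nt_xent_loss(z1, z2, temperature=0.5):
    """InfoNCE / NT-Xent over two augmented views: matching rows of z1 and z2 are
    positives; every other sample in the batch serves as a negative."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D), unit-norm
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    n = z1.size(0)
    sim.masked_fill_(torch.eye(2 * n, dtype=torch.bool, device=z.device), float("-inf"))
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)


backbone = torchvision.models.resnet18(weights=None)
backbone.fc = nn.Identity()                              # expose 512-d features h = f(x)
head = ProjectionHead(in_dim=512)
optimizer = torch.optim.Adam(list(backbone.parameters()) + list(head.parameters()), lr=1e-3)

# One illustrative step on random tensors standing in for two augmented views of a batch.
x1, x2 = torch.randn(4, 3, 96, 96), torch.randn(4, 3, 96, 96)
optimizer.zero_grad()
loss = nt_xent_loss(head(backbone(x1)), head(backbone(x2)))
loss.backward()
optimizer.step()

# After SSL training the projection head is discarded; downstream tasks use backbone(x) only.
```

Downstream evaluation (e.g., a linear probe) is then run on backbone(x) alone, which is precisely what motivates the question of why the head is needed during training at all.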
Related papers
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud
Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework that disentangles feature representation learning and the classifier in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z) - Progressive Feature Adjustment for Semi-supervised Learning from
Pretrained Models [39.42802115580677]
Semi-supervised learning (SSL) can leverage both labeled and unlabeled data to build a predictive model.
Recent literature suggests that naively applying state-of-the-art SSL with a pretrained model fails to unleash the full potential of training data.
We propose using pseudo-labels from the unlabeled data to update the feature extractor in a way that is less sensitive to incorrect labels.
arXiv Detail & Related papers (2023-09-09T01:57:14Z) - Towards the Sparseness of Projection Head in Self-Supervised Learning [13.308675583018756]
We provide insights into the internal mechanisms of the projection head and its relationship with the phenomenon of dimensional collapse.
We introduce SparseHead, a regularization term that effectively constrains the sparsity of the projection head and can be seamlessly integrated with any self-supervised learning (SSL) approach (a hedged sketch of this kind of sparsity penalty appears after this list).
arXiv Detail & Related papers (2023-07-18T01:16:23Z) - Augmentation-aware Self-supervised Learning with Conditioned Projector [6.720605329045581]
Self-supervised learning (SSL) is a powerful technique for learning from unlabeled data.
We propose to foster sensitivity to augmentation characteristics in the representation space by conditioning the projector network on augmentation information.
Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods.
arXiv Detail & Related papers (2023-05-31T12:24:06Z) - Rethinking Self-Supervised Visual Representation Learning in
Pre-training for 3D Human Pose and Shape Estimation [57.206129938611454]
Self-supervised representation learning (SSL) methods have outperformed ImageNet classification pre-training for vision tasks such as object detection.
We empirically study and analyze the effects of SSL and compare it with other pre-training alternatives for 3DHPSE.
Our observations challenge the naive application of current SSL pre-training to 3DHPSE and renew attention to the value of other data types for pre-training.
arXiv Detail & Related papers (2023-03-09T16:17:52Z) - Deciphering the Projection Head: Representation Evaluation
Self-supervised Learning [6.375931203397043]
Self-supervised learning (SSL) aims to learn intrinsic features without labels.
The projection head always plays an important role in improving downstream task performance.
We propose a Representation Evaluation Design (RED) for SSL models, in which a shortcut connection is built between the representation and the projection vectors.
arXiv Detail & Related papers (2023-01-28T13:13:53Z) - Improving Self-Supervised Learning by Characterizing Idealized
Representations [155.1457170539049]
We prove necessary and sufficient conditions for representations to support any task that is invariant to the given data augmentations.
For contrastive learning, our framework prescribes simple but significant improvements to previous methods.
For non-contrastive learning, we use our framework to derive a simple and novel objective.
arXiv Detail & Related papers (2022-09-13T18:01:03Z) - Learning Where to Learn in Cross-View Self-Supervised Learning [54.14989750044489]
Self-supervised learning (SSL) has made enormous progress and largely narrowed the gap with its supervised counterparts.
Current methods simply aggregate pixels uniformly when forming the embedding.
We present a new approach, Learning Where to Learn (LEWEL), to adaptively aggregate spatial information of features.
arXiv Detail & Related papers (2022-03-28T17:02:42Z) - UniSpeech-SAT: Universal Speech Representation Learning with Speaker
Aware Pre-Training [72.004873454347]
Two methods are introduced to enhance unsupervised speaker information extraction.
Experiment results on SUPERB benchmark show that the proposed system achieves state-of-the-art performance.
We scale up the training dataset to 94 thousand hours of public audio data and achieve further performance improvements.
arXiv Detail & Related papers (2021-10-12T05:43:30Z) - STDP enhances learning by backpropagation in a spiking neural network [0.0]
The proposed method improves accuracy without additional labeling when only a small amount of labeled data is available.
The proposed learning method can also be implemented in event-driven systems.
arXiv Detail & Related papers (2021-02-21T06:55:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.