On neural and dimensional collapse in supervised and unsupervised
contrastive learning with hard negative sampling
- URL: http://arxiv.org/abs/2311.05139v1
- Date: Thu, 9 Nov 2023 04:40:32 GMT
- Title: On neural and dimensional collapse in supervised and unsupervised
contrastive learning with hard negative sampling
- Authors: Ruijie Jiang, Thuan Nguyen, Shuchin Aeron, Prakash Ishwar
- Abstract summary: We prove that representations that exhibit Neural Collapse (NC) minimize Supervised Contrastive Learning (SCL), Hard-SCL (HSCL), and Unsupervised Contrastive Learning (UCL) risks.
We also prove that for any representation mapping, the HSCL and Hard-UCL (HUCL) risks are lower bounded by the corresponding SCL and UCL risks.
- Score: 17.94266316310016
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For a widely-studied data model and general loss and sample-hardening
functions we prove that the Supervised Contrastive Learning (SCL), Hard-SCL
(HSCL), and Unsupervised Contrastive Learning (UCL) risks are minimized by
representations that exhibit Neural Collapse (NC), i.e., the class means form
an Equiangular Tight Frame (ETF) and data from the same class are mapped to
the same representation. We also prove that for any representation mapping, the
HSCL and Hard-UCL (HUCL) risks are lower bounded by the corresponding SCL and
UCL risks. Although the optimality of ETF is known for SCL, albeit only for
InfoNCE loss, its optimality for HSCL and UCL under general loss and hardening
functions is novel. Moreover, our proofs are much simpler, more compact, and more
transparent. We empirically demonstrate, for the first time, that ADAM
optimization of HSCL and HUCL risks with random initialization and suitable
hardness levels can indeed converge to the NC geometry if we incorporate
unit-ball or unit-sphere feature normalization. Without incorporating hard
negatives or feature normalization, however, the representations learned via
ADAM suffer from dimensional collapse (DC) and fail to attain the NC geometry.
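The NC geometry described in the abstract can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it assumes the simplex-ETF convention common in the neural collapse literature, in which K unit-norm class means have pairwise cosine similarity -1/(K-1). The function names `simplex_etf` and `max_etf_deviation` are hypothetical helpers introduced here.

```python
import numpy as np

def simplex_etf(num_classes: int) -> np.ndarray:
    """Return a (K, K) array whose rows are unit-norm simplex-ETF vectors."""
    k = num_classes
    # Subtract the mean of the standard basis vectors, then rescale each
    # row to unit length.
    centered = np.eye(k) - np.ones((k, k)) / k
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)

def max_etf_deviation(class_means: np.ndarray) -> float:
    """Largest gap between pairwise cosines and the ETF value -1/(K-1)."""
    k = class_means.shape[0]
    unit = class_means / np.linalg.norm(class_means, axis=1, keepdims=True)
    cosines = unit @ unit.T
    # Keep only the off-diagonal entries (pairs of distinct classes).
    off_diagonal = cosines[~np.eye(k, dtype=bool)]
    return float(np.max(np.abs(off_diagonal + 1.0 / (k - 1))))

# Assumed toy setting: K = 4 classes.
etf_means = simplex_etf(4)
etf_gap = max_etf_deviation(etf_means)
```

For class means estimated from a trained encoder, a deviation close to zero would be consistent with the ETF part of NC under this convention. Means that all point in the same direction give a large deviation, which is one way a collapsed geometry could show up in this check.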
Related papers
- Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks.
This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks.
We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
(2024-02-03T14:20:20Z)
- Uncertainty-guided Boundary Learning for Imbalanced Social Event Detection [64.4350027428928]
We propose a novel uncertainty-guided class imbalance learning framework for imbalanced social event detection tasks.
Our model significantly improves social event representation and classification tasks in almost all classes, especially those uncertain ones.
(2023-10-30T03:32:04Z)
- Symmetric Neural-Collapse Representations with Supervised Contrastive Loss: The Impact of ReLU and Batching [26.994954303270575]
Supervised contrastive loss (SCL) is a competitive and often superior alternative to the cross-entropy loss for classification.
While prior studies have demonstrated that both losses yield symmetric training representations under balanced data, this symmetry breaks under class imbalances.
This paper presents an intriguing discovery: the introduction of a ReLU activation at the final layer effectively restores the symmetry in SCL-learned representations.
(2023-06-13T17:55:39Z)
- Enhancing Adversarial Contrastive Learning via Adversarial Invariant Regularization [59.77647907277523]
Adversarial contrastive learning (ACL) is a technique that enhances standard contrastive learning (SCL).
In this paper, we propose adversarial invariant regularization (AIR) to enforce independence from style factors.
(2023-04-30T03:12:21Z)
- Supervised Contrastive Learning with Hard Negative Samples [16.42457033976047]
Contrastive learning (CL) learns a useful representation function by pulling positive samples close to each other.
In the absence of class information, negative samples are chosen randomly and independently of the anchor.
Supervised CL (SCL) avoids this class collision by conditioning the negative sampling distribution to samples having labels different from that of the anchor.
(2022-08-31T19:20:04Z)
- Hierarchical Semi-Supervised Contrastive Learning for Contamination-Resistant Anomaly Detection [81.07346419422605]
Anomaly detection aims at identifying deviant samples from the normal data distribution.
Contrastive learning has provided a successful way to sample representation that enables effective discrimination on anomalies.
We propose a novel hierarchical semi-supervised contrastive learning framework for contamination-resistant anomaly detection.
(2022-07-24T18:49:26Z)
- An Asymmetric Contrastive Loss for Handling Imbalanced Datasets [0.0]
We introduce an asymmetric version of CL, referred to as ACL, to address the problem of class imbalance.
In addition, we propose the asymmetric focal contrastive loss (AFCL) as a further generalization of both ACL and focal contrastive loss.
Results on the FMNIST and ISIC 2018 imbalanced datasets show that AFCL is capable of outperforming CL and FCL in terms of both weighted and unweighted classification accuracies.
(2022-07-14T17:30:13Z)
- Interventional Contrastive Learning with Meta Semantic Regularizer [28.708395209321846]
Contrastive learning (CL)-based self-supervised learning models learn visual representations in a pairwise manner.
When the CL model is trained with full images, the performance tested in full images is better than that in foreground areas.
When the CL model is trained with foreground areas, the performance tested in full images is worse than that in foreground areas.
(2022-06-29T15:02:38Z)
- Integrating Prior Knowledge in Contrastive Learning with Kernel [4.050766659420731]
We use kernel theory to propose a novel loss, called decoupled uniformity, that i) allows the integration of prior knowledge and ii) removes the negative-positive coupling in the original InfoNCE loss.
In an unsupervised setting, we empirically demonstrate that CL benefits from generative models to improve its representation both on natural and medical images.
(2022-06-03T15:43:08Z)
- When Does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? [99.4914671654374]
We propose AdvCL, a novel adversarial contrastive pretraining framework.
We show that AdvCL is able to enhance cross-task robustness transferability without loss of model accuracy and finetuning efficiency.
(2021-11-01T17:59:43Z)
- Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance.
(2020-07-05T04:19:37Z)