Symmetric Neural-Collapse Representations with Supervised Contrastive
Loss: The Impact of ReLU and Batching
- URL: http://arxiv.org/abs/2306.07960v2
- Date: Wed, 18 Oct 2023 19:48:48 GMT
- Title: Symmetric Neural-Collapse Representations with Supervised Contrastive
Loss: The Impact of ReLU and Batching
- Authors: Ganesh Ramachandra Kini, Vala Vakilian, Tina Behnia, Jaidev Gill,
Christos Thrampoulidis
- Abstract summary: Supervised contrastive loss (SCL) is a competitive and often superior alternative to the cross-entropy loss for classification.
While prior studies have demonstrated that both losses yield symmetric training representations under balanced data, this symmetry breaks under class imbalances.
This paper presents an intriguing discovery: the introduction of a ReLU activation at the final layer effectively restores the symmetry in SCL-learned representations.
- Score: 26.994954303270575
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Supervised contrastive loss (SCL) is a competitive and often superior
alternative to the cross-entropy loss for classification. While prior studies
have demonstrated that both losses yield symmetric training representations
under balanced data, this symmetry breaks under class imbalances. This paper
presents an intriguing discovery: the introduction of a ReLU activation at the
final layer effectively restores the symmetry in SCL-learned representations.
We arrive at this finding analytically, by establishing that the global
minimizers of an unconstrained features model with SCL loss and entry-wise
non-negativity constraints form an orthogonal frame. Extensive experiments
conducted across various datasets, architectures, and imbalance scenarios
corroborate our finding. Importantly, our experiments reveal that the inclusion
of the ReLU activation restores symmetry without compromising test accuracy.
This constitutes the first geometry characterization of SCL under imbalances.
Additionally, our analysis and experiments underscore the pivotal role of batch
selection strategies in representation geometry. By proving necessary and
sufficient conditions for mini-batch choices that ensure invariant symmetric
representations, we introduce batch-binding as an efficient strategy that
guarantees these conditions hold.
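To make the role of the final-layer ReLU concrete, below is a minimal sketch of a supervised contrastive loss computed on rectified, L2-normalized batch features. The placement of the ReLU before normalization, the temperature value, and all names are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """SCL on one batch of final-layer features.

    The ReLU enforces entry-wise non-negativity, which the paper argues
    restores a symmetric (orthogonal-frame) geometry under class imbalance.
    Details such as the ReLU/normalization order are assumptions.
    """
    z = F.normalize(F.relu(features), dim=1)          # (B, d) non-negative, unit norm
    sim = z @ z.T / temperature                       # (B, B) pairwise similarities
    B = z.size(0)
    self_mask = torch.eye(B, dtype=torch.bool, device=z.device)

    # Softmax denominator over every other sample in the batch (anchor excluded).
    exp_sim = torch.exp(sim).masked_fill(self_mask, 0.0)
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True))

    # Positives: samples sharing the anchor's label, anchor itself excluded.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_count = pos_mask.sum(dim=1).clamp(min=1)      # guard classes with a single sample
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_count
    return loss.mean()
```

Per the abstract's result, minimizing this objective under the non-negativity constraint drives the class-mean embeddings toward an orthogonal frame, i.e. mutually orthogonal class-mean directions, even when the class sizes are imbalanced.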
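The abstract does not spell out the batch-binding mechanism, so the following is only a plausible reading: attach a small fixed "binding" set, e.g. one example per class, to every mini-batch so that each batch meets the coverage conditions the paper proves necessary for invariant symmetric representations. All function and variable names here are hypothetical.

```python
import random

def bind_batches(batches, binding_examples):
    """Append a fixed binding set (e.g. one index per class) to every mini-batch.

    `batches` is a list of lists of sample indices; `binding_examples` is the
    shared fixed set. This is a hypothetical sketch of a batch-binding-style
    strategy, not the paper's exact procedure.
    """
    bound = []
    for batch in batches:
        # Keep the original samples and add the shared binding examples,
        # avoiding duplicates within a single batch.
        extra = [i for i in binding_examples if i not in batch]
        bound.append(batch + extra)
    return bound

# Hypothetical usage: an imbalanced toy label set with one binding example per class.
labels = [0] * 50 + [1] * 30 + [2] * 5
by_class = {c: [i for i, y in enumerate(labels) if y == c] for c in set(labels)}
binding = [random.choice(idxs) for idxs in by_class.values()]
indices = list(range(len(labels)))
random.shuffle(indices)
batches = [indices[k:k + 16] for k in range(0, len(indices), 16)]
bound_batches = bind_batches(batches, binding)
```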
Related papers
- The Common Stability Mechanism behind most Self-Supervised Learning
Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z) - Hard-Negative Sampling for Contrastive Learning: Optimal Representation Geometry and Neural- vs Dimensional-Collapse [16.42457033976047]
We prove that the losses of Supervised Contrastive Learning (SCL), Hard-SCL (HSCL), and Unsupervised Contrastive Learning (UCL) are minimized by representations that exhibit Neural-Collapse (NC).
We also prove that for any representation mapping, the HSCL and Hard-UCL (HUCL) losses are lower bounded by the corresponding SCL and UCL losses.
arXiv Detail & Related papers (2023-11-09T04:40:32Z) - On the Implicit Geometry of Cross-Entropy Parameterizations for
Label-Imbalanced Data [26.310275682709776]
Various logit-adjusted parameterizations of the cross-entropy (CE) loss have been proposed as alternatives to weighted CE for training large models on label-imbalanced data.
We show that logit-adjusted parameterizations can be appropriately tuned to learn effectively irrespective of the minority imbalance ratio.
arXiv Detail & Related papers (2023-03-14T03:04:37Z) - A Unified Framework for Contrastive Learning from a Perspective of
Affinity Matrix [80.2675125037624]
We present a new unified contrastive learning representation framework (named UniCLR) suitable for all four kinds of methods it considers.
Three variants, i.e., SimAffinity, SimWhitening and SimTrace, are presented based on UniCLR.
In addition, a simple symmetric loss, as a new consistency regularization term, is proposed based on this framework.
arXiv Detail & Related papers (2022-11-26T08:55:30Z) - Imbalance Trouble: Revisiting Neural-Collapse Geometry [27.21274327569783]
We introduce Simplex-Encoded-Labels Interpolation (SELI) as an invariant characterization of the neural collapse phenomenon.
We prove that, for the unconstrained features model (UFM) with cross-entropy loss and vanishing regularization, the learned representations converge to the SELI geometry.
We present experiments on synthetic and real datasets that confirm convergence to the SELI geometry.
arXiv Detail & Related papers (2022-08-10T18:10:59Z) - An Asymmetric Contrastive Loss for Handling Imbalanced Datasets [0.0]
We introduce an asymmetric version of the contrastive loss (CL), referred to as ACL, to address the problem of class imbalance.
In addition, we propose the asymmetric focal contrastive loss (AFCL) as a further generalization of both ACL and the focal contrastive loss (FCL).
Results on the FMNIST and ISIC 2018 imbalanced datasets show that AFCL is capable of outperforming CL and FCL in terms of both weighted and unweighted classification accuracies.
arXiv Detail & Related papers (2022-07-14T17:30:13Z) - Semi-supervised Contrastive Learning with Similarity Co-calibration [72.38187308270135]
We propose a novel training strategy, termed Semi-supervised Contrastive Learning (SsCL).
SsCL combines the well-known contrastive loss in self-supervised learning with the cross entropy loss in semi-supervised learning.
We show that SsCL produces more discriminative representations and is beneficial to few-shot learning.
arXiv Detail & Related papers (2021-05-16T09:13:56Z) - Center Prediction Loss for Re-identification [65.58923413172886]
We propose a new loss based on center predictivity: a sample must be positioned in the feature space such that the center of its same-class samples can be roughly predicted from it.
We show that this new loss leads to a more flexible intra-class distribution constraint while ensuring that samples from different classes are well separated.
arXiv Detail & Related papers (2021-04-30T03:57:31Z) - Label-Imbalanced and Group-Sensitive Classification under
Overparameterization [32.923780772605596]
Label-imbalanced and group-sensitive classification seeks to appropriately modify standard training algorithms to optimize relevant metrics.
We show that a logit-adjusted loss modification to standard empirical risk minimization might be ineffective in general.
We show that our results extend naturally to binary classification with sensitive groups, thus treating the two common types of imbalances (label/group) in a unifying way.
arXiv Detail & Related papers (2021-03-02T08:09:43Z) - A Symmetric Loss Perspective of Reliable Machine Learning [87.68601212686086]
We review how a symmetric loss can yield robust classification from corrupted labels in balanced error rate (BER) minimization.
We demonstrate how the robust AUC method can benefit natural language processing in the problem where we want to learn only from relevant keywords.
arXiv Detail & Related papers (2021-01-05T06:25:47Z) - Unbiased Risk Estimators Can Mislead: A Case Study of Learning with
Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more than unbiasedness in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z)