Symmetric Neural-Collapse Representations with Supervised Contrastive
Loss: The Impact of ReLU and Batching
- URL: http://arxiv.org/abs/2306.07960v2
- Date: Wed, 18 Oct 2023 19:48:48 GMT
- Title: Symmetric Neural-Collapse Representations with Supervised Contrastive
Loss: The Impact of ReLU and Batching
- Authors: Ganesh Ramachandra Kini, Vala Vakilian, Tina Behnia, Jaidev Gill,
Christos Thrampoulidis
- Abstract summary: Supervised contrastive loss (SCL) is a competitive and often superior alternative to the cross-entropy loss for classification.
While prior studies have demonstrated that both losses yield symmetric training representations under balanced data, this symmetry breaks under class imbalances.
This paper presents an intriguing discovery: the introduction of a ReLU activation at the final layer effectively restores the symmetry in SCL-learned representations.
- Score: 26.994954303270575
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Supervised contrastive loss (SCL) is a competitive and often superior
alternative to the cross-entropy loss for classification. While prior studies
have demonstrated that both losses yield symmetric training representations
under balanced data, this symmetry breaks under class imbalances. This paper
presents an intriguing discovery: the introduction of a ReLU activation at the
final layer effectively restores the symmetry in SCL-learned representations.
We arrive at this finding analytically, by establishing that the global
minimizers of an unconstrained features model with SCL loss and entry-wise
non-negativity constraints form an orthogonal frame. Extensive experiments
conducted across various datasets, architectures, and imbalance scenarios
corroborate our finding. Importantly, our experiments reveal that the inclusion
of the ReLU activation restores symmetry without compromising test accuracy.
This constitutes the first geometry characterization of SCL under imbalances.
Additionally, our analysis and experiments underscore the pivotal role of batch
selection strategies in representation geometry. By proving necessary and
sufficient conditions for mini-batch choices that ensure invariant symmetric
representations, we introduce batch-binding as an efficient strategy that
guarantees these conditions hold.
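To make the role of the final-layer ReLU concrete, below is a minimal sketch of a supervised contrastive loss computed on rectified, L2-normalized batch features. The placement of the ReLU before normalization, the temperature value, and all names are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """SCL on one batch of final-layer features.

    The ReLU enforces entry-wise non-negativity, which the paper argues
    restores a symmetric (orthogonal-frame) geometry under class imbalance.
    Details such as the ReLU/normalization order are assumptions.
    """
    z = F.normalize(F.relu(features), dim=1)          # (B, d) non-negative, unit norm
    sim = z @ z.T / temperature                       # (B, B) pairwise similarities
    B = z.size(0)
    self_mask = torch.eye(B, dtype=torch.bool, device=z.device)

    # Softmax denominator over every other sample in the batch (anchor excluded).
    exp_sim = torch.exp(sim).masked_fill(self_mask, 0.0)
    log_prob = sim - torch.log(exp_sim.sum(dim=1, keepdim=True))

    # Positives: samples sharing the anchor's label, anchor itself excluded.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_count = pos_mask.sum(dim=1).clamp(min=1)      # guard classes with a single sample
    loss = -(log_prob * pos_mask).sum(dim=1) / pos_count
    return loss.mean()
```

Per the abstract's result, minimizing this objective under the non-negativity constraint drives the class-mean embeddings toward an orthogonal frame, i.e. mutually orthogonal class-mean directions, even when the class sizes are imbalanced.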
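The abstract does not spell out the batch-binding mechanism, so the following is only a plausible reading: attach a small fixed "binding" set, e.g. one example per class, to every mini-batch so that each batch meets the coverage conditions the paper proves necessary for invariant symmetric representations. All function and variable names here are hypothetical.

```python
import random

def bind_batches(batches, binding_examples):
    """Append a fixed binding set (e.g. one index per class) to every mini-batch.

    `batches` is a list of lists of sample indices; `binding_examples` is the
    shared fixed set. This is a hypothetical sketch of a batch-binding-style
    strategy, not the paper's exact procedure.
    """
    bound = []
    for batch in batches:
        # Keep the original samples and add the shared binding examples,
        # avoiding duplicates within a single batch.
        extra = [i for i in binding_examples if i not in batch]
        bound.append(batch + extra)
    return bound

# Hypothetical usage: an imbalanced toy label set with one binding example per class.
labels = [0] * 50 + [1] * 30 + [2] * 5
by_class = {c: [i for i, y in enumerate(labels) if y == c] for c in set(labels)}
binding = [random.choice(idxs) for idxs in by_class.values()]
indices = list(range(len(labels)))
random.shuffle(indices)
batches = [indices[k:k + 16] for k in range(0, len(indices), 16)]
bound_batches = bind_batches(batches, binding)
```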
Related papers
- The Common Stability Mechanism behind most Self-Supervised Learning
Approaches [64.40701218561921]
We provide a framework to explain the stability mechanism of different self-supervised learning techniques.
We discuss the working mechanism of contrastive techniques like SimCLR, non-contrastive techniques like BYOL, SWAV, SimSiam, Barlow Twins, and DINO.
We formulate different hypotheses and test them using the Imagenet100 dataset.
arXiv Detail & Related papers (2024-02-22T20:36:24Z) - Hard-Negative Sampling for Contrastive Learning: Optimal Representation Geometry and Neural- vs Dimensional-Collapse [16.42457033976047]
We prove that the losses of Supervised Contrastive Learning (SCL), Hard-SCL (HSCL), and Unsupervised Contrastive Learning (UCL) are minimized by representations that exhibit Neural-Collapse (NC).
We also prove that for any representation mapping, the HSCL and Hard-UCL (HUCL) losses are lower bounded by the corresponding SCL and UCL losses.
arXiv Detail & Related papers (2023-11-09T04:40:32Z) - On the Implicit Geometry of Cross-Entropy Parameterizations for
Label-Imbalanced Data [26.310275682709776]
Various logit-adjusted parameterizations of the cross-entropy (CE) loss have been proposed as alternatives to weighted CE for training large models on label-imbalanced data.
We show that logit-adjusted parameterizations can be appropriately tuned to learn effectively irrespective of the minority imbalance ratio.
arXiv Detail & Related papers (2023-03-14T03:04:37Z) - A Unified Framework for Contrastive Learning from a Perspective of
Affinity Matrix [80.2675125037624]
We present a new unified contrastive learning representation framework (named UniCLR) suitable for all four kinds of methods it considers.
Three variants, i.e., SimAffinity, SimWhitening and SimTrace, are presented based on UniCLR.
In addition, a simple symmetric loss, as a new consistency regularization term, is proposed based on this framework.
arXiv Detail & Related papers (2022-11-26T08:55:30Z) - Imbalance Trouble: Revisiting Neural-Collapse Geometry [27.21274327569783]
We introduce Simplex-Encoded-Labels Interpolation (SELI) as an invariant characterization of the neural collapse phenomenon.
We prove that, for the unconstrained features model (UFM) with cross-entropy loss and vanishing regularization, the learned representations converge to the SELI geometry.
We present experiments on synthetic and real datasets that confirm convergence to the SELI geometry.
arXiv Detail & Related papers (2022-08-10T18:10:59Z) - An Asymmetric Contrastive Loss for Handling Imbalanced Datasets [0.0]
We introduce an asymmetric version of the contrastive loss (CL), referred to as ACL, to address the problem of class imbalance.
In addition, we propose the asymmetric focal contrastive loss (AFCL) as a further generalization of both ACL and the focal contrastive loss (FCL).
Results on the FMNIST and ISIC 2018 imbalanced datasets show that AFCL is capable of outperforming CL and FCL in terms of both weighted and unweighted classification accuracies.
arXiv Detail & Related papers (2022-07-14T17:30:13Z) - Semi-supervised Contrastive Learning with Similarity Co-calibration [72.38187308270135]
We propose a novel training strategy, termed Semi-supervised Contrastive Learning (SsCL).
SsCL combines the well-known contrastive loss in self-supervised learning with the cross entropy loss in semi-supervised learning.
We show that SsCL produces more discriminative representations and is beneficial to few-shot learning.
arXiv Detail & Related papers (2021-05-16T09:13:56Z) - Center Prediction Loss for Re-identification [65.58923413172886]
We propose a new loss based on center predictivity: a sample must be positioned in the feature space such that the center of its same-class samples can be roughly predicted from it.
We show that this new loss leads to a more flexible intra-class distribution constraint while ensuring that samples from different classes are well separated.
arXiv Detail & Related papers (2021-04-30T03:57:31Z) - Label-Imbalanced and Group-Sensitive Classification under
Overparameterization [32.923780772605596]
Label-imbalanced and group-sensitive classification seeks to appropriately modify standard training algorithms to optimize relevant metrics.
We show that a logit-adjusted loss modification to standard empirical risk minimization might be ineffective in general.
We show that our results extend naturally to binary classification with sensitive groups, thus treating the two common types of imbalances (label/group) in a unifying way.
arXiv Detail & Related papers (2021-03-02T08:09:43Z) - A Symmetric Loss Perspective of Reliable Machine Learning [87.68601212686086]
We review how a symmetric loss can yield robust classification from corrupted labels in balanced error rate (BER) minimization.
We demonstrate how the robust AUC method can benefit natural language processing in the problem where we want to learn only from relevant keywords.
arXiv Detail & Related papers (2021-01-05T06:25:47Z) - Unbiased Risk Estimators Can Mislead: A Case Study of Learning with
Complementary Labels [92.98756432746482]
We study a weakly supervised problem called learning with complementary labels.
We show that the quality of gradient estimation matters more than unbiasedness in risk minimization.
We propose a novel surrogate complementary loss (SCL) framework that trades zero bias for reduced variance.
arXiv Detail & Related papers (2020-07-05T04:19:37Z)