Imbalance Trouble: Revisiting Neural-Collapse Geometry
- URL: http://arxiv.org/abs/2208.05512v1
- Date: Wed, 10 Aug 2022 18:10:59 GMT
- Title: Imbalance Trouble: Revisiting Neural-Collapse Geometry
- Authors: Christos Thrampoulidis, Ganesh R. Kini, Vala Vakilian, Tina Behnia
- Abstract summary: We introduce Simplex-Encoded-Labels Interpolation (SELI) as an invariant characterization of the neural collapse phenomenon.
We prove, for the UFM with cross-entropy loss and vanishing regularization, that embeddings and classifiers always interpolate a simplex-encoded label matrix, irrespective of class imbalance.
We present experiments on synthetic and real datasets that confirm convergence to the SELI geometry.
- Score: 27.21274327569783
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Neural Collapse refers to the remarkable structural properties characterizing
the geometry of class embeddings and classifier weights, found by deep nets
when trained beyond zero training error. However, this characterization only
holds for balanced data. Here we thus ask whether it can be made invariant to
class imbalances. Towards this end, we adopt the unconstrained-features model
(UFM), a recent theoretical model for studying neural collapse, and introduce
Simplex-Encoded-Labels Interpolation (SELI) as an invariant characterization of
the neural collapse phenomenon. Specifically, we prove for the UFM with
cross-entropy loss and vanishing regularization that, irrespective of class
imbalances, the embeddings and classifiers always interpolate a simplex-encoded
label matrix and that their individual geometries are determined by the SVD
factors of this same label matrix. We then present extensive experiments on
synthetic and real datasets that confirm convergence to the SELI geometry.
However, we caution that convergence worsens with increasing imbalances. We
theoretically support this finding by showing that unlike the balanced case,
when minorities are present, ridge-regularization plays a critical role in
tweaking the geometry. This defines new questions and motivates further
investigations into the impact of class imbalances on the rates at which
first-order methods converge to their asymptotically preferred solutions.
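To make the SELI characterization concrete: collect the labels into a centered one-hot ("simplex-encoded") matrix, take its SVD, and read off candidate classifier and embedding geometries from the left and right singular factors; their product then interpolates the label matrix itself. The NumPy sketch below illustrates this on a small imbalanced toy example. It is only an illustration of the geometry, not the paper's training procedure, and the particular centering and scaling conventions are assumptions made here for clarity.

```python
# Minimal NumPy sketch of the SELI picture described in the abstract.
# Assumptions for illustration (not the paper's exact normalization): the
# simplex-encoded label matrix is taken to be the centered one-hot encoding,
# and classifier/embedding factors are formed from its SVD up to scale.
import numpy as np

def simplex_encoded_labels(y, k):
    """Return the (k x n) centered one-hot matrix: 1 - 1/k on the true
    class of each sample and -1/k elsewhere."""
    n = len(y)
    Y = np.zeros((k, n))
    Y[y, np.arange(n)] = 1.0
    return Y - 1.0 / k

# Imbalanced toy labels: class 0 is a majority, class 2 a minority.
y = np.array([0, 0, 0, 0, 1, 1, 2])
k = 3
Z = simplex_encoded_labels(y, k)

# The SVD factors of the label matrix determine the individual geometries:
# classifiers from the left singular vectors, embeddings from the right ones.
U, s, Vt = np.linalg.svd(Z, full_matrices=False)
r = k - 1                             # Z has rank k - 1 (its columns sum to zero)
W = U[:, :r] * np.sqrt(s[:r])         # (k x r) classifier matrix
H = np.sqrt(s[:r])[:, None] * Vt[:r]  # (r x n) embedding matrix

# Their product interpolates the simplex-encoded label matrix (up to scale).
assert np.allclose(W @ H, Z, atol=1e-8)
print(np.round(W @ H, 2))
```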
Related papers
- The Prevalence of Neural Collapse in Neural Multivariate Regression [3.691119072844077]
Neural networks are known to exhibit Neural Collapse (NC) during the final stage of training on classification problems.
To our knowledge, this is the first empirical and theoretical study of neural collapse in the context of regression.
arXiv Detail & Related papers (2024-09-06T10:45:58Z)
- Neural Collapse for Cross-entropy Class-Imbalanced Learning with Unconstrained ReLU Feature Model [25.61363481391964]
We show that when the training dataset is class-imbalanced, some Neural Collapse (NC) properties will no longer be true.
In this paper, we generalize NC to the imbalanced regime for the cross-entropy loss under the unconstrained ReLU feature model.
We find that the weights are aligned to the scaled and centered class-means, with scaling factors that depend on the number of training samples of each class.
arXiv Detail & Related papers (2024-01-04T04:53:31Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss also allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Neural Collapse for Unconstrained Feature Model under Cross-entropy Loss with Imbalanced Data [1.0152838128195467]
We study the extension of the Neural Collapse (NC) phenomenon to imbalanced data under the cross-entropy loss.
Our contributions are multi-fold compared with the state-of-the-art results.
arXiv Detail & Related papers (2023-09-18T12:45:08Z)
- Machine learning in and out of equilibrium [58.88325379746631]
Our study uses a Fokker-Planck approach, adapted from statistical physics, to explore these parallels.
We focus in particular on the stationary state of the system in the long-time limit, which in conventional SGD is out of equilibrium.
We propose a new variation of stochastic gradient Langevin dynamics (SGLD) that harnesses without-replacement minibatching.
arXiv Detail & Related papers (2023-06-06T09:12:49Z)
- On the Implicit Geometry of Cross-Entropy Parameterizations for Label-Imbalanced Data [26.310275682709776]
Various logit-adjusted parameterizations of the cross-entropy (CE) loss have been proposed as alternatives to weighted CE for training large models on label-imbalanced data.
We show that logit-adjusted parameterizations can be appropriately tuned to learn effectively irrespective of the minority imbalance ratio.
arXiv Detail & Related papers (2023-03-14T03:04:37Z)
- Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning [120.53458753007851]
Few-shot class-incremental learning (FSCIL) has been a challenging problem as only a few training samples are accessible for each novel class in the new sessions.
We address the misalignment between features and the classifier in FSCIL, drawing on the recently discovered neural collapse phenomenon.
We propose a neural collapse inspired framework for FSCIL. Experiments on the miniImageNet, CUB-200, and CIFAR-100 datasets demonstrate that our framework outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T18:39:40Z)
- Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data [12.225207401994737]
We show that complex systems with massive numbers of parameters exhibit the same structural properties when trained until convergence.
In particular, it has been observed that the last-layer features collapse to their class-means.
Our results demonstrate the convergence of the last-layer features and classifiers to a geometry consisting of vectors.
arXiv Detail & Related papers (2023-01-01T16:29:56Z)
- Do We Really Need a Learnable Classifier at the End of Deep Neural Network? [118.18554882199676]
We study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF and kept fixed during training.
Our experimental results show that our method achieves performance similar to that of a learnable classifier on image classification with balanced datasets.
arXiv Detail & Related papers (2022-03-17T04:34:28Z)
- Mitigating Generation Shifts for Generalized Zero-Shot Learning [52.98182124310114]
Generalized Zero-Shot Learning (GZSL) is the task of leveraging semantic information (e.g., attributes) to recognize both seen and unseen samples, where unseen classes are not observable during training.
We propose a novel Generation Shifts Mitigating Flow framework for learning unseen data synthesis efficiently and effectively.
Experimental results demonstrate that GSMFlow achieves state-of-the-art recognition performance in both conventional and generalized zero-shot settings.
arXiv Detail & Related papers (2021-07-07T11:43:59Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)