Neural collapse with unconstrained features
- URL: http://arxiv.org/abs/2011.11619v1
- Date: Mon, 23 Nov 2020 18:49:36 GMT
- Title: Neural collapse with unconstrained features
- Authors: Dustin G. Mixon, Hans Parshall, Jianzong Pi
- Abstract summary: We propose a simple "unconstrained features model" in which neural collapse also emerges empirically.
By studying this model, we provide some explanation for the emergence of neural collapse in terms of the landscape of empirical risk.
- Score: 4.941630596191806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural collapse is an emergent phenomenon in deep learning that was recently
discovered by Papyan, Han and Donoho. We propose a simple "unconstrained
features model" in which neural collapse also emerges empirically. By studying
this model, we provide some explanation for the emergence of neural collapse in
terms of the landscape of empirical risk.
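As a rough illustration (our notation, not necessarily the paper's), the unconstrained features model drops the backbone network and treats each last-layer feature as a free optimization variable:

$\min_{W,\,b,\,h_1,\dots,h_N} \ \frac{1}{N}\sum_{i=1}^{N} \mathcal{L}\big(W h_i + b,\; y_i\big),$

where $W$ and $b$ are the classifier weights and bias, $h_i$ is the freely chosen feature of the $i$-th training example, and $\mathcal{L}$ is the training loss (e.g. cross-entropy or squared error). Neural collapse then corresponds, roughly, to the features of each class concentrating on their class mean, with the class means and the rows of $W$ forming a highly symmetric configuration.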
Related papers
- The Persistence of Neural Collapse Despite Low-Rank Bias: An Analytic Perspective Through Unconstrained Features [0.0]
Deep neural networks exhibit a simple structure in their final layer features and weights, commonly referred to as neural collapse.
Recent findings indicate that such a structure is generally not optimal in the deep unconstrained feature model.
This is attributed to a low-rank bias induced by regularization, which favors solutions of lower rank than those typically associated with deep neural collapse.
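For orientation, a commonly studied regularized deep unconstrained features model (our notation, not necessarily this paper's) replaces the backbone with $L$ freely optimized linear layers acting on free features $H$:

$\min_{W_L,\dots,W_1,\,H} \ \frac{1}{N}\,\mathcal{L}\big(W_L \cdots W_1 H,\; Y\big) + \lambda \sum_{l=1}^{L} \lVert W_l \rVert_F^2 + \lambda_H \lVert H \rVert_F^2 .$

Penalizing the Frobenius norm of every factor in the product is what induces the low-rank bias mentioned above.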
arXiv Detail & Related papers (2024-10-30T16:20:39Z) - Confidence Regulation Neurons in Language Models [91.90337752432075]
This study investigates the mechanisms by which large language models represent and regulate uncertainty in next-token predictions.
Entropy neurons are characterized by an unusually high weight norm and influence the final layer normalization (LayerNorm) scale to effectively scale down the logits.
Token frequency neurons, which we describe here for the first time, boost or suppress each token's logit proportionally to its log frequency, thereby shifting the output distribution towards or away from the unigram distribution.
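As a purely illustrative sketch of the described effect (hypothetical code, not the paper's implementation), adding a multiple of each token's log unigram frequency to the logits moves the softmax output toward or away from the unigram distribution:

import numpy as np

def shift_logits_toward_unigram(logits, unigram_counts, alpha):
    # Log unigram frequency of every token in the vocabulary.
    log_freq = np.log(unigram_counts / unigram_counts.sum())
    # alpha > 0 shifts the output distribution toward the unigram
    # distribution; alpha < 0 shifts it away.
    return logits + alpha * log_freq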
arXiv Detail & Related papers (2024-06-24T01:31:03Z) - On the Robustness of Neural Collapse and the Neural Collapse of Robustness [6.80303951699936]
Neural Collapse refers to the curious phenomenon at the end of training of a neural network, where feature vectors and classification weights converge to a very simple geometrical arrangement (a simplex).
We study the stability properties of these simplices, and find that the simplex structure disappears under small adversarial attacks.
We identify novel properties of both robust and non-robust machine learning models, and show that earlier layers, unlike later ones, maintain reliable simplices on perturbed data.
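For reference, the simplex referred to here is the simplex equiangular tight frame of the neural collapse literature: after centering and normalization, the $C$ class means $\mu_1,\dots,\mu_C$ (and, up to scaling, the classifier rows) satisfy $\lVert \mu_i \rVert = 1$ and $\langle \mu_i, \mu_j \rangle = -\tfrac{1}{C-1}$ for $i \neq j$, i.e. they are equally and maximally separated directions.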
arXiv Detail & Related papers (2023-11-13T16:18:58Z) - Generalized Neural Collapse for a Large Number of Classes [33.46269920297418]
We provide an empirical study to verify the occurrence of generalized neural collapse in practical deep neural networks.
We provide a theoretical study showing that generalized neural collapse provably occurs under the unconstrained feature model with a spherical constraint.
arXiv Detail & Related papers (2023-10-09T02:27:04Z) - Catastrophic overfitting can be induced with discriminative non-robust features [95.07189577345059]
We study the onset of catastrophic overfitting (CO) in single-step adversarial training (AT) methods through controlled modifications of typical datasets of natural images.
We show that CO can be induced at much smaller $\epsilon$ values than previously observed, simply by injecting images with seemingly innocuous features.
arXiv Detail & Related papers (2022-06-16T15:22:39Z) - Neural Collapse: A Review on Modelling Principles and Generalization [0.0]
Neural collapse essentially represents a state at which the within-class variability of final hidden layer outputs is infinitesimally small.
Despite the simplicity of this state, the dynamics and implications of reaching it are yet to be fully understood.
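A standard way to quantify this state (a common metric in the neural collapse literature, not necessarily the one used in this review) is the within-class variability ratio $\mathcal{NC}_1 = \tfrac{1}{C}\,\mathrm{tr}\big(\Sigma_W \Sigma_B^{\dagger}\big)$, where $\Sigma_W$ and $\Sigma_B$ are the within-class and between-class covariance matrices of the final-layer features and $\dagger$ denotes the Moore-Penrose pseudoinverse; neural collapse corresponds to this ratio approaching zero during the terminal phase of training.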
arXiv Detail & Related papers (2022-06-08T17:55:28Z) - Limitations of Neural Collapse for Understanding Generalization in Deep Learning [25.48346719747956]
Recent work of Papyan, Han, & Donoho presented an intriguing "Neural Collapse" phenomenon.
Our motivation is to study the upper limits of this research program.
arXiv Detail & Related papers (2022-02-17T00:20:12Z) - Extended Unconstrained Features Model for Exploring Deep Neural Collapse [59.59039125375527]
Recently, a phenomenon termed "neural collapse" (NC) has been empirically observed in deep neural networks.
Recent papers have shown that minimizers with this structure emerge when optimizing a simplified "unconstrained features model" (UFM).
In this paper, we study the UFM for the regularized MSE loss, and show that the minimizers' features can be more structured than in the cross-entropy case.
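Compared with the plain objective sketched after the abstract above, the regularized MSE variant of the UFM typically takes a form such as (our sketch of the setup, not a verbatim statement of the paper's objective)

$\min_{W,\,H,\,b} \ \frac{1}{2N}\,\lVert W H + b\,\mathbf{1}^{\top} - Y \rVert_F^2 + \frac{\lambda_W}{2}\lVert W \rVert_F^2 + \frac{\lambda_H}{2}\lVert H \rVert_F^2 + \frac{\lambda_b}{2}\lVert b \rVert_2^2,$

where $Y$ collects the one-hot targets; the structure of its global minimizers is what such analyses characterize.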
arXiv Detail & Related papers (2022-02-16T14:17:37Z) - The Causal Neural Connection: Expressiveness, Learnability, and Inference [125.57815987218756]
An object called structural causal model (SCM) represents a collection of mechanisms and sources of random variation of the system under investigation.
In this paper, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020) still holds for neural models.
We introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences.
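As a toy illustration of what a collection of mechanisms and exogenous noise sources looks like (hypothetical example, not one of the paper's constructions), a two-variable SCM and an intervention on it can be written directly as sampling code:

import numpy as np

rng = np.random.default_rng(0)

def sample_observational(n):
    # Exogenous sources of random variation.
    u_x = rng.normal(size=n)
    u_y = rng.normal(size=n)
    # Structural mechanisms: each variable is a function of its
    # parents and its own exogenous noise.
    x = u_x
    y = 2.0 * x + u_y
    return x, y

def sample_do_x(n, x_value):
    # Intervention do(X = x_value): replace X's mechanism, keep Y's.
    u_y = rng.normal(size=n)
    x = np.full(n, x_value)
    y = 2.0 * x + u_y
    return x, y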
arXiv Detail & Related papers (2021-07-02T01:55:18Z) - ACRE: Abstract Causal REasoning Beyond Covariation [90.99059920286484]
We introduce the Abstract Causal REasoning dataset for systematic evaluation of current vision systems in causal induction.
Motivated by the stream of research on causal discovery in Blicket experiments, we query a visual reasoning system with four types of questions, in either an independent or an interventional scenario.
We notice that pure neural models tend towards an associative strategy under their chance-level performance, whereas neuro-symbolic combinations struggle in backward-blocking reasoning.
arXiv Detail & Related papers (2021-03-26T02:42:38Z) - Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.