On the Robustness of Neural Collapse and the Neural Collapse of
Robustness
- URL: http://arxiv.org/abs/2311.07444v1
- Date: Mon, 13 Nov 2023 16:18:58 GMT
- Title: On the Robustness of Neural Collapse and the Neural Collapse of
Robustness
- Authors: Jingtong Su, Ya Shi Zhang, Nikolaos Tsilivis, Julia Kempe
- Abstract summary: Neural Collapse refers to the curious phenomenon at the end of training of a neural network, where feature vectors and classification weights converge to a very simple geometrical arrangement (a simplex).
We study the stability properties of these simplices, and find that the simplex structure disappears under small adversarial attacks.
We identify novel properties of both robust and non-robust machine learning models, and show that earlier layers, unlike later ones, maintain reliable simplices on perturbed data.
- Score: 6.80303951699936
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural Collapse refers to the curious phenomenon at the end of training of a
neural network, where feature vectors and classification weights converge to a
very simple geometrical arrangement (a simplex). While it has been observed
empirically in various cases and has been theoretically motivated, its
connection with crucial properties of neural networks, like their
generalization and robustness, remains unclear. In this work, we study the
stability properties of these simplices. We find that the simplex structure
disappears under small adversarial attacks, and that perturbed examples "leap"
between simplex vertices. We further analyze the geometry of networks that are
optimized to be robust against adversarial perturbations of the input, and find
that Neural Collapse is a pervasive phenomenon in these cases as well, with
clean and perturbed representations forming aligned simplices, and giving rise
to a robust simple nearest-neighbor classifier. By studying the propagation of
the amount of collapse inside the network, we identify novel properties of both
robust and non-robust machine learning models, and show that earlier layers, unlike
later ones, maintain reliable simplices on perturbed data.
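The collapse measurements and the nearest-class-mean classifier described in the abstract can be illustrated with a short sketch. The following is a minimal, hypothetical Python/NumPy example, not the authors' code: it assumes penultimate-layer features have already been extracted from a trained network, estimates per-class means, checks how close their geometry is to a simplex equiangular tight frame (off-diagonal cosines near -1/(K-1) and vanishing within-class variability), and classifies points by their nearest class mean. All function names and the toy data are illustrative assumptions.

```python
import numpy as np

def class_means(features, labels, num_classes):
    """Per-class means of penultimate-layer features (features: [N, d])."""
    return np.stack([features[labels == c].mean(axis=0) for c in range(num_classes)])

def simplex_etf_stats(features, labels, num_classes):
    """Crude Neural Collapse indicators.

    Returns the within/between class variability ratio and the cosine matrix of
    globally centered class means. For an ideal simplex ETF with K classes, the
    off-diagonal cosines approach -1/(K-1) and the ratio approaches 0.
    """
    mu = class_means(features, labels, num_classes)            # [K, d]
    centered = mu - features.mean(axis=0)                      # subtract global mean
    centered /= np.linalg.norm(centered, axis=1, keepdims=True)
    cosines = centered @ centered.T                            # [K, K]
    within = np.mean([np.var(features[labels == c], axis=0).sum()
                      for c in range(num_classes)])
    between = np.var(mu, axis=0).sum()
    return within / (between + 1e-12), cosines

def nearest_class_mean_predict(features, means):
    """Classify each feature vector by its nearest class mean."""
    dists = np.linalg.norm(features[:, None, :] - means[None, :, :], axis=-1)
    return dists.argmin(axis=1)

# Toy usage: random features standing in for a trained network's activations.
rng = np.random.default_rng(0)
K, d, n = 10, 512, 100
labels = np.repeat(np.arange(K), n)
offsets = 5.0 * rng.normal(size=(K, d))
features = rng.normal(size=(K * n, d)) + offsets.repeat(n, axis=0)
ratio, cos = simplex_etf_stats(features, labels, K)
preds = nearest_class_mean_predict(features, class_means(features, labels, K))
off_diag = (cos.sum() - np.trace(cos)) / (K * (K - 1))
print(f"within/between variability: {ratio:.3f}")
print(f"mean off-diagonal cosine: {off_diag:.3f} (simplex ETF target: {-1 / (K - 1):.3f})")
print(f"nearest-class-mean accuracy: {(preds == labels).mean():.3f}")
```

The same statistics can be recomputed on adversarially perturbed features, or on intermediate-layer activations, to probe how the simplex structure degrades for standard networks and persists for robustly trained ones, in the spirit of the analysis above.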
Related papers
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
arXiv Detail & Related papers (2024-05-12T22:18:25Z) - Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse [19.279084204631204]
We extend the investigation of Neural Collapse to biased datasets with imbalanced attributes.
We propose an avoid-shortcut learning framework without additional training complexity.
With well-designed shortcut primes based on Neural Collapse structure, the models are encouraged to skip the pursuit of simple shortcuts.
arXiv Detail & Related papers (2024-05-09T07:23:37Z) - Neural Collapse: A Review on Modelling Principles and Generalization [0.0]
Neural collapse essentially represents a state at which the within-class variability of final hidden layer outputs is infinitesimally small.
Despite the simplicity of this state, the dynamics and implications of reaching it are yet to be fully understood.
arXiv Detail & Related papers (2022-06-08T17:55:28Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - BScNets: Block Simplicial Complex Neural Networks [79.81654213581977]
Simplicial neural networks (SNN) have recently emerged as the newest direction in graph learning.
We present Block Simplicial Complex Neural Networks (BScNets) model for link prediction.
BScNets outperforms state-of-the-art models by a significant margin while maintaining low costs.
arXiv Detail & Related papers (2021-12-13T17:35:54Z) - An Unconstrained Layer-Peeled Perspective on Neural Collapse [20.75423143311858]
We introduce a surrogate model called the unconstrained layer-peeled model (ULPM).
We prove that gradient flow on this model converges to critical points of a minimum-norm separation problem exhibiting neural collapse in its global minimizer.
We show that our results also hold during the training of neural networks in real-world tasks when explicit regularization or weight decay is not used.
arXiv Detail & Related papers (2021-10-06T14:18:47Z) - Correlation Analysis between the Robustness of Sparse Neural Networks
and their Random Hidden Structural Priors [0.0]
We aim to investigate any existing correlations between graph theoretic properties and the robustness of Sparse Neural Networks.
Our hypothesis is that graph theoretic properties, as a prior of neural network structures, are related to their robustness.
arXiv Detail & Related papers (2021-07-13T15:13:39Z) - Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z) - Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse
in Imbalanced Training [39.137793683411424]
We introduce the Layer-Peeled Model, a nonconvex yet analytically tractable optimization program.
We show that the model inherits many characteristics of well-trained networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep learning training.
In particular, we show that the model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes.
arXiv Detail & Related papers (2021-01-29T17:37:17Z) - Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z) - Hyperbolic Neural Networks++ [66.16106727715061]
We generalize the fundamental components of neural networks in a single hyperbolic geometry model, namely, the Poincaré ball model.
Experiments show the superior parameter efficiency of our methods compared to conventional hyperbolic components, and stability and outperformance over their Euclidean counterparts.
arXiv Detail & Related papers (2020-06-15T08:23:20Z)