On the Robustness of Neural Collapse and the Neural Collapse of Robustness
- URL: http://arxiv.org/abs/2311.07444v2
- Date: Wed, 13 Nov 2024 14:39:42 GMT
- Title: On the Robustness of Neural Collapse and the Neural Collapse of Robustness
- Authors: Jingtong Su, Ya Shi Zhang, Nikolaos Tsilivis, Julia Kempe
- Abstract summary: Neural Collapse refers to the curious phenomenon at the end of training of a neural network, where feature vectors and classification weights converge to a very simple geometrical arrangement (a simplex).
We study the stability properties of these simplices, and find that the simplex structure disappears under small adversarial attacks.
We identify novel properties of both robust and non-robust machine learning models, and show that earlier layers, unlike later ones, maintain reliable simplices on perturbed data.
- Score: 6.227447957721122
- Abstract: Neural Collapse refers to the curious phenomenon in the end of training of a neural network, where feature vectors and classification weights converge to a very simple geometrical arrangement (a simplex). While it has been observed empirically in various cases and has been theoretically motivated, its connection with crucial properties of neural networks, like their generalization and robustness, remains unclear. In this work, we study the stability properties of these simplices. We find that the simplex structure disappears under small adversarial attacks, and that perturbed examples "leap" between simplex vertices. We further analyze the geometry of networks that are optimized to be robust against adversarial perturbations of the input, and find that Neural Collapse is a pervasive phenomenon in these cases as well, with clean and perturbed representations forming aligned simplices, and giving rise to a robust simple nearest-neighbor classifier. By studying the propagation of the amount of collapse inside the network, we identify novel properties of both robust and non-robust machine learning models, and show that earlier, unlike later layers maintain reliable simplices on perturbed data. Our code is available at https://github.com/JingtongSu/robust_neural_collapse .
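The simplex arrangement described in the abstract can be illustrated with a small numerical check. The sketch below is not the authors' code (their repository is linked above); it assumes NumPy, and the function names `simplex_etf`, `class_mean_cosines`, and `nearest_class_mean` are hypothetical. It builds an ideal K-vertex simplex, verifies that centered class means have pairwise cosine -1/(K-1), and applies the kind of simple nearest-class-mean rule the abstract mentions.

```python
import numpy as np


def simplex_etf(num_classes):
    """Rows are the K unit-norm vertices of an ideal simplex (assumed idealization)."""
    k = num_classes
    return np.sqrt(k / (k - 1)) * (np.eye(k) - np.ones((k, k)) / k)


def class_means(features, labels):
    """Return the sorted class ids and the mean feature vector of each class."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, means


def class_mean_cosines(features, labels):
    """Pairwise cosines between class means after subtracting their average."""
    _, means = class_means(features, labels)
    centered = means - means.mean(axis=0)
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    return unit @ unit.T


def nearest_class_mean(queries, features, labels):
    """Assign each query to the class whose mean is closest in Euclidean distance."""
    classes, means = class_means(features, labels)
    dists = np.linalg.norm(queries[:, None, :] - means[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


if __name__ == "__main__":
    k = 4
    # Fully collapsed toy features: every sample sits exactly on its class vertex.
    features = np.repeat(simplex_etf(k), 5, axis=0)
    labels = np.repeat(np.arange(k), 5)
    print(class_mean_cosines(features, labels).round(3))
    print(nearest_class_mean(features, features, labels))
```

On real penultimate-layer features the off-diagonal cosines would only approximate -1/(K-1); how closely they do so, on clean versus perturbed inputs, is the kind of quantity the abstract refers to as the amount of collapse.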
Related papers
- Semantic Loss Functions for Neuro-Symbolic Structured Prediction [74.18322585177832]
We discuss the semantic loss, which injects knowledge about such structure, defined symbolically, into training.
It is agnostic to the arrangement of the symbols, and depends only on the semantics expressed thereby.
It can be combined with both discriminative and generative neural models.
arXiv Detail & Related papers (2024-05-12T22:18:25Z)
- Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse [19.279084204631204]
We extend the investigation of Neural Collapse to biased datasets with imbalanced attributes.
We propose an avoid-shortcut learning framework without additional training complexity.
With well-designed shortcut primes based on Neural Collapse structure, the models are encouraged to skip the pursuit of simple shortcuts.
arXiv Detail & Related papers (2024-05-09T07:23:37Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Neural Collapse: A Review on Modelling Principles and Generalization [0.0]
Neural collapse essentially represents a state at which the within-class variability of final hidden layer outputs is infinitesimally small.
Despite the simplicity of this state, the dynamics and implications of reaching it are yet to be fully understood.
arXiv Detail & Related papers (2022-06-08T17:55:28Z)
- BScNets: Block Simplicial Complex Neural Networks [79.81654213581977]
Simplicial neural networks (SNN) have recently emerged as the newest direction in graph learning.
We present the Block Simplicial Complex Neural Networks (BScNets) model for link prediction.
BScNets outperforms state-of-the-art models by a significant margin while maintaining low costs.
arXiv Detail & Related papers (2021-12-13T17:35:54Z)
- Correlation Analysis between the Robustness of Sparse Neural Networks and their Random Hidden Structural Priors [0.0]
We aim to investigate any existing correlations between graph-theoretic properties and the robustness of Sparse Neural Networks.
Our hypothesis is that graph-theoretic properties, as a prior of neural network structures, are related to their robustness.
arXiv Detail & Related papers (2021-07-13T15:13:39Z)
- Artificial Neural Networks generated by Low Discrepancy Sequences [59.51653996175648]
We generate artificial neural networks as random walks on a dense network graph.
Such networks can be trained sparse from scratch, avoiding the expensive procedure of training a dense network and compressing it afterwards.
We demonstrate that the artificial neural networks generated by low discrepancy sequences can achieve an accuracy within reach of their dense counterparts at a much lower computational complexity.
arXiv Detail & Related papers (2021-03-05T08:45:43Z)
- Non-Singular Adversarial Robustness of Neural Networks [58.731070632586594]
Adversarial robustness has become an emerging challenge for neural networks owing to their over-sensitivity to small input perturbations.
We formalize the notion of non-singular adversarial robustness for neural networks through the lens of joint perturbations to data inputs as well as model weights.
arXiv Detail & Related papers (2021-02-23T20:59:30Z)
- Adversarial Perturbations Are Not So Weird: Entanglement of Robust and Non-Robust Features in Neural Network Classifiers [4.511923587827301]
We show that in a neural network trained in a standard way, non-robust features respond to small, "non-semantic" patterns.
Adversarial examples can be formed via minimal perturbations to these small, entangled patterns.
arXiv Detail & Related papers (2021-02-09T20:21:31Z)
- Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training [39.137793683411424]
We introduce the Layer-Peeled Model, a nonconvex yet analytically tractable optimization program.
We show that the model inherits many characteristics of well-trained networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep learning training.
In particular, we show that the model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes.
arXiv Detail & Related papers (2021-01-29T17:37:17Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.