Quantifying the Variability Collapse of Neural Networks
- URL: http://arxiv.org/abs/2306.03440v1
- Date: Tue, 6 Jun 2023 06:37:07 GMT
- Title: Quantifying the Variability Collapse of Neural Networks
- Authors: Jing Xu, Haoxiong Liu
- Abstract summary: The recently discovered Neural Collapse (NC) phenomenon provides a new perspective for understanding the last-layer geometry of neural networks.
We propose a novel metric, named Variability Collapse Index (VCI), to quantify the variability collapse phenomenon in the NC paradigm.
- Score: 2.9551667607781607
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies empirically demonstrate a positive relationship between the
transferability of neural networks and the within-class variation of their last-layer
features. The recently discovered Neural Collapse (NC) phenomenon provides a new
perspective for understanding this last-layer geometry. In this paper, we propose a
novel metric, named the Variability Collapse Index (VCI), to quantify the variability
collapse phenomenon in the NC paradigm. The VCI metric is well motivated and
intrinsically related to the linear probing loss on the last-layer features. Moreover,
it enjoys desirable theoretical and empirical properties, including invariance under
invertible linear transformations and numerical stability, which distinguish it from
previous metrics. Our experiments verify that VCI is indicative of both the
variability collapse and the transferability of pretrained neural networks.
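The abstract does not reproduce the VCI formula, so as a rough orientation, here is a minimal NumPy sketch of the classical NC1-style within-class variability ratio that metrics like VCI aim to improve on; the function name, the pseudo-inverse-based definition, and the toy data are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def within_class_variability(features, labels):
    """NC1-style ratio: trace(Sigma_W @ pinv(Sigma_B)) / K, which
    approaches 0 as within-class variability collapses."""
    classes = np.unique(labels)
    K, d = len(classes), features.shape[1]
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))  # within-class covariance
    sigma_b = np.zeros((d, d))  # between-class covariance
    for c in classes:
        f_c = features[labels == c]
        mu_c = f_c.mean(axis=0)
        sigma_w += (f_c - mu_c).T @ (f_c - mu_c) / len(features)
        sigma_b += np.outer(mu_c - global_mean, mu_c - global_mean) / K
    return np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / K

# Random features are far from collapse; perfectly collapsed features
# (every feature equal to its class mean) would give exactly 0.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)
features = rng.normal(size=(1000, 64))
print(within_class_variability(features, labels))
```

Note the paper's point: pseudo-inverse-based ratios of this form can be numerically unstable when the between-class covariance is near-singular, which is one motivation for a metric like VCI.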
Related papers
- Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study [23.991344681741058]
We conduct a detailed analysis of fixed point iterations in a vector-valued function modeled by neural networks.
We establish a sufficient condition for the existence of multiple fixed points of looped neural networks based on varying input regions.
Our methodology may enhance our comprehension of neural network mechanisms.
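As a hedged illustration of the object under study, the sketch below iterates a small randomly initialized network to a fixed point; the architecture, the weight scale (chosen small so the map is contractive), and the tolerance are assumptions for the toy, not details from the paper.

```python
import numpy as np

def fixed_point_iterate(f, x0, tol=1e-8, max_iter=1000):
    """Iterate x <- f(x) until successive iterates differ by less than tol."""
    x = x0
    for i in range(max_iter):
        x_next = f(x)
        if np.linalg.norm(x_next - x) < tol:
            return x_next, i + 1
        x = x_next
    return x, max_iter

# A one-hidden-layer tanh network as the vector-valued map f: R^8 -> R^8.
# The small weight scale keeps the map contractive so iteration converges.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(16, 8))
W2 = rng.normal(scale=0.1, size=(8, 16))
f = lambda x: W2 @ np.tanh(W1 @ x)

x_star, steps = fixed_point_iterate(f, rng.normal(size=8))
print(steps, np.linalg.norm(x_star - f(x_star)))  # residual ~ 0 at a fixed point
```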
arXiv Detail & Related papers (2024-10-15T04:57:02Z)
- Neural Rank Collapse: Weight Decay and Small Within-Class Variability Yield Low-Rank Bias [4.829265670567825]
We show the presence of an intriguing neural rank collapse phenomenon, connecting the low-rank bias of trained networks with networks' neural collapse properties.
As the weight decay parameter grows, the rank of each layer in the network decreases proportionally to the within-class variability of the hidden-space embeddings of the previous layers.
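A hedged sketch of how one might observe this trend empirically: train the same small MLP at several weight-decay strengths and report the numerical rank of each weight matrix. The model, synthetic data, and rank threshold are illustrative assumptions, not the paper's experimental setup.

```python
import torch
import torch.nn as nn

def numerical_rank(weight, tol=1e-3):
    """Count singular values above tol times the largest singular value."""
    s = torch.linalg.svdvals(weight.detach())
    return int((s > tol * s[0]).sum())

def train_mlp(weight_decay, X, y, epochs=500):
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(),
                          nn.Linear(64, 5))
    opt = torch.optim.SGD(model.parameters(), lr=0.05,
                          weight_decay=weight_decay)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return model

# Larger weight decay should push the layer ranks down.
X, y = torch.randn(500, 20), torch.randint(0, 5, (500,))
for wd in (0.0, 1e-3, 1e-2):
    model = train_mlp(wd, X, y)
    ranks = [numerical_rank(m.weight) for m in model if isinstance(m, nn.Linear)]
    print(f"weight_decay={wd}: layer ranks {ranks}")
```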
arXiv Detail & Related papers (2024-02-06T13:44:39Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
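One way to probe this claim on a given RNN is to measure the effective dimensionality of its hidden states with PCA. The sketch below uses an untrained RNN purely to show the measurement; for a trained network that has learned a low-dimensional state space, the reported number would sit far below the hidden size.

```python
import torch
import torch.nn as nn

def effective_dim(states, energy=0.99):
    """Number of principal components capturing `energy` of the variance
    in a (samples, hidden) matrix of hidden states."""
    centered = states - states.mean(dim=0)
    s = torch.linalg.svdvals(centered)
    var = s**2 / (s**2).sum()
    return int((torch.cumsum(var, 0) < energy).sum()) + 1

torch.manual_seed(0)
rnn = nn.RNN(input_size=4, hidden_size=128, batch_first=True)
x = torch.randn(32, 50, 4)                # (batch, time, features)
with torch.no_grad():
    h, _ = rnn(x)                         # (batch, time, hidden)
print(effective_dim(h.reshape(-1, 128)))  # small if states are low-dimensional
```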
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Interrelation of equivariant Gaussian processes and convolutional neural networks [77.34726150561087]
There is currently a rather promising new trend in machine learning (ML) based on the relationship between neural networks (NN) and Gaussian processes (GP).
In this work, we establish a relationship between the many-channel limit for CNNs equivariant with respect to the two-dimensional Euclidean group with vector-valued neuron activations and the corresponding independently introduced equivariant Gaussian processes (GP).
arXiv Detail & Related papers (2022-09-17T17:02:35Z)
- Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called the Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
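Based only on this summary, a VNN-style layer might be sketched as follows: two learnable sub-layers map the input to the mean and log-variance of a Gaussian over the layer's outputs, and repeated stochastic forward passes yield an uncertainty estimate. The Gaussian parameterization and linear sub-layers are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class VariationalLayer(nn.Module):
    """Sketch of a layer whose sub-layers output distribution parameters."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mean = nn.Linear(d_in, d_out)     # sub-layer for the mean
        self.log_var = nn.Linear(d_in, d_out)  # sub-layer for the log-variance

    def forward(self, x):
        mu = self.mean(x)
        std = torch.exp(0.5 * self.log_var(x))
        return mu + std * torch.randn_like(std)  # reparameterized sample

# Averaging several stochastic passes gives a mean prediction and a spread.
layer = VariationalLayer(16, 8)
x = torch.randn(4, 16)
samples = torch.stack([layer(x) for _ in range(100)])
print(samples.mean(0).shape, samples.std(0).shape)
```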
arXiv Detail & Related papers (2022-07-04T15:41:02Z)
- A comparative study of back propagation and its alternatives on multilayer perceptrons [0.0]
Backpropagation (BP) is the de facto algorithm for computing the backward pass when training a feedforward neural network.
The use of almost-everywhere differentiable activation functions made it efficient and effective to propagate the gradient backwards through layers of deep neural networks.
In this paper, we analyze the stability and similarity of predictions and neurons in convolutional neural networks (CNNs) and propose a new variation of one of the algorithms.
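A hedged sketch of the kind of similarity measurement the summary describes: the fraction of inputs on which two independently initialized (and, in practice, independently trained) models agree. The tiny MLPs and random inputs here are stand-ins for the paper's CNNs and image data.

```python
import torch
import torch.nn as nn

def prediction_agreement(model_a, model_b, X):
    """Fraction of inputs on which two models predict the same class."""
    with torch.no_grad():
        pred_a = model_a(X).argmax(dim=1)
        pred_b = model_b(X).argmax(dim=1)
    return (pred_a == pred_b).float().mean().item()

def make_mlp(seed):
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))

# Two runs differing only in their random initialization.
X = torch.randn(256, 10)
print(prediction_agreement(make_mlp(0), make_mlp(1), X))
```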
arXiv Detail & Related papers (2022-05-31T18:44:13Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
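A toy reproduction of the effect, under illustrative assumptions about the data: when one feature already separates the classes, cross-entropy training drives its weight up quickly, while the weight on a second, noisier but still informative feature stays small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 1000
y = torch.randint(0, 2, (n,))
sign = 2.0 * y.float() - 1.0
strong = 3.0 * sign + 0.1 * torch.randn(n)  # cleanly separates the classes
weak = 0.5 * sign + 1.0 * torch.randn(n)    # noisier but still informative
X = torch.stack([strong, weak], dim=1)

model = nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(2000):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()

# The weights on the strong feature (first column) dominate those on the
# weak feature (second column): the weak feature is "starved".
print(model.weight.data)
```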
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
- Geometry Perspective Of Estimating Learning Capability Of Neural Networks [0.0]
The paper considers a broad class of neural networks with generalized architectures performing simple least-squares regression with stochastic gradient descent (SGD).
The relationship between the generalization capability and the stability of the neural network is also discussed.
By correlating the principles of high-energy physics with the learning theory of neural networks, the paper establishes a variant of the Complexity-Action conjecture from an artificial neural network perspective.
arXiv Detail & Related papers (2020-11-03T12:03:19Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Tangent Space Sensitivity and Distribution of Linear Regions in ReLU Networks [0.0]
We consider adversarial stability in the tangent space and suggest tangent sensitivity in order to characterize stability.
We derive several easily computable bounds and empirical measures for feed-forward fully connected ReLU networks.
Our experiments suggest that even simple bounds and measures are associated with the empirical generalization gap.
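The summary does not define tangent sensitivity precisely, but one easily computable proxy in this spirit is the Frobenius norm of the network's input-output Jacobian at a given input; the sketch below (with an illustrative untrained ReLU network) shows the computation.

```python
import torch
import torch.nn as nn

def jacobian_frobenius_norm(model, x):
    """Frobenius norm of the input-output Jacobian at a single input x."""
    jac = torch.autograd.functional.jacobian(model, x.unsqueeze(0))
    return jac.norm().item()

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 3))
print(jacobian_frobenius_norm(model, torch.randn(10)))
```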
arXiv Detail & Related papers (2020-06-11T20:02:51Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
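One standard way to track the Hessian norm without forming the Hessian is power iteration on autograd Hessian-vector products, sketched below; the loss, model, and iteration count are illustrative, and the paper's exact estimator may differ.

```python
import torch
import torch.nn as nn

def hessian_spectral_norm(loss, params, iters=20):
    """Estimate the largest Hessian eigenvalue magnitude by power
    iteration, using Hessian-vector products from autograd."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat = torch.cat([g.reshape(-1) for g in grads])
    v = torch.randn_like(flat)
    v /= v.norm()
    eig = 0.0
    for _ in range(iters):
        # grad of (grad . v) w.r.t. params is the Hessian-vector product Hv.
        hv = torch.autograd.grad(flat @ v, params, retain_graph=True)
        hv = torch.cat([h.reshape(-1) for h in hv])
        eig = hv.norm().item()        # |lambda_max| estimate at this step
        v = hv / (hv.norm() + 1e-12)  # re-normalize for the next step
    return eig

torch.manual_seed(0)
model = nn.Linear(10, 2)
X, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
loss = nn.CrossEntropyLoss()(model(X), y)
print(hessian_spectral_norm(loss, list(model.parameters())))
```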
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.