Quantifying the Variability Collapse of Neural Networks
- URL: http://arxiv.org/abs/2306.03440v1
- Date: Tue, 6 Jun 2023 06:37:07 GMT
- Title: Quantifying the Variability Collapse of Neural Networks
- Authors: Jing Xu, Haoxiong Liu
- Abstract summary: The recently discovered Neural Collapse (NC) phenomenon provides a new perspective for understanding the last-layer geometry of neural networks.
We propose a novel metric, named Variability Collapse Index (VCI), to quantify the variability collapse phenomenon in the NC paradigm.
- Score: 2.9551667607781607
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies empirically demonstrate a positive relationship between the
transferability of neural networks and the within-class variation of their last-layer
features. The recently discovered Neural Collapse (NC) phenomenon provides a new
perspective for understanding this last-layer geometry. In this paper, we propose a
novel metric, named the Variability Collapse Index (VCI), to quantify the variability
collapse phenomenon in the NC paradigm. The VCI metric is well motivated and
intrinsically related to the linear probing loss on the last-layer features. Moreover,
it enjoys desirable theoretical and empirical properties, including invariance under
invertible linear transformations and numerical stability, which distinguish it from
previous metrics. Our experiments verify that VCI is indicative of both the
variability collapse and the transferability of pretrained neural networks.
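The abstract does not reproduce the VCI formula, so as a rough orientation, here is a minimal NumPy sketch of the classical NC1-style within-class variability ratio that metrics like VCI aim to improve on; the function name, the pseudo-inverse-based definition, and the toy data are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def within_class_variability(features, labels):
    """NC1-style ratio: trace(Sigma_W @ pinv(Sigma_B)) / K, which
    approaches 0 as within-class variability collapses."""
    classes = np.unique(labels)
    K, d = len(classes), features.shape[1]
    global_mean = features.mean(axis=0)
    sigma_w = np.zeros((d, d))  # within-class covariance
    sigma_b = np.zeros((d, d))  # between-class covariance
    for c in classes:
        f_c = features[labels == c]
        mu_c = f_c.mean(axis=0)
        sigma_w += (f_c - mu_c).T @ (f_c - mu_c) / len(features)
        sigma_b += np.outer(mu_c - global_mean, mu_c - global_mean) / K
    return np.trace(sigma_w @ np.linalg.pinv(sigma_b)) / K

# Random features are far from collapse; perfectly collapsed features
# (every feature equal to its class mean) would give exactly 0.
rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)
features = rng.normal(size=(1000, 64))
print(within_class_variability(features, labels))
```

Note the paper's point: pseudo-inverse-based ratios of this form can be numerically unstable when the between-class covariance is near-singular, which is one motivation for a metric like VCI.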
Related papers
- Advancing the Understanding of Fixed Point Iterations in Deep Neural Networks: A Detailed Analytical Study [23.991344681741058]
We conduct a detailed analysis of fixed point iterations in a vector-valued function modeled by neural networks.
We establish a sufficient condition for the existence of multiple fixed points of looped neural networks based on varying input regions.
Our methodology may enhance our comprehension of neural network mechanisms.
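As a hedged illustration of the object under study, the sketch below iterates a small randomly initialized network to a fixed point; the architecture, the weight scale (chosen small so the map is contractive), and the tolerance are assumptions for the toy, not details from the paper.

```python
import numpy as np

def fixed_point_iterate(f, x0, tol=1e-8, max_iter=1000):
    """Iterate x <- f(x) until successive iterates differ by less than tol."""
    x = x0
    for i in range(max_iter):
        x_next = f(x)
        if np.linalg.norm(x_next - x) < tol:
            return x_next, i + 1
        x = x_next
    return x, max_iter

# A one-hidden-layer tanh network as the vector-valued map f: R^8 -> R^8.
# The small weight scale keeps the map contractive so iteration converges.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(16, 8))
W2 = rng.normal(scale=0.1, size=(8, 16))
f = lambda x: W2 @ np.tanh(W1 @ x)

x_star, steps = fixed_point_iterate(f, rng.normal(size=8))
print(steps, np.linalg.norm(x_star - f(x_star)))  # residual ~ 0 at a fixed point
```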
arXiv Detail & Related papers (2024-10-15T04:57:02Z)
- Neural Rank Collapse: Weight Decay and Small Within-Class Variability Yield Low-Rank Bias [4.829265670567825]
We show the presence of an intriguing neural rank collapse phenomenon, connecting the low-rank bias of trained networks with networks' neural collapse properties.
As the weight decay parameter grows, the rank of each layer in the network decreases proportionally to the within-class variability of the hidden-space embeddings of the previous layers.
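A hedged sketch of how one might observe this trend empirically: train the same small MLP at several weight-decay strengths and report the numerical rank of each weight matrix. The model, synthetic data, and rank threshold are illustrative assumptions, not the paper's experimental setup.

```python
import torch
import torch.nn as nn

def numerical_rank(weight, tol=1e-3):
    """Count singular values above tol times the largest singular value."""
    s = torch.linalg.svdvals(weight.detach())
    return int((s > tol * s[0]).sum())

def train_mlp(weight_decay, X, y, epochs=500):
    torch.manual_seed(0)
    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(),
                          nn.Linear(64, 5))
    opt = torch.optim.SGD(model.parameters(), lr=0.05,
                          weight_decay=weight_decay)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(X), y).backward()
        opt.step()
    return model

# Larger weight decay should push the layer ranks down.
X, y = torch.randn(500, 20), torch.randint(0, 5, (500,))
for wd in (0.0, 1e-3, 1e-2):
    model = train_mlp(wd, X, y)
    ranks = [numerical_rank(m.weight) for m in model if isinstance(m, nn.Linear)]
    print(f"weight_decay={wd}: layer ranks {ranks}")
```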
arXiv Detail & Related papers (2024-02-06T13:44:39Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
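One way to probe this claim on a given RNN is to measure the effective dimensionality of its hidden states with PCA. The sketch below uses an untrained RNN purely to show the measurement; for a trained network that has learned a low-dimensional state space, the reported number would sit far below the hidden size.

```python
import torch
import torch.nn as nn

def effective_dim(states, energy=0.99):
    """Number of principal components capturing `energy` of the variance
    in a (samples, hidden) matrix of hidden states."""
    centered = states - states.mean(dim=0)
    s = torch.linalg.svdvals(centered)
    var = s**2 / (s**2).sum()
    return int((torch.cumsum(var, 0) < energy).sum()) + 1

torch.manual_seed(0)
rnn = nn.RNN(input_size=4, hidden_size=128, batch_first=True)
x = torch.randn(32, 50, 4)                # (batch, time, features)
with torch.no_grad():
    h, _ = rnn(x)                         # (batch, time, hidden)
print(effective_dim(h.reshape(-1, 128)))  # small if states are low-dimensional
```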
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Interrelation of equivariant Gaussian processes and convolutional neural networks [77.34726150561087]
There is currently a rather promising new trend in machine learning (ML) based on the relationship between neural networks (NN) and Gaussian processes (GP).
In this work, we establish a relationship between the many-channel limit for CNNs equivariant with respect to the two-dimensional Euclidean group with vector-valued neuron activations and the corresponding independently introduced equivariant Gaussian processes (GP).
arXiv Detail & Related papers (2022-09-17T17:02:35Z)
- Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called the Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
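Based only on this summary, a VNN-style layer might be sketched as follows: two learnable sub-layers map the input to the mean and log-variance of a Gaussian over the layer's outputs, and repeated stochastic forward passes yield an uncertainty estimate. The Gaussian parameterization and linear sub-layers are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class VariationalLayer(nn.Module):
    """Sketch of a layer whose sub-layers output distribution parameters."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mean = nn.Linear(d_in, d_out)     # sub-layer for the mean
        self.log_var = nn.Linear(d_in, d_out)  # sub-layer for the log-variance

    def forward(self, x):
        mu = self.mean(x)
        std = torch.exp(0.5 * self.log_var(x))
        return mu + std * torch.randn_like(std)  # reparameterized sample

# Averaging several stochastic passes gives a mean prediction and a spread.
layer = VariationalLayer(16, 8)
x = torch.randn(4, 16)
samples = torch.stack([layer(x) for _ in range(100)])
print(samples.mean(0).shape, samples.std(0).shape)
```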
arXiv Detail & Related papers (2022-07-04T15:41:02Z)
- A comparative study of back propagation and its alternatives on multilayer perceptrons [0.0]
Backpropagation (BP) is the de facto algorithm for computing the backward pass when training a feedforward neural network.
The use of almost-everywhere differentiable activation functions made it efficient and effective to propagate the gradient backwards through layers of deep neural networks.
In this paper, we analyze the stability and similarity of predictions and neurons in convolutional neural networks (CNNs) and propose a new variation of one of the algorithms.
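A hedged sketch of the kind of similarity measurement the summary describes: the fraction of inputs on which two independently initialized (and, in practice, independently trained) models agree. The tiny MLPs and random inputs here are stand-ins for the paper's CNNs and image data.

```python
import torch
import torch.nn as nn

def prediction_agreement(model_a, model_b, X):
    """Fraction of inputs on which two models predict the same class."""
    with torch.no_grad():
        pred_a = model_a(X).argmax(dim=1)
        pred_b = model_b(X).argmax(dim=1)
    return (pred_a == pred_b).float().mean().item()

def make_mlp(seed):
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 3))

# Two runs differing only in their random initialization.
X = torch.randn(256, 10)
print(prediction_agreement(make_mlp(0), make_mlp(1), X))
```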
arXiv Detail & Related papers (2022-05-31T18:44:13Z)
- Gradient Starvation: A Learning Proclivity in Neural Networks [97.02382916372594]
Gradient Starvation arises when cross-entropy loss is minimized by capturing only a subset of features relevant for the task.
This work provides a theoretical explanation for the emergence of such feature imbalance in neural networks.
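A toy reproduction of the effect, under illustrative assumptions about the data: when one feature already separates the classes, cross-entropy training drives its weight up quickly, while the weight on a second, noisier but still informative feature stays small.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n = 1000
y = torch.randint(0, 2, (n,))
sign = 2.0 * y.float() - 1.0
strong = 3.0 * sign + 0.1 * torch.randn(n)  # cleanly separates the classes
weak = 0.5 * sign + 1.0 * torch.randn(n)    # noisier but still informative
X = torch.stack([strong, weak], dim=1)

model = nn.Linear(2, 2)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
for _ in range(2000):
    opt.zero_grad()
    loss_fn(model(X), y).backward()
    opt.step()

# The weights on the strong feature (first column) dominate those on the
# weak feature (second column): the weak feature is "starved".
print(model.weight.data)
```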
arXiv Detail & Related papers (2020-11-18T18:52:08Z)
- Geometry Perspective Of Estimating Learning Capability Of Neural Networks [0.0]
The paper considers a broad class of neural networks with generalized architectures performing simple least-squares regression with stochastic gradient descent (SGD).
The relationship between the generalization capability and the stability of the neural network is also discussed.
By correlating the principles of high-energy physics with the learning theory of neural networks, the paper establishes a variant of the Complexity-Action conjecture from an artificial neural network perspective.
arXiv Detail & Related papers (2020-11-03T12:03:19Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
- Tangent Space Sensitivity and Distribution of Linear Regions in ReLU Networks [0.0]
We consider adversarial stability in the tangent space and suggest tangent sensitivity in order to characterize stability.
We derive several easily computable bounds and empirical measures for feed-forward fully connected ReLU networks.
Our experiments suggest that even simple bounds and measures are associated with the empirical generalization gap.
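The summary does not define tangent sensitivity precisely, but one easily computable proxy in this spirit is the Frobenius norm of the network's input-output Jacobian at a given input; the sketch below (with an illustrative untrained ReLU network) shows the computation.

```python
import torch
import torch.nn as nn

def jacobian_frobenius_norm(model, x):
    """Frobenius norm of the input-output Jacobian at a single input x."""
    jac = torch.autograd.functional.jacobian(model, x.unsqueeze(0))
    return jac.norm().item()

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                      nn.Linear(64, 64), nn.ReLU(),
                      nn.Linear(64, 3))
print(jacobian_frobenius_norm(model, torch.randn(10)))
```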
arXiv Detail & Related papers (2020-06-11T20:02:51Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
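One standard way to track the Hessian norm without forming the Hessian is power iteration on autograd Hessian-vector products, sketched below; the loss, model, and iteration count are illustrative, and the paper's exact estimator may differ.

```python
import torch
import torch.nn as nn

def hessian_spectral_norm(loss, params, iters=20):
    """Estimate the largest Hessian eigenvalue magnitude by power
    iteration, using Hessian-vector products from autograd."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat = torch.cat([g.reshape(-1) for g in grads])
    v = torch.randn_like(flat)
    v /= v.norm()
    eig = 0.0
    for _ in range(iters):
        # grad of (grad . v) w.r.t. params is the Hessian-vector product Hv.
        hv = torch.autograd.grad(flat @ v, params, retain_graph=True)
        hv = torch.cat([h.reshape(-1) for h in hv])
        eig = hv.norm().item()        # |lambda_max| estimate at this step
        v = hv / (hv.norm() + 1e-12)  # re-normalize for the next step
    return eig

torch.manual_seed(0)
model = nn.Linear(10, 2)
X, y = torch.randn(64, 10), torch.randint(0, 2, (64,))
loss = nn.CrossEntropyLoss()(model(X), y)
print(hessian_spectral_norm(loss, list(model.parameters())))
```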
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.