Randomness of Low-Layer Parameters Determines Confusing Samples in Terms of Interaction Representations of a DNN
- URL: http://arxiv.org/abs/2502.08625v1
- Date: Wed, 12 Feb 2025 18:25:13 GMT
- Title: Randomness of Low-Layer Parameters Determines Confusing Samples in Terms of Interaction Representations of a DNN
- Authors: Junpeng Zhang, Lei Cheng, Qing Li, Liang Lin, Quanshi Zhang
- Abstract summary: We find that the complexity of interactions encoded by a deep neural network (DNN) can explain its generalization power.
We also discover that the confusing samples of a DNN, which are represented by non-generalizable interactions, are determined by its low-layer parameters.
- Abstract: In this paper, we find that the complexity of interactions encoded by a deep neural network (DNN) can explain its generalization power. We also discover that the confusing samples of a DNN, which are represented by non-generalizable interactions, are determined by its low-layer parameters. In comparison, other factors, such as high-layer parameters and network architecture, have much less impact on the composition of confusing samples. Two DNNs with different low-layer parameters usually have entirely different sets of confusing samples, even though they achieve similar performance. This finding extends the understanding of the lottery ticket hypothesis and helps explain the distinctive representation power of different DNNs.
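The interactions referenced here follow the Harsanyi-dividend formulation used throughout this line of work: an interaction effect I(S) aggregates the model's outputs on masked inputs over all subsets of a variable set S, and the order |S| measures the interaction's complexity. A minimal sketch of the computation, assuming a scalar model output and a zero baseline for masked variables (the toy model and the number of variables are illustrative, not the paper's setup):

```python
from itertools import chain, combinations

import numpy as np


def all_subsets(indices):
    """Enumerate every subset of a set of variable indices."""
    return chain.from_iterable(combinations(indices, r) for r in range(len(indices) + 1))


def masked_output(model, x, baseline, subset):
    """Keep variables in `subset`, replace the rest with a baseline value."""
    x_masked = baseline.copy()
    idx = list(subset)
    x_masked[idx] = x[idx]
    return model(x_masked)


def harsanyi_interaction(model, x, baseline, S):
    """I(S) = sum_{T subseteq S} (-1)^{|S|-|T|} v(x_T)."""
    return sum(
        (-1) ** (len(S) - len(T)) * masked_output(model, x, baseline, T)
        for T in all_subsets(S)
    )


# Toy example: a hand-written scalar "network" over 4 input variables.
if __name__ == "__main__":
    model = lambda z: float(z[0] * z[1] + 0.5 * z[2])  # v(x): scalar output
    x = np.array([1.0, 1.0, 1.0, 1.0])
    baseline = np.zeros(4)
    for S in [(0,), (0, 1), (0, 1, 2)]:
        print(S, harsanyi_interaction(model, x, baseline, S))
```

For this toy model, the pairwise term I({0, 1}) recovers the multiplicative effect between the first two variables, while the third-order term vanishes; the paper's "confusing samples" are those whose output is dominated by high-order, non-generalizable terms of this kind.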
Related papers
- Learning local discrete features in explainable-by-design convolutional neural networks [0.0]
We introduce an explainable-by-design convolutional neural network (CNN) based on the lateral inhibition mechanism.
The model consists of a predictor, a high-accuracy CNN with residual or dense skip connections.
By collecting observations and directly calculating probabilities, we can explain causal relationships between motifs of adjacent levels.
arXiv Detail & Related papers (2024-10-31T18:39:41Z)
- How Uniform Random Weights Induce Non-uniform Bias: Typical Interpolating Neural Networks Generalize with Narrow Teachers [37.54951110709193]
We show that a 'flat' prior over the NN parameterization induces a rich prior over the NN functions.
This creates a bias towards simpler functions, which require fewer relevant parameters to represent, as the sketch after this entry illustrates.
arXiv Detail & Related papers (2024-02-09T11:03:52Z)
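One way to observe this bias empirically is to sample many small networks under an isotropic Gaussian prior on the weights (one concrete stand-in for 'flat') and tally the boolean functions they compute: a handful of simple functions absorb most of the probability mass. The architecture and sample count below are illustrative assumptions, not the paper's construction:

```python
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_samples = 3, 8, 20000

# All 2^3 boolean inputs, in {-1, +1} encoding.
X = np.array([[(i >> b & 1) * 2 - 1 for b in range(n_in)] for i in range(2 ** n_in)])

counts = Counter()
for _ in range(n_samples):
    # "Flat" (here: isotropic Gaussian) prior over the parameters.
    W1 = rng.normal(size=(n_in, n_hidden))
    w2 = rng.normal(size=n_hidden)
    logits = np.maximum(X @ W1, 0.0) @ w2  # one-hidden-layer ReLU net
    counts[tuple(bool(t) for t in logits > 0)] += 1

# A flat prior over weights is far from flat over functions: a few
# (typically simple) boolean functions dominate the sampled mass.
for fn, c in counts.most_common(5):
    print(f"{c / n_samples:.3f}  {fn}")
print(f"{len(counts)} distinct functions out of {2 ** (2 ** n_in)} possible")
```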
- Interpretable Neural Networks with Random Constructive Algorithm [3.1200894334384954]
This paper introduces an Interpretable Neural Network (INN) incorporating spatial information to tackle the opaque parameterization process of random weighted neural networks.
It devises a geometric relationship strategy, using a pool of candidate nodes and the established relationships among them, to select node parameters conducive to network convergence.
arXiv Detail & Related papers (2023-07-01T01:07:20Z)
- Where We Have Arrived in Proving the Emergence of Sparse Symbolic Concepts in AI Models [21.02716340199201]
This study aims to prove the emergence of symbolic concepts (or, more precisely, sparse primitive inference patterns) in well-trained deep neural networks (DNNs).
We show that the inference scores of the DNN on an exponentially large number of randomly masked samples can always be well mimicked by the numerical effects of just a few interactions, as the sketch after this entry illustrates.
arXiv Detail & Related papers (2023-05-03T07:32:28Z)
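A self-contained sketch of this mimicking property: compute all Harsanyi interactions of a small toy model, keep only the k largest in magnitude, and check that they still reproduce the model's output on every one of the 2^n masked samples. The model, input, and cutoff k are illustrative assumptions:

```python
from itertools import chain, combinations

import numpy as np


def subsets(idx):
    return chain.from_iterable(combinations(idx, r) for r in range(len(idx) + 1))


def v(model, x, baseline, T):
    """Model output on the input masked outside T."""
    z = baseline.copy()
    z[list(T)] = x[list(T)]
    return model(z)


n = 4
model = lambda z: float(np.tanh(z[0] + z[1]) + z[2] * z[3])
x, baseline = np.ones(n), np.zeros(n)

# Harsanyi dividends over all 2^n subsets.
I = {S: sum((-1) ** (len(S) - len(T)) * v(model, x, baseline, T) for T in subsets(S))
     for S in subsets(range(n))}

# Keep only the k interactions with the largest magnitude ...
k = 6
top = dict(sorted(I.items(), key=lambda kv: -abs(kv[1]))[:k])

# ... and check that they still reproduce v on every masked sample:
# exactly, v(x_T) = sum_{S subseteq T} I(S); here only `top` is summed.
err = max(abs(v(model, x, baseline, T) - sum(val for S, val in top.items() if set(S) <= set(T)))
          for T in subsets(range(n)))
print(f"max reconstruction error with {k}/{2 ** n} interactions: {err:.4f}")
```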
- Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by simply filtering out "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z)
- Discovering and Explaining the Representation Bottleneck of DNNs [21.121270460158712]
This paper explores the bottleneck of feature representations of deep neural networks (DNNs).
We focus on the multi-order interaction between input variables, where the order represents the complexity of an interaction.
We discover that a DNN is more likely to encode both overly simple and overly complex interactions, but usually fails to learn interactions of intermediate complexity; the sketch after this entry shows how per-order interaction strength can be measured.
arXiv Detail & Related papers (2021-11-11T14:35:20Z)
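One way to probe this bottleneck numerically is to group interaction strengths |I(S)| by order |S| and compare the per-order averages; a minimal sketch, with a random shallow network standing in (purely for illustration) for a trained DNN:

```python
from collections import defaultdict
from itertools import chain, combinations

import numpy as np


def subsets(idx):
    return chain.from_iterable(combinations(idx, r) for r in range(len(idx) + 1))


def v(model, x, baseline, T):
    z = baseline.copy()
    z[list(T)] = x[list(T)]
    return model(z)


n = 5
rng = np.random.default_rng(0)
W = rng.normal(size=(n, n))
model = lambda z: float(np.tanh(z @ W).sum())  # stand-in for a trained DNN's scalar output

x, baseline = np.ones(n), np.zeros(n)
strength = defaultdict(list)
for S in subsets(range(n)):
    I_S = sum((-1) ** (len(S) - len(T)) * v(model, x, baseline, T) for T in subsets(S))
    strength[len(S)].append(abs(I_S))

# Average interaction strength per order; the bottleneck claim is that trained
# DNNs concentrate strength at very low and very high orders.
for order in sorted(strength):
    print(f"order {order}: mean |I(S)| = {np.mean(strength[order]):.4f}")
```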
- Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information and differ from each other only by statistically independent noise, as the sketch after this entry illustrates.
arXiv Detail & Related papers (2021-06-07T10:18:54Z)
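A minimal sketch of how such redundancy can be detected: correlate the activations of last-hidden-layer neurons and group near-duplicates. Synthetic activations with a planted group structure stand in for a real wide CNN's features:

```python
import numpy as np

# Toy stand-in for last-hidden-layer activations: a (n_samples, width) matrix.
# In the paper's setting these would come from a wide, trained CNN.
rng = np.random.default_rng(1)
n_samples, n_signals, copies = 1000, 4, 8
signal = rng.normal(size=(n_samples, n_signals))
acts = np.repeat(signal, copies, axis=1) + 0.1 * rng.normal(size=(n_samples, n_signals * copies))

# Pairwise correlations between neurons; near-1 entries mark redundant groups.
corr = np.corrcoef(acts.T)

# Greedy grouping: neurons correlated above a threshold carry (almost) the
# same information and differ only by independent noise.
threshold, groups, unassigned = 0.9, [], set(range(acts.shape[1]))
while unassigned:
    i = unassigned.pop()
    group = {i} | {j for j in unassigned if corr[i, j] > threshold}
    unassigned -= group
    groups.append(sorted(group))

print(f"{acts.shape[1]} neurons -> {len(groups)} redundant groups: {groups}")
```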
- Post-mortem on a deep learning contest: a Simpson's paradox and the complementary roles of scale metrics versus shape metrics [61.49826776409194]
We analyze a corpus of models made publicly available for a contest to predict the generalization accuracy of neural network (NN) models.
We identify what amounts to a Simpson's paradox, where "scale" metrics perform well overall but poorly on subpartitions of the data.
We present two novel shape metrics, one data-independent and the other data-dependent, which can predict trends in the test accuracy of a series of NNs.
arXiv Detail & Related papers (2021-06-01T19:19:49Z)
- ACDC: Weight Sharing in Atom-Coefficient Decomposed Convolution [57.635467829558664]
We introduce a structural regularization across convolutional kernels in a CNN.
We show that CNNs maintain performance with a dramatic reduction in parameters and computations; the sketch after this entry illustrates the atom-coefficient decomposition.
arXiv Detail & Related papers (2020-09-04T20:41:47Z)
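The core weight-sharing idea, convolution kernels composed from a small shared dictionary of spatial atoms, can be sketched as below; the layer sizes and atom count are illustrative assumptions, and ACDC's actual decomposition has additional details:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DecomposedConv2d(nn.Module):
    """Conv layer whose kernels are linear combinations of shared spatial atoms.

    Only the per-kernel coefficients and the shared atoms are learned, which
    is where the parameter reduction comes from.
    """

    def __init__(self, in_ch, out_ch, kernel_size=3, n_atoms=6):
        super().__init__()
        # Shared dictionary: n_atoms spatial atoms of shape (k, k).
        self.atoms = nn.Parameter(torch.randn(n_atoms, kernel_size, kernel_size))
        # One coefficient vector per (out_ch, in_ch) kernel.
        self.coeffs = nn.Parameter(torch.randn(out_ch, in_ch, n_atoms))

    def forward(self, x):
        # Compose full kernels on the fly: (out_ch, in_ch, k, k).
        weight = torch.einsum("oin,nkl->oikl", self.coeffs, self.atoms)
        return F.conv2d(x, weight, padding=weight.shape[-1] // 2)


layer = DecomposedConv2d(16, 32)
full = 32 * 16 * 3 * 3
shared = 6 * 3 * 3 + 32 * 16 * 6
print(layer(torch.randn(1, 16, 8, 8)).shape, f"params: {shared} vs {full} for a full conv")
```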
- GraN: An Efficient Gradient-Norm Based Detector for Adversarial and Misclassified Examples [77.99182201815763]
Deep neural networks (DNNs) are vulnerable to adversarial examples and other data perturbations.
GraN is a time- and parameter-efficient method that is easily adaptable to any DNN.
GraN achieves state-of-the-art performance on numerous problem set-ups.
arXiv Detail & Related papers (2020-04-20T10:09:27Z)
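The full GraN pipeline feeds layer-wise gradient norms into a lightweight classifier; the sketch below shows only the core signal, scoring an input by the norm of the loss gradient taken at the model's own prediction. The stand-in classifier and the idea of calibrating a threshold on held-out clean data are assumptions:

```python
import torch
import torch.nn as nn


def gradient_norm_score(model, x, loss_fn=nn.CrossEntropyLoss()):
    """Score an input by the loss-gradient norm at the model's own prediction.

    High scores tend to flag adversarial or misclassified inputs; in practice
    the decision threshold would be calibrated on held-out clean data.
    """
    model.zero_grad()
    logits = model(x)
    pred = logits.argmax(dim=1)  # use the model's prediction as the label
    loss = loss_fn(logits, pred)
    loss.backward()
    sq = sum((p.grad ** 2).sum() for p in model.parameters() if p.grad is not None)
    return sq.sqrt().item()


# Toy usage with a stand-in classifier.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.randn(1, 1, 28, 28)
print("gradient-norm score:", gradient_norm_score(model, x))
```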
This list is automatically generated from the titles and abstracts of the papers on this site.