Role Taxonomy of Units in Deep Neural Networks
- URL: http://arxiv.org/abs/2011.00789v2
- Date: Mon, 20 Nov 2023 08:48:07 GMT
- Title: Role Taxonomy of Units in Deep Neural Networks
- Authors: Yang Zhao, Hao Zhang and Xiuyuan Hu
- Abstract summary: We identify the role of network units in deep neural networks (DNNs) via the retrieval-of-function test.
We show that ratios of the four categories are highly associated with the generalization ability of DNNs from two distinct perspectives.
- Score: 15.067182415076148
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Identifying the role of network units in deep neural networks (DNNs) is
critical in many aspects including giving understandings on the mechanisms of
DNNs and building basic connections between deep learning and neuroscience.
However, it remains unclear which roles the units in DNNs with different
generalization abilities can present. To this end, we give a role taxonomy of
units in DNNs by introducing the retrieval-of-function test, in which units are
categorized into four types according to their functional preference on the
training set and the testing set separately. We show that the ratios of the four
categories are highly associated with the generalization ability of DNNs from
two distinct perspectives, based on which we give signs of DNNs that
generalize well.
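The summary above does not spell out the retrieval-of-function test, so the sketch below only illustrates the general idea of a four-way unit taxonomy: each hidden unit is ablated in turn and labeled by whether the ablation degrades accuracy on the training set, the testing set, both, or neither. The toy model, the data, the category names, and the ablation-based criterion are all assumptions for illustration, not the paper's actual procedure.

```python
# Hypothetical illustration of a four-way role taxonomy for hidden units.
# Assumption: a unit's "functional preference" is approximated here by whether
# ablating (zeroing) it degrades accuracy on the training and/or testing set;
# the paper's retrieval-of-function test may be defined differently.
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)

# Toy two-layer network with random weights on synthetic data.
X_train, X_test = rng.normal(size=(200, 10)), rng.normal(size=(100, 10))
y_train, y_test = (X_train[:, 0] > 0).astype(int), (X_test[:, 0] > 0).astype(int)
W1, W2 = rng.normal(size=(10, 16)), rng.normal(size=(16, 2))

def accuracy(X, y, mask):
    h = np.maximum(X @ W1, 0) * mask          # ReLU hidden layer, some units ablated
    return np.mean((h @ W2).argmax(axis=1) == y)

full = np.ones(16)
base_tr, base_te = accuracy(X_train, y_train, full), accuracy(X_test, y_test, full)

def role(unit, tol=0.01):
    mask = full.copy()
    mask[unit] = 0.0                          # ablate one hidden unit
    drop_tr = bool(base_tr - accuracy(X_train, y_train, mask) > tol)
    drop_te = bool(base_te - accuracy(X_test, y_test, mask) > tol)
    return {(True, True): "both-preferred",   # illustrative category names
            (True, False): "train-only",
            (False, True): "test-only",
            (False, False): "neither"}[(drop_tr, drop_te)]

# Ratios of the four categories across all hidden units.
print(Counter(role(u) for u in range(16)))
```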
Related papers
- Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features [68.3512123520931]
We investigate the dynamics of a deep neural network (DNN) learning interactions.
In this paper, we discover the DNN learns interactions in two phases.
The first phase mainly penalizes interactions of medium and high orders, and the second phase mainly learns interactions of gradually increasing orders.
arXiv Detail & Related papers (2024-05-16T17:13:25Z)
- Deep Neural Networks via Complex Network Theory: a Perspective [3.1023851130450684]
Deep Neural Networks (DNNs) can be represented as graphs whose links and vertices iteratively process data and solve tasks sub-optimally. Complex Network Theory (CNT), merging statistical physics with graph theory, provides a method for interpreting neural networks by analysing their weights and neuron structures.
In this work, we extend the existing CNT metrics with measures that sample from the DNNs' training distribution, shifting from a purely topological analysis to one that connects with the interpretability of deep learning.
arXiv Detail & Related papers (2024-04-17T08:42:42Z)
- Rethinking the Relationship between Recurrent and Non-Recurrent Neural Networks: A Study in Sparsity [0.0]
We show that many common neural network models, such as Recurrent Neural Networks (RNNs), can be represented as iterative maps.
RNNs are known to be Turing complete, and therefore capable of representing any computable function.
This perspective leads to several insights that illuminate both theoretical and practical aspects of NNs.
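As a concrete reading of the "iterative map" view, the sketch below applies one fixed update map repeatedly to a hidden state. The tanh update, weight scales, and dimensions are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the "RNN as iterative map" view: the hidden state is
# updated by repeatedly applying the same map F(h, x) = tanh(W h + U x).
# W, U, and the sizes below are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(1)
W, U = rng.normal(size=(8, 8)) * 0.1, rng.normal(size=(8, 4)) * 0.1

def step(h, x):
    return np.tanh(W @ h + U @ x)   # one application of the iterative map

h = np.zeros(8)
for x in rng.normal(size=(5, 4)):   # iterate the map over a length-5 input sequence
    h = step(h, x)
print(h)
```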
arXiv Detail & Related papers (2024-04-01T03:18:42Z)
- Transferability of coVariance Neural Networks and Application to Interpretable Brain Age Prediction using Anatomical Features [119.45320143101381]
Graph convolutional networks (GCN) leverage topology-driven graph convolutional operations to combine information across the graph for inference tasks.
We have studied GCNs with covariance matrices as graphs, in the form of coVariance neural networks (VNNs).
VNNs inherit the scale-free data processing architecture from GCNs and here, we show that VNNs exhibit transferability of performance over datasets whose covariance matrices converge to a limit object.
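A minimal sketch of a coVariance filter in the spirit of the summary above: a graph filter that is a polynomial in the sample covariance matrix C, applied as z = sum_k h_k C^k x. The filter taps, data, and sizes are illustrative assumptions.

```python
# Hedged sketch of a coVariance-style filter: a polynomial in the sample
# covariance matrix C plays the role of the graph filter, z = sum_k h_k C^k x.
# Coefficients and sizes are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 6))            # dataset: 500 samples, 6 features
C = np.cov(X, rowvar=False)              # sample covariance used as the "graph"
h = np.array([1.0, 0.5, 0.25])           # filter taps (order-2 polynomial)

def covariance_filter(x):
    z, Ck_x = np.zeros_like(x), x
    for hk in h:
        z = z + hk * Ck_x                # accumulate h_k * C^k x
        Ck_x = C @ Ck_x
    return z

print(covariance_filter(rng.normal(size=6)))
```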
arXiv Detail & Related papers (2023-05-02T22:15:54Z)
- Tensor Networks Meet Neural Networks: A Survey and Future Perspectives [27.878669143107885]
Tensor networks (TNs) and neural networks (NNs) are two fundamental data modeling approaches.
TNs solve the curse of dimensionality in large-scale tensors by converting an exponential number of dimensions to polynomial complexity.
NNs have displayed exceptional performance in various applications, e.g., computer vision, natural language processing, and robotics research.
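As a toy illustration of the dimensionality argument, the sketch below builds a 256x256 weight as a Kronecker product of two 16x16 factors and applies it without ever forming the dense matrix. This is a deliberately simple stand-in for the more general decompositions (for example, tensor trains) covered by the survey; the sizes are assumptions.

```python
# Illustrative sketch (not from the survey) of a tensorized weight: a 256x256
# matrix (65,536 parameters) is represented as a Kronecker product of two
# 16x16 factors (512 parameters) and applied without materializing it.
import numpy as np

rng = np.random.default_rng(3)
A, B = rng.normal(size=(16, 16)), rng.normal(size=(16, 16))
W = np.kron(A, B)                               # dense 256x256 weight, for checking only

x = rng.normal(size=256)
y_dense = W @ x                                  # dense multiply
y_factored = (A @ x.reshape(16, 16) @ B.T).reshape(-1)  # uses only the small factors

print(np.allclose(y_dense, y_factored))          # True: same result
print(W.size, A.size + B.size)                   # 65536 vs 512 parameters
```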
arXiv Detail & Related papers (2023-01-22T17:35:56Z)
- Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs [71.93227401463199]
This paper pinpoints the major source of GNNs' performance gain to their intrinsic capability by introducing an intermediate model class dubbed P(ropagational)MLP.
We observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient in training.
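A hedged sketch of the PMLP idea as summarized above: reuse an MLP's weights but add graph propagation at inference time. The adjacency normalization, the two-layer form, and where propagation is inserted are assumptions for illustration.

```python
# Hedged sketch of a PMLP-style model: the weights are those of a plain MLP,
# but normalized graph propagation is applied at inference time.
# Graph construction and layer placement here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
n, d, h, c = 6, 5, 8, 3                    # nodes, features, hidden units, classes
X = rng.normal(size=(n, d))
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.clip(np.maximum(A, A.T) + np.eye(n), 0, 1)   # symmetric adjacency with self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt        # normalized propagation operator

W1, W2 = rng.normal(size=(d, h)), rng.normal(size=(h, c))

def mlp(X):                                # plain MLP (how a PMLP is trained)
    return np.maximum(X @ W1, 0) @ W2

def pmlp(X):                               # same weights, plus propagation at test time
    return A_hat @ np.maximum(A_hat @ X @ W1, 0) @ W2

print(mlp(X).argmax(1), pmlp(X).argmax(1))
```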
arXiv Detail & Related papers (2022-12-18T08:17:32Z)
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
The study of the NTK has been devoted to typical neural network architectures, but it is incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
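The sketch below shows the kind of Hadamard-product block that polynomial nets build on: two linear maps of the input multiplied element-wise, so the output is a degree-2 polynomial in the input. The sizes and the single-block structure are illustrative assumptions; the NTK derivation itself is not reproduced here.

```python
# Minimal sketch of a Hadamard-product block in the spirit of polynomial nets:
# the element-wise product of two linear maps makes the output quadratic in x.
# Sizes and the single-block architecture are illustrative.
import numpy as np

rng = np.random.default_rng(5)
W1, W2 = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
V = rng.normal(size=(1, 8))

def hadamard_block(x):
    return (W1 @ x) * (W2 @ x)             # Hadamard product of two linear maps

x = rng.normal(size=4)
print(V @ hadamard_block(x))               # scalar output, degree-2 polynomial in x
```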
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
- Deep Neural Networks as Complex Networks [1.704936863091649]
We use Complex Network Theory to represent Deep Neural Networks (DNNs) as directed weighted graphs.
We introduce metrics to study DNNs as dynamical systems, with a granularity that spans from weights to layers, including neurons.
We show that our metrics discriminate low vs. high performing networks.
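As a small illustration of the graph view described above, the sketch below treats each weight of a feed-forward network as a directed weighted edge and computes node strength (the sum of absolute incident weights) per neuron. The metric choice and the network sizes are generic assumptions, not the paper's specific metrics.

```python
# Illustrative sketch of a DNN as a directed weighted graph: each weight
# W[i, j] is an edge, and a simple complex-network metric (node strength)
# is computed per neuron. The metric here is a generic example.
import numpy as np

rng = np.random.default_rng(7)
weights = [rng.normal(size=(10, 16)), rng.normal(size=(16, 4))]   # layer-to-layer edges

def node_strength(weights):
    strengths = []
    for layer, W in enumerate(weights):
        incoming = np.abs(W).sum(axis=0)     # strength of each target neuron
        strengths.append((layer + 1, incoming))
    return strengths

for layer, s in node_strength(weights):
    print(f"layer {layer}: mean node strength {s.mean():.3f}")
```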
arXiv Detail & Related papers (2022-09-12T16:26:04Z)
- On Feature Learning in Neural Networks with Global Convergence Guarantees [49.870593940818715]
We study the optimization of wide neural networks (NNs) via gradient flow (GF).
We show that when the input dimension is no less than the size of the training set, the training loss converges to zero at a linear rate under GF.
We also show empirically that, unlike in the Neural Tangent Kernel (NTK) regime, our multi-layer model exhibits feature learning and can achieve better generalization performance than its NTK counterpart.
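Gradient flow (GF) is the continuous-time limit of gradient descent, dtheta/dt = -grad L(theta). The sketch below forward-Euler-discretizes it on a plain linear least-squares problem whose input dimension exceeds the number of training points, so the training residual is driven toward zero. This is a vastly simplified stand-in for the wide-NN setting in the paper, with all quantities chosen for illustration.

```python
# Forward-Euler discretization of gradient flow on a toy least-squares loss.
# With input dimension (8) at least the number of samples (5), the residual
# is driven toward zero, loosely echoing the convergence claim above.
import numpy as np

rng = np.random.default_rng(6)
X, y = rng.normal(size=(5, 8)), rng.normal(size=5)
theta = np.zeros(8)

def grad(theta):
    return X.T @ (X @ theta - y) / len(y)    # gradient of 0.5 * mean squared error

dt = 0.05
for _ in range(5000):                        # Euler steps approximating the flow
    theta = theta - dt * grad(theta)
print(np.linalg.norm(X @ theta - y))         # near-zero training residual
```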
arXiv Detail & Related papers (2022-04-22T15:56:43Z)
- Confidence Dimension for Deep Learning based on Hoeffding Inequality and Relative Evaluation [44.393256948610016]
We propose to use multiple factors to measure and rank the relative generalization of deep neural networks (DNNs), based on a new concept of confidence dimension (CD).
Our CD yields a consistent and reliable measure and ranking for both full-precision DNNs and binary neural networks (BNNs) on all the tasks.
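For context, the Hoeffding inequality referenced in the title bounds how far an empirical mean can deviate from its expectation; how the paper turns such a bound into the confidence dimension is not detailed in the summary above.

```latex
% Hoeffding's inequality for n i.i.d. random variables X_i \in [a, b],
% with empirical mean \bar{X}_n and expectation \mu:
\Pr\!\left( \left| \bar{X}_n - \mu \right| \ge t \right)
  \le 2 \exp\!\left( - \frac{2 n t^2}{(b - a)^2} \right)
```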
arXiv Detail & Related papers (2022-03-17T04:43:43Z)
- Exploiting Heterogeneity in Operational Neural Networks by Synaptic Plasticity [87.32169414230822]
The recently proposed network model, Operational Neural Networks (ONNs), can generalize conventional Convolutional Neural Networks (CNNs).
In this study, the focus is on searching for the best-possible operator set(s) for the hidden neurons of the network based on the Synaptic Plasticity paradigm, which poses the essential learning theory in biological neurons.
Experimental results over highly challenging problems demonstrate that elite ONNs, even with few neurons and layers, can achieve a learning performance superior to that of GIS-based ONNs.
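A hedged sketch of the generalized neuron model behind ONNs as described above: the fixed multiply-and-sum of a convolutional neuron is replaced by a nodal operator and a pool operator drawn from an operator set, followed by an activation. The specific operators below are illustrative examples, not the operator sets found by the Synaptic Plasticity search.

```python
# Hedged sketch of an ONN-style generalized neuron: activation(pool(nodal(w, x))).
# The operator choices below are illustrative; with "mult" and "sum" the neuron
# reduces to a conventional CNN-style multiply-and-sum neuron.
import numpy as np

nodal_ops = {"mult": lambda w, x: w * x, "sin": lambda w, x: np.sin(w * x)}
pool_ops = {"sum": np.sum, "max": np.max}

def onn_neuron(w, x, nodal="sin", pool="max"):
    return np.tanh(pool_ops[pool](nodal_ops[nodal](w, x)))

w, x = np.array([0.5, -1.0, 2.0]), np.array([1.0, 0.3, -0.7])
print(onn_neuron(w, x))                    # generalized neuron output
print(onn_neuron(w, x, "mult", "sum"))     # conventional CNN-style neuron
```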
arXiv Detail & Related papers (2020-08-21T19:03:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.