Estimating the Generalization in Deep Neural Networks via Sparsity
- URL: http://arxiv.org/abs/2104.00851v3
- Date: Mon, 20 Nov 2023 08:50:33 GMT
- Title: Estimating the Generalization in Deep Neural Networks via Sparsity
- Authors: Yang Zhao and Hao Zhang
- Abstract summary: We propose a novel method for estimating the generalization gap based on network sparsity.
By training DNNs with a wide range of generalization gaps on popular datasets, we show that our key quantities and linear model can serve as efficient tools for estimating the generalization gap of DNNs.
- Score: 15.986873241115651
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generalization is the key capability for deep neural networks (DNNs).
However, it is challenging to give a reliable measure of the generalization
ability of a DNN from the trained network alone. In this paper, we propose a
novel method for estimating the generalization gap based on network sparsity.
In our method, two key quantities are proposed first. They are closely related
to the generalization ability and can be calculated directly from the training
results alone. A simple linear model involving the two key quantities is then
constructed to give an accurate estimate of the generalization gap. By training
DNNs with a wide range of generalization gaps on popular datasets, we show that
our key quantities and linear model can serve as efficient tools for estimating
the generalization gap of DNNs.
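As a concrete illustration of this pipeline, the following is a minimal sketch, not the paper's reference implementation. It assumes the two key quantities can be summarized by activation-sparsity statistics (the mean and spread of the fraction of inactive ReLU units measured on a probe set) and fits the linear model by ordinary least squares over a pool of trained networks with known generalization gaps; all function names and the exact definitions of the quantities are illustrative assumptions.

```python
# Minimal sketch (illustrative stand-ins, not the authors' exact formulation):
# estimate a generalization gap from activation-sparsity statistics plus a
# fitted linear model. Assumes the network uses nn.ReLU activation modules.
import numpy as np
import torch
import torch.nn as nn


def sparsity_quantities(model: nn.Module, probe_loader) -> tuple[float, float]:
    """Return (mean, std) of the per-batch fraction of inactive ReLU units."""
    fractions, per_layer, hooks = [], [], []

    def hook(_module, _inputs, output):
        # Fraction of units that are zeroed out by this ReLU on the batch.
        per_layer.append((output <= 0).float().mean().item())

    for m in model.modules():
        if isinstance(m, nn.ReLU):
            hooks.append(m.register_forward_hook(hook))

    model.eval()
    with torch.no_grad():
        for x, _ in probe_loader:
            per_layer.clear()
            model(x)
            fractions.append(float(np.mean(per_layer)))

    for h in hooks:
        h.remove()
    return float(np.mean(fractions)), float(np.std(fractions))


def fit_gap_model(quantities: np.ndarray, gaps: np.ndarray) -> np.ndarray:
    """Least-squares fit of gap ~ w0 + w1*q1 + w2*q2 over many trained nets."""
    X = np.hstack([np.ones((len(quantities), 1)), quantities])
    coef, *_ = np.linalg.lstsq(X, gaps, rcond=None)
    return coef


def predict_gap(coef: np.ndarray, q1: float, q2: float) -> float:
    """Apply the fitted linear model to a new network's two quantities."""
    return float(coef[0] + coef[1] * q1 + coef[2] * q2)
```

In practice, one would compute the two quantities for a pool of trained DNNs with measured generalization gaps, fit the linear model once, and then apply predict_gap to a new network using only its training results.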
Related papers
- On Generalization Bounds for Deep Compound Gaussian Neural Networks [1.4425878137951238]
Unrolled deep neural networks (DNNs) provide better interpretability and superior empirical performance compared with standard DNNs.
We develop novel generalization error bounds for a class of unrolled DNNs informed by a compound Gaussian prior.
Under realistic conditions, we show that, at worst, the generalization error scales as $\mathcal{O}(n\sqrt{n})$ in the signal dimension and as $\mathcal{O}((\text{Network Size})^{3/2})$ in the network size.
arXiv Detail & Related papers (2024-02-20T16:01:39Z) - Learning Expressive Priors for Generalization and Uncertainty Estimation
in Neural Networks [77.89179552509887]
We propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks.
The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees.
We exhaustively show the effectiveness of this method for uncertainty estimation and generalization.
arXiv Detail & Related papers (2023-07-15T09:24:33Z) - Towards Understanding the Generalization of Graph Neural Networks [9.217947432437546]
Graph neural networks (GNNs) are the most widely adopted models for learning and representation on graph-structured data.
We first establish high-probability bounds on the generalization gap and gradients in transductive learning.
The theoretical results reveal the architecture specific factors affecting the generalization gap.
arXiv Detail & Related papers (2023-05-14T03:05:14Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - Graph Neural Networks are Inherently Good Generalizers: Insights by
Bridging GNNs and MLPs [71.93227401463199]
This paper pinpoints the major source of GNNs' performance gain as their intrinsic capability, by introducing an intermediate model class dubbed P(ropagational)MLP (a minimal sketch of this idea follows the list below).
We observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient in training.
arXiv Detail & Related papers (2022-12-18T08:17:32Z) - Towards Better Out-of-Distribution Generalization of Neural Algorithmic
Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z) - Why Robust Generalization in Deep Learning is Difficult: Perspective of
Expressive Power [15.210336733607488]
We show that for binary classification problems with well-separated data, there exists a constant robust generalization gap unless the size of the neural network is exponential.
We establish an improved upper bound of $\exp(\mathcal{O}(k))$ for the network size to achieve low robust generalization error.
arXiv Detail & Related papers (2022-05-27T09:53:04Z) - Confidence Dimension for Deep Learning based on Hoeffding Inequality and
Relative Evaluation [44.393256948610016]
We propose to use multiple factors to measure and rank the relative generalization of deep neural networks (DNNs) based on a new concept of confidence dimension (CD).
Our CD yields a consistent and reliable measure and ranking for both full-precision DNNs and binary neural networks (BNNs) on all the tasks.
arXiv Detail & Related papers (2022-03-17T04:43:43Z) - A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs [11.152761263415046]
This paper focuses on understanding how the generalization error scales with the amount of training data for deep neural networks (DNNs).
We derive estimates of the generalization error that hold for deep networks and do not rely on unattainable capacity measures.
arXiv Detail & Related papers (2021-05-05T05:14:08Z) - A Survey on Assessing the Generalization Envelope of Deep Neural
Networks: Predictive Uncertainty, Out-of-distribution and Adversarial Samples [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art performance on numerous applications.
It is difficult to tell beforehand whether a DNN receiving an input will deliver the correct output, since its decision criteria are usually nontransparent.
This survey connects the three fields within the larger framework of investigating the generalization performance of machine learning methods and in particular DNNs.
arXiv Detail & Related papers (2020-08-21T09:12:52Z) - Fast Learning of Graph Neural Networks with Guaranteed Generalizability:
One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
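The P(ropagational)MLP model class referenced above also admits a compact illustration. Below is a minimal sketch under assumed design choices, not the paper's reference code: the network is trained as a plain MLP on node features, and a normalized-adjacency averaging step (one form of message passing) is enabled only at inference time; the layer sizes and propagation scheme are illustrative.

```python
# Minimal PMLP-style sketch (illustrative assumptions, not the paper's code):
# train as a plain MLP on node features; enable message passing only at test time.
from typing import Optional

import torch
import torch.nn as nn


class PMLP(nn.Module):
    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, out_dim)

    def propagate(self, h: torch.Tensor, adj_norm: torch.Tensor) -> torch.Tensor:
        # Assumed propagation step: averaging over a normalized adjacency matrix.
        return adj_norm @ h

    def forward(
        self,
        x: torch.Tensor,
        adj_norm: Optional[torch.Tensor] = None,
        use_mp: bool = False,
    ) -> torch.Tensor:
        h = torch.relu(self.fc1(x))
        if use_mp and adj_norm is not None:  # message passing only at inference
            h = self.propagate(h, adj_norm)
        return self.fc2(h)


# Training: model(x) with use_mp=False behaves as a pure MLP.
# Inference: model(x, adj_norm, use_mp=True) recovers GNN-style predictions.
```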