Estimating the Generalization in Deep Neural Networks via Sparsity
- URL: http://arxiv.org/abs/2104.00851v3
- Date: Mon, 20 Nov 2023 08:50:33 GMT
- Title: Estimating the Generalization in Deep Neural Networks via Sparsity
- Authors: Yang Zhao and Hao Zhang
- Abstract summary: We propose a novel method for estimating the generalization gap based on network sparsity.
By training DNNs with a wide range of generalization gaps on popular datasets, we show that our key quantities and linear model can serve as efficient tools for estimating the generalization gap of DNNs.
- Score: 15.986873241115651
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generalization is the key capability for deep neural networks (DNNs).
However, it is challenging to give a reliable measure of the generalization
ability of a DNN from the trained network alone. In this paper, we propose a
novel method for estimating the generalization gap based on network sparsity.
In our method, two key quantities are proposed first. They are closely related
to the generalization ability and can be calculated directly from the training
results alone. A simple linear model involving the two key quantities is then
constructed to give an accurate estimate of the generalization gap. By training
DNNs with a wide range of generalization gaps on popular datasets, we show that
our key quantities and linear model can serve as efficient tools for estimating
the generalization gap of DNNs.
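As a concrete illustration of this pipeline, the following is a minimal sketch, not the paper's reference implementation. It assumes the two key quantities can be summarized by activation-sparsity statistics (the mean and spread of the fraction of inactive ReLU units measured on a probe set) and fits the linear model by ordinary least squares over a pool of trained networks with known generalization gaps; all function names and the exact definitions of the quantities are illustrative assumptions.

```python
# Minimal sketch (illustrative stand-ins, not the authors' exact formulation):
# estimate a generalization gap from activation-sparsity statistics plus a
# fitted linear model. Assumes the network uses nn.ReLU activation modules.
import numpy as np
import torch
import torch.nn as nn


def sparsity_quantities(model: nn.Module, probe_loader) -> tuple[float, float]:
    """Return (mean, std) of the per-batch fraction of inactive ReLU units."""
    fractions, per_layer, hooks = [], [], []

    def hook(_module, _inputs, output):
        # Fraction of units that are zeroed out by this ReLU on the batch.
        per_layer.append((output <= 0).float().mean().item())

    for m in model.modules():
        if isinstance(m, nn.ReLU):
            hooks.append(m.register_forward_hook(hook))

    model.eval()
    with torch.no_grad():
        for x, _ in probe_loader:
            per_layer.clear()
            model(x)
            fractions.append(float(np.mean(per_layer)))

    for h in hooks:
        h.remove()
    return float(np.mean(fractions)), float(np.std(fractions))


def fit_gap_model(quantities: np.ndarray, gaps: np.ndarray) -> np.ndarray:
    """Least-squares fit of gap ~ w0 + w1*q1 + w2*q2 over many trained nets."""
    X = np.hstack([np.ones((len(quantities), 1)), quantities])
    coef, *_ = np.linalg.lstsq(X, gaps, rcond=None)
    return coef


def predict_gap(coef: np.ndarray, q1: float, q2: float) -> float:
    """Apply the fitted linear model to a new network's two quantities."""
    return float(coef[0] + coef[1] * q1 + coef[2] * q2)
```

In practice, one would compute the two quantities for a pool of trained DNNs with measured generalization gaps, fit the linear model once, and then apply predict_gap to a new network using only its training results.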
Related papers
- On Generalization Bounds for Deep Compound Gaussian Neural Networks [1.4425878137951238]
Unrolled deep neural networks (DNNs) provide better interpretability and superior empirical performance compared with standard DNNs.
We develop novel generalization error bounds for a class of unrolled DNNs informed by a compound Gaussian prior.
Under realistic conditions, we show that, at worst, the generalization error scales as $\mathcal{O}(n\sqrt{n})$ in the signal dimension and as $\mathcal{O}((\text{Network Size})^{3/2})$ in the network size.
arXiv Detail & Related papers (2024-02-20T16:01:39Z) - Learning Expressive Priors for Generalization and Uncertainty Estimation
in Neural Networks [77.89179552509887]
We propose a novel prior learning method for advancing generalization and uncertainty estimation in deep neural networks.
The key idea is to exploit scalable and structured posteriors of neural networks as informative priors with generalization guarantees.
We exhaustively show the effectiveness of this method for uncertainty estimation and generalization.
arXiv Detail & Related papers (2023-07-15T09:24:33Z) - Towards Understanding the Generalization of Graph Neural Networks [9.217947432437546]
Graph neural networks (GNNs) are the most widely adopted models for learning and representation on graph-structured data.
We first establish high-probability bounds on the generalization gap and gradients in transductive learning.
The theoretical results reveal the architecture specific factors affecting the generalization gap.
arXiv Detail & Related papers (2023-05-14T03:05:14Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - Graph Neural Networks are Inherently Good Generalizers: Insights by
Bridging GNNs and MLPs [71.93227401463199]
This paper pinpoints the major source of GNNs' performance gain as their intrinsic capability, by introducing an intermediate model class dubbed P(ropagational)MLP (a minimal sketch of this idea follows the list below).
We observe that PMLPs consistently perform on par with (or even exceed) their GNN counterparts, while being much more efficient in training.
arXiv Detail & Related papers (2022-12-18T08:17:32Z) - Towards Better Out-of-Distribution Generalization of Neural Algorithmic
Reasoning Tasks [51.8723187709964]
We study the OOD generalization of neural algorithmic reasoning tasks.
The goal is to learn an algorithm from input-output pairs using deep neural networks.
arXiv Detail & Related papers (2022-11-01T18:33:20Z) - Why Robust Generalization in Deep Learning is Difficult: Perspective of
Expressive Power [15.210336733607488]
We show that for binary classification problems with well-separated data, there exists a constant robust generalization gap unless the size of the neural network is exponential.
We establish an improved upper bound of $\exp(\mathcal{O}(k))$ for the network size to achieve low robust generalization error.
arXiv Detail & Related papers (2022-05-27T09:53:04Z) - Confidence Dimension for Deep Learning based on Hoeffding Inequality and
Relative Evaluation [44.393256948610016]
We propose to use multiple factors to measure and rank the relative generalization of deep neural networks (DNNs) based on a new concept of confidence dimension (CD).
Our CD yields a consistent and reliable measure and ranking for both full-precision DNNs and binary neural networks (BNNs) on all the tasks.
arXiv Detail & Related papers (2022-03-17T04:43:43Z) - A Theoretical-Empirical Approach to Estimating Sample Complexity of DNNs [11.152761263415046]
This paper focuses on understanding how the generalization error scales with the amount of training data for deep neural networks (DNNs).
We derive estimates of the generalization error that hold for deep networks and do not rely on unattainable capacity measures.
arXiv Detail & Related papers (2021-05-05T05:14:08Z) - A Survey on Assessing the Generalization Envelope of Deep Neural
Networks: Predictive Uncertainty, Out-of-distribution and Adversarial Samples [77.99182201815763]
Deep Neural Networks (DNNs) achieve state-of-the-art performance on numerous applications.
It is difficult to tell beforehand whether a DNN receiving an input will deliver the correct output, since its decision criteria are usually nontransparent.
This survey connects the three fields within the larger framework of investigating the generalization performance of machine learning methods and in particular DNNs.
arXiv Detail & Related papers (2020-08-21T09:12:52Z) - Fast Learning of Graph Neural Networks with Guaranteed Generalizability:
One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
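The P(ropagational)MLP model class referenced above also admits a compact illustration. Below is a minimal sketch under assumed design choices, not the paper's reference code: the network is trained as a plain MLP on node features, and a normalized-adjacency averaging step (one form of message passing) is enabled only at inference time; the layer sizes and propagation scheme are illustrative.

```python
# Minimal PMLP-style sketch (illustrative assumptions, not the paper's code):
# train as a plain MLP on node features; enable message passing only at test time.
from typing import Optional

import torch
import torch.nn as nn


class PMLP(nn.Module):
    def __init__(self, in_dim: int, hidden: int, out_dim: int):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, out_dim)

    def propagate(self, h: torch.Tensor, adj_norm: torch.Tensor) -> torch.Tensor:
        # Assumed propagation step: averaging over a normalized adjacency matrix.
        return adj_norm @ h

    def forward(
        self,
        x: torch.Tensor,
        adj_norm: Optional[torch.Tensor] = None,
        use_mp: bool = False,
    ) -> torch.Tensor:
        h = torch.relu(self.fc1(x))
        if use_mp and adj_norm is not None:  # message passing only at inference
            h = self.propagate(h, adj_norm)
        return self.fc2(h)


# Training: model(x) with use_mp=False behaves as a pure MLP.
# Inference: model(x, adj_norm, use_mp=True) recovers GNN-style predictions.
```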