Lower Bounds on the Generalization Error of Nonlinear Learning Models
- URL: http://arxiv.org/abs/2103.14723v1
- Date: Fri, 26 Mar 2021 20:37:54 GMT
- Title: Lower Bounds on the Generalization Error of Nonlinear Learning Models
- Authors: Inbar Seroussi, Ofer Zeitouni
- Abstract summary: We study in this paper lower bounds for the generalization error of models derived from multi-layer neural networks, in the regime where the size of the layers is commensurate with the number of samples in the training data.
We show that unbiased estimators have unacceptable performance for such nonlinear networks in this regime.
We derive explicit generalization lower bounds for general biased estimators, in the cases of linear regression and of two-layered networks.
- Score: 2.1030878979833467
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study in this paper lower bounds for the generalization error of models
derived from multi-layer neural networks, in the regime where the size of the
layers is commensurate with the number of samples in the training data. We show
that unbiased estimators have unacceptable performance for such nonlinear
networks in this regime. We derive explicit generalization lower bounds for
general biased estimators, in the cases of linear regression and of two-layered
networks. In the linear case the bound is asymptotically tight. In the
nonlinear case, we provide a comparison of our bounds with an empirical study
of the stochastic gradient descent algorithm. The analysis uses elements from
the theory of large random matrices.
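For orientation, a schematic of the regime and of the type of bound involved; the proportionality constant and the scalar Cramér-Rao template below are generic illustrations, not the paper's exact statements:
\[
  d, n \to \infty, \qquad \frac{d}{n} \to \phi \in (0, \infty),
\]
(layer width $d$ commensurate with sample size $n$), and, for a scalar parameter $\theta$ and an estimator $\hat{\theta}$ with bias $b(\theta) = \mathbb{E}[\hat{\theta}] - \theta$,
\[
  \mathbb{E}\left[(\hat{\theta} - \theta)^2\right] \;\ge\; \frac{\left(1 + b'(\theta)\right)^2}{I(\theta)} + b(\theta)^2,
\]
where $I(\theta)$ is the Fisher information. For unbiased estimators ($b \equiv 0$) the information term alone governs the bound, which is one way such bounds can force unbiased estimators to perform poorly.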
Related papers
- Generalization for Least Squares Regression With Simple Spiked Covariances [3.9134031118910264]
The generalization properties of even two-layer neural networks trained by gradient descent remain poorly understood.
Recent work has made progress by describing the spectrum of the feature matrix at the hidden layer.
Yet, the generalization error for linear models with spiked covariances has not been previously determined.
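For reference, the rank-one "spiked" covariance structure referred to in the title, in generic notation:
\[
  \Sigma = \sigma^2 I_d + \theta\, v v^{\top}, \qquad \|v\| = 1, \quad \theta > 0,
\]
so that inputs $x \sim \mathcal{N}(0, \Sigma)$ have a single high-variance direction $v$ protruding from an isotropic bulk.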
arXiv Detail & Related papers (2024-10-17T19:46:51Z)
- Classification of Data Generated by Gaussian Mixture Models Using Deep ReLU Networks [28.437011792990347]
This paper studies the binary classification of data from $\mathbb{R}^d$ generated under Gaussian Mixture Models.
We obtain, for the first time, convergence rates for deep ReLU networks in this classification setting.
Results provide a theoretical verification of deep neural networks in practical classification problems.
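As a minimal illustration of the data model only (the means, mixture weights, and the linear classifier below are generic placeholders; the paper's analysis concerns deep ReLU networks):

    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 10, 1000
    mu0, mu1 = -np.ones(d), np.ones(d)          # class-conditional means

    # Labels are +/-1 with equal probability; each feature vector is
    # Gaussian around its class mean (a two-component Gaussian mixture).
    y = rng.integers(0, 2, size=n) * 2 - 1
    X = np.where(y[:, None] == 1, mu1, mu0) + rng.standard_normal((n, d))

    # With equal weights and shared identity covariance, the Bayes rule
    # is the linear threshold sign(<x, mu1 - mu0>).
    y_hat = np.sign(X @ (mu1 - mu0))
    print("accuracy:", (y_hat == y).mean())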
arXiv Detail & Related papers (2023-08-15T20:40:42Z)
- Learning Linear Causal Representations from Interventions under General Nonlinear Mixing [52.66151568785088]
We prove strong identifiability results given unknown single-node interventions without access to the intervention targets.
This is the first instance of causal identifiability from non-paired interventions for deep neural network embeddings.
arXiv Detail & Related papers (2023-06-04T02:32:12Z)
- Fast Convergence in Learning Two-Layer Neural Networks with Separable Data [37.908159361149835]
We study normalized gradient descent on two-layer neural nets.
We prove, for exponentially-tailed losses, that normalized GD yields a linear rate of convergence of the training loss to the global optimum.
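A minimal sketch of the normalized gradient descent update on a toy two-layer ReLU network with exponential loss (the architecture, initialization, and step size here are illustrative choices, not the paper's exact setup):

    import numpy as np

    rng = np.random.default_rng(1)
    n, d, m = 200, 5, 16
    X = rng.standard_normal((n, d))
    y = np.sign(X[:, 0])                           # separable toy labels

    W = rng.standard_normal((m, d)) / np.sqrt(d)   # first-layer weights
    a = rng.standard_normal(m) / np.sqrt(m)        # second-layer weights

    def loss_and_grads(W, a):
        h = np.maximum(X @ W.T, 0.0)               # ReLU features, (n, m)
        margins = y * (h @ a)
        ell = np.exp(-margins)                     # exponential loss
        g_out = -y * ell                           # d(loss_i)/d(output_i)
        ga = h.T @ g_out / n                       # gradient w.r.t. a
        gh = np.outer(g_out, a) * (h > 0)          # backprop through ReLU
        gW = gh.T @ X / n                          # gradient w.r.t. W
        return ell.mean(), gW, ga

    eta = 0.5
    for t in range(500):
        L, gW, ga = loss_and_grads(W, a)
        norm = np.sqrt((gW ** 2).sum() + (ga ** 2).sum())
        W -= eta * gW / norm                       # normalized GD step:
        a -= eta * ga / norm                       # w <- w - eta * g / ||g||
    print("final training loss:", L)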
arXiv Detail & Related papers (2023-05-22T20:30:10Z)
- Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z)
- Adaptive deep learning for nonlinear time series models [0.0]
We develop a theory for adaptive nonparametric estimation of the mean function of a non-stationary and nonlinear time series model using deep neural networks (DNNs).
We derive minimax lower bounds for estimating mean functions belonging to a wide class of nonlinear autoregressive (AR) models.
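For concreteness, the generic nonlinear autoregressive model this class of results is stated over, in standard textbook notation (the paper's precise assumptions on $m$ and the noise may differ):
\[
  X_t = m\left(X_{t-1}, \ldots, X_{t-p}\right) + \varepsilon_t,
  \qquad \mathbb{E}\left[\varepsilon_t \mid \mathcal{F}_{t-1}\right] = 0,
\]
where the unknown mean function $m$ is estimated by a DNN, and the minimax lower bounds quantify the best rate any estimator can achieve uniformly over a class of such $m$.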
arXiv Detail & Related papers (2022-07-06T09:58:58Z)
- The Interplay Between Implicit Bias and Benign Overfitting in Two-Layer Linear Networks [51.1848572349154]
Neural network models that perfectly fit noisy data can generalize well to unseen test data.
We consider interpolating two-layer linear neural networks trained with gradient flow on the squared loss and derive bounds on the excess risk.
arXiv Detail & Related papers (2021-08-25T22:01:01Z)
- Hessian Eigenspectra of More Realistic Nonlinear Models [73.31363313577941]
We give a precise characterization of the Hessian eigenspectra for a broad family of nonlinear models.
Our analysis takes a step forward to identify the origin of many striking features observed in more complex machine learning models.
arXiv Detail & Related papers (2021-03-02T06:59:52Z)
- Dimension Free Generalization Bounds for Non Linear Metric Learning [61.193693608166114]
We provide uniform generalization bounds for two regimes: the sparse regime and a non-sparse regime.
We show that by relying on a different, new property of the solutions, it is still possible to provide dimension free generalization guarantees.
arXiv Detail & Related papers (2021-02-07T14:47:00Z)
- Learning Fast Approximations of Sparse Nonlinear Regression [50.00693981886832]
In this work, we bridge the gap by introducing the Nonlinear Learned Iterative Shrinkage Thresholding Algorithm (NLISTA).
Experiments on synthetic data corroborate our theoretical results and show our method outperforms state-of-the-art methods.
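For background, a sketch of the plain soft-thresholding iteration (ISTA) that learned variants such as LISTA/NLISTA build on; in the learned versions the matrices and per-iteration thresholds are trained from data, and NLISTA additionally handles a nonlinear measurement model, neither of which is shown here:

    import numpy as np

    def soft_threshold(z, theta):
        # Proximal operator of theta * ||.||_1
        return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

    def ista(A, y, theta=0.1, n_iter=200):
        # Solves min_x 0.5*||Ax - y||^2 + theta*||x||_1 by iterating
        # x <- soft_threshold(x - A^T (Ax - y) / L, theta / L).
        L = np.linalg.norm(A, 2) ** 2       # Lipschitz constant of the gradient
        x = np.zeros(A.shape[1])
        for _ in range(n_iter):
            x = soft_threshold(x - A.T @ (A @ x - y) / L, theta / L)
        return x

    rng = np.random.default_rng(0)
    A = rng.standard_normal((50, 100)) / np.sqrt(50)
    x_true = np.zeros(100)
    x_true[:5] = 1.0                        # sparse ground truth
    x_hat = ista(A, y=A @ x_true)
    print(np.round(x_hat[:8], 2))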
arXiv Detail & Related papers (2020-10-26T11:31:08Z)
- Generalization Error of Generalized Linear Models in High Dimensions [25.635225717360466]
We provide a framework to characterize the generalization error of generalized linear models with arbitrary non-linearities.
We analyze the effect of regularization in logistic regression on learning.
Our framework also captures mismatch between training and test distributions as a special case.
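In generic notation (not necessarily the paper's), a generalized linear model observes
\[
  y_i = q\left(\langle x_i, w^\ast \rangle, \varepsilon_i\right), \qquad i = 1, \ldots, n,
\]
i.e. a single-layer network with an arbitrary output nonlinearity $q$ (logistic, probit, ReLU with noise, etc.); the high-dimensional analysis tracks the generalization error as $n$ and the dimension grow proportionally.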
arXiv Detail & Related papers (2020-05-01T02:17:47Z)