Statistical Guarantees for Approximate Stationary Points of Simple
Neural Networks
- URL: http://arxiv.org/abs/2205.04491v1
- Date: Mon, 9 May 2022 18:09:04 GMT
- Title: Statistical Guarantees for Approximate Stationary Points of Simple
Neural Networks
- Authors: Mahsa Taheri, Fang Xie, Johannes Lederer
- Abstract summary: We develop statistical guarantees for simple neural networks that coincide up to logarithmic factors with the global optima.
We make a step forward in describing the practical properties of neural networks in mathematical terms.
- Score: 4.254099382808598
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Since statistical guarantees for neural networks are usually restricted to
global optima of intricate objective functions, it is not clear whether these
theories really explain the performances of actual outputs of neural-network
pipelines. The goal of this paper is, therefore, to bring statistical theory
closer to practice. We develop statistical guarantees for simple neural
networks that coincide up to logarithmic factors with the global optima but
apply to stationary points and the points nearby. These results support the
common notion that neural networks do not necessarily need to be optimized
globally from a mathematical perspective. More generally, despite being limited
to simple neural networks for now, our theories make a step forward in
describing the practical properties of neural networks in mathematical terms.
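To make the object of these guarantees concrete, here is a minimal, self-contained sketch of reaching an approximate stationary point of a simple one-hidden-layer ReLU network. The data, widths, step size, and tolerance are illustrative assumptions and not taken from the paper; the returned point is only required to have a small gradient, not to be a global optimum, which is exactly the kind of output the guarantees above are meant to cover.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy data and a one-hidden-layer ReLU network: f(x) = w1 . relu(W0 x)
    n, d, width = 200, 5, 10
    X = rng.normal(size=(n, d))
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

    W0 = rng.normal(size=(width, d)) / np.sqrt(d)
    w1 = rng.normal(size=width) / np.sqrt(width)

    def loss_and_grads(W0, w1):
        H = np.maximum(X @ W0.T, 0.0)                    # hidden activations, shape (n, width)
        r = H @ w1 - y                                   # residuals
        loss = 0.5 * np.mean(r ** 2)
        g_w1 = H.T @ r / n
        g_W0 = ((r[:, None] * w1) * (H > 0)).T @ X / n   # chain rule through the ReLU
        return loss, g_W0, g_w1

    # Plain gradient descent, stopped at an eps-approximate stationary point,
    # i.e. a parameter vector whose full gradient has norm at most eps.
    eps, lr = 1e-3, 0.1
    for step in range(20000):
        loss, g_W0, g_w1 = loss_and_grads(W0, w1)
        grad_norm = np.sqrt((g_W0 ** 2).sum() + (g_w1 ** 2).sum())
        if grad_norm < eps:
            break
        W0 -= lr * g_W0
        w1 -= lr * g_w1

    print(f"loss = {loss:.4f}, gradient norm = {grad_norm:.2e}, steps = {step}")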
Related papers
- Interpreting Neural Networks through Mahalanobis Distance [0.0]
This paper introduces a theoretical framework that connects neural network linear layers with the Mahalanobis distance.
Although this work is theoretical and does not include empirical data, the proposed distance-based interpretation has the potential to enhance model robustness, improve generalization, and provide more intuitive explanations of neural network decisions.
arXiv Detail & Related papers (2024-10-25T07:21:44Z)
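For reference alongside the Mahalanobis-distance entry above, a toy computation of the distance of a point from the empirical distribution of a layer's outputs might look as follows; the synthetic activations and dimensions are placeholders, not the paper's construction.

    import numpy as np

    rng = np.random.default_rng(0)
    Z = rng.normal(size=(500, 8))       # stand-in for the outputs of one linear layer
    mu = Z.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(Z, rowvar=False))

    def mahalanobis(x):
        # distance of x from the empirical distribution of the layer outputs
        diff = x - mu
        return float(np.sqrt(diff @ cov_inv @ diff))

    print(mahalanobis(Z[0]))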
- Jaynes Machine: The universal microstructure of deep neural networks [0.9086679566009702]
We predict that all highly connected layers of deep neural networks have a universal microstructure of connection strengths that is distributed lognormally ($LN(\mu, \sigma)$).
Under ideal conditions, the theory predicts that $\mu$ and $\sigma$ are the same for all layers in all networks.
arXiv Detail & Related papers (2023-10-10T19:22:01Z)
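A quick way to probe the lognormal claim in the Jaynes Machine entry is to fit $LN(\mu, \sigma)$ to a layer's connection strengths. In the sketch below the weights array is synthetic stand-in data rather than weights taken from a trained network.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    # Stand-in data: in practice this would be the absolute connection strengths
    # of one trained dense layer, flattened into a vector.
    weights = rng.lognormal(mean=-1.0, sigma=0.8, size=10000)

    shape, loc, scale = stats.lognorm.fit(weights, floc=0.0)
    mu_hat, sigma_hat = np.log(scale), shape            # parameters of LN(mu, sigma)
    print(f"mu approx {mu_hat:.3f}, sigma approx {sigma_hat:.3f}")

    # Goodness-of-fit check against the fitted lognormal
    ks = stats.kstest(weights, "lognorm", args=(shape, loc, scale))
    print(ks.statistic, ks.pvalue)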
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Correlation between entropy and generalizability in a neural network [9.223853439465582]
We use the Wang-Landau Monte Carlo algorithm to calculate the entropy at a given test accuracy.
Our results show that entropic forces aid generalizability (a toy Wang-Landau sketch follows the arXiv link below).
arXiv Detail & Related papers (2022-07-05T12:28:13Z)
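Wang-Landau Monte Carlo estimates the density of states $g(E)$, whose logarithm is the entropy referred to in the entry above. The toy sketch below runs it on binary strings whose "energy" is simply the number of ones, so the answer is a known binomial coefficient and can be checked; the paper's actual setting, where the role of $E$ is played by a test-accuracy level, is not reproduced here.

    import numpy as np
    from math import lgamma

    rng = np.random.default_rng(0)
    N = 20                                  # binary degrees of freedom
    state = rng.integers(0, 2, N)
    E = int(state.sum())                    # toy "energy": number of ones
    log_g = np.zeros(N + 1)                 # running estimate of ln g(E)
    hist = np.zeros(N + 1)
    f = 1.0                                 # ln of the modification factor

    while f > 1e-4:
        for _ in range(10000):
            i = rng.integers(N)
            E_new = E + (1 - 2 * state[i])  # a single flip changes E by +/- 1
            # Wang-Landau rule: accept with probability min(1, g(E) / g(E_new))
            if np.log(rng.random()) < log_g[E] - log_g[E_new]:
                state[i] ^= 1
                E = E_new
            log_g[E] += f
            hist[E] += 1
        if hist.min() > 0.8 * hist.mean():  # flat-histogram criterion
            f /= 2.0
            hist[:] = 0

    exact = np.array([lgamma(N + 1) - lgamma(k + 1) - lgamma(N - k + 1) for k in range(N + 1)])
    log_g -= log_g[0] - exact[0]            # remove the arbitrary additive constant
    print(np.round(log_g - exact, 2))       # should be close to zero everywhere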
- Consistency of Neural Networks with Regularization [0.0]
This paper proposes a general framework for neural networks with regularization and proves its consistency.
Two types of activation functions are considered: the hyperbolic tangent (Tanh) and the rectified linear unit (ReLU).
arXiv Detail & Related papers (2022-06-22T23:33:39Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by examining the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- What can linearized neural networks actually say about generalization? [67.83999394554621]
In certain infinitely-wide neural networks, the neural tangent kernel (NTK) theory fully characterizes generalization.
We show that the linear approximations can indeed rank the learning complexity of certain tasks for neural networks.
Our work provides concrete examples of novel deep learning phenomena which can inspire future theoretical research.
arXiv Detail & Related papers (2021-06-12T13:05:11Z)
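The linearization discussed in that entry is the first-order Taylor expansion of the network output in its parameters. The sketch below compares a tiny tanh network with this expansion around its initialization; the architecture, sizes, and the finite-difference Jacobian are illustrative choices, not the paper's setup.

    import numpy as np

    rng = np.random.default_rng(0)
    d, width = 3, 50
    W0 = rng.normal(size=(width, d)) / np.sqrt(d)
    w1 = rng.normal(size=width) / np.sqrt(width)

    def f(params, x):
        W, v = params
        return float(v @ np.tanh(W @ x))

    def flatten(params):
        W, v = params
        return np.concatenate([W.ravel(), v])

    def unflatten(theta):
        return theta[: width * d].reshape(width, d), theta[width * d:]

    def jac(theta, x, h=1e-5):
        # numerical gradient of f with respect to the flattened parameters
        g = np.zeros_like(theta)
        for i in range(theta.size):
            tp, tm = theta.copy(), theta.copy()
            tp[i] += h
            tm[i] -= h
            g[i] = (f(unflatten(tp), x) - f(unflatten(tm), x)) / (2 * h)
        return g

    theta0 = flatten((W0, w1))
    x = rng.normal(size=d)
    theta = theta0 + 0.01 * rng.normal(size=theta0.size)   # small parameter move

    exact = f(unflatten(theta), x)
    linear = f(unflatten(theta0), x) + jac(theta0, x) @ (theta - theta0)
    print(f"exact = {exact:.5f}, linearized = {linear:.5f}")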
- Provably Training Neural Network Classifiers under Fairness Constraints [70.64045590577318]
We show that overparametrized neural networks could meet the constraints.
A key ingredient in building a fair neural network classifier is establishing a no-regret analysis for neural networks.
arXiv Detail & Related papers (2020-12-30T18:46:50Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics [50.83356836818667]
We introduce a new theoretical framework to analyze deep learning optimization with connection to its generalization error.
Existing frameworks such as mean field theory and neural tangent kernel theory for neural network optimization analysis typically require taking the infinite-width limit of the network to show its global convergence.
arXiv Detail & Related papers (2020-07-11T18:19:50Z)
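At the level of a single finite-dimensional update, Langevin dynamics is a gradient step plus suitably scaled Gaussian noise. The sketch below uses a toy quadratic loss with an arbitrary step size and inverse temperature, purely to illustrate the update rule whose infinite-dimensional analogue the entry analyzes.

    import numpy as np

    rng = np.random.default_rng(0)

    def grad(theta):                 # gradient of the toy loss L(theta) = 0.5 * ||theta||^2
        return theta

    eta, beta, dim = 1e-2, 50.0, 10  # step size, inverse temperature, dimension
    theta = rng.normal(size=dim)
    samples = []

    for t in range(20000):
        noise = rng.normal(size=dim)
        theta = theta - eta * grad(theta) + np.sqrt(2 * eta / beta) * noise
        if t > 5000:
            samples.append(theta.copy())

    # For this loss the stationary law is N(0, 1/beta), so the spread should be near beta**-0.5
    print(np.std(samples), beta ** -0.5)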
- Statistical Guarantees for Regularized Neural Networks [4.254099382808598]
We develop a general statistical guarantee for estimators that consist of a least-squares term and a regularizer.
Our results establish a mathematical basis for regularized estimation of neural networks.
arXiv Detail & Related papers (2020-05-30T15:28:47Z)
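The least-squares-plus-regularizer estimators in that entry can be written down directly; in the sketch below the network, data, l1-type penalty, and tuning parameter lam are illustrative placeholders rather than the paper's specific choices.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, width = 100, 4, 8
    X = rng.normal(size=(n, d))
    y = rng.normal(size=n)
    lam = 0.1                                            # regularization level

    def objective(W0, w1):
        pred = np.maximum(X @ W0.T, 0.0) @ w1            # one-hidden-layer ReLU network
        least_squares = np.sum((y - pred) ** 2)
        regularizer = np.abs(W0).sum() + np.abs(w1).sum()
        return least_squares + lam * regularizer

    print(objective(rng.normal(size=(width, d)), rng.normal(size=width)))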
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.