Error analysis for deep neural network approximations of parametric
hyperbolic conservation laws
- URL: http://arxiv.org/abs/2207.07362v1
- Date: Fri, 15 Jul 2022 09:21:09 GMT
- Title: Error analysis for deep neural network approximations of parametric
hyperbolic conservation laws
- Authors: Tim De Ryck, Siddhartha Mishra
- Abstract summary: We show that the approximation error can be made as small as desired with ReLU neural networks.
We provide an explicit upper bound on the generalization error in terms of the training error, number of training samples and the neural network size.
- Score: 7.6146285961466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We derive rigorous bounds on the error resulting from the approximation of
the solution of parametric hyperbolic scalar conservation laws with ReLU neural
networks. We show that the approximation error can be made as small as desired
with ReLU neural networks that overcome the curse of dimensionality. In
addition, we provide an explicit upper bound on the generalization error in
terms of the training error, number of training samples and the neural network
size. The theoretical results are illustrated by numerical experiments.
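To make the structure of the main result concrete, the following is a schematic sketch (not the paper's exact theorem; the precise norms, constants and exponents are those derived in the paper) of an a posteriori bound of the kind described above, in which the generalization error is controlled by the computable training error plus a term that shrinks with the number of training samples:

  \[
    \mathcal{E}_{\mathrm{G}} \;\lesssim\; \mathcal{E}_{\mathrm{T}} \;+\; \frac{C(\text{network size})}{\sqrt{N}},
  \]

where \(\mathcal{E}_{\mathrm{G}}\) denotes the generalization error, \(\mathcal{E}_{\mathrm{T}}\) the training error, \(N\) the number of training samples, and \(C\) a constant depending on the size (depth, width and weights) of the ReLU network.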
Related papers
- Error Estimation for Physics-informed Neural Networks Approximating Semilinear Wave Equations [15.834703258232006]
This paper provides rigorous error bounds for physics-informed neural networks approximating the semilinear wave equation.
We provide bounds for the generalization and training error in terms of the width of the network's layers and the number of training points for a tanh neural network with two hidden layers.
arXiv Detail & Related papers (2024-02-11T10:50:20Z)
- Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks [89.28881869440433]
This paper provides the first theoretical characterization of joint edge-model sparse learning for graph neural networks (GNNs).
It proves analytically that sampling important nodes and pruning neurons with the lowest magnitudes can both reduce the sample complexity and improve convergence without compromising test accuracy.
arXiv Detail & Related papers (2023-02-06T16:54:20Z)
- Learning Distributions by Generative Adversarial Networks: Approximation and Generalization [0.6768558752130311]
We study how well generative adversarial networks learn from finite samples by analyzing the convergence rates of these models.
Our analysis is based on a new oracle inequality that decomposes the estimation error of GANs into the discriminator and generator approximation errors.
For the generator approximation error, we show that neural networks can approximately transform a low-dimensional source distribution into a high-dimensional target distribution.
arXiv Detail & Related papers (2022-05-25T09:26:17Z)
- Error estimates for physics informed neural networks approximating the Navier-Stokes equations [6.445605125467574]
We show that the underlying PDE residual can be made arbitrarily small for tanh neural networks with two hidden layers.
The total error can be estimated in terms of the training error, network size and number of quadrature points.
arXiv Detail & Related papers (2022-03-17T14:26:17Z)
- Generalization Error Bounds for Iterative Recovery Algorithms Unfolded as Neural Networks [6.173968909465726]
We introduce a general class of neural networks suitable for sparse reconstruction from few linear measurements.
By allowing a wide range of degrees of weight-sharing between the layers, we enable a unified analysis for very different neural network types.
arXiv Detail & Related papers (2021-12-08T16:17:33Z)
- Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
arXiv Detail & Related papers (2021-11-03T15:14:20Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We study the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- On the approximation of functions by tanh neural networks [0.0]
We derive bounds on the error, in high-order Sobolev norms, incurred in the approximation of Sobolev-regular functions.
We show that tanh neural networks with only two hidden layers suffice to approximate functions at comparable or better rates than much deeper ReLU neural networks.
arXiv Detail & Related papers (2021-04-18T19:30:45Z)
- A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in their depth, at a time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Neural Control Variates [71.42768823631918]
We show that a set of neural networks can address the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)