Approximation Error and Complexity Bounds for ReLU Networks on Low-Regular Function Spaces
- URL: http://arxiv.org/abs/2405.06727v1
- Date: Fri, 10 May 2024 14:31:58 GMT
- Title: Approximation Error and Complexity Bounds for ReLU Networks on Low-Regular Function Spaces
- Authors: Owen Davis, Gianluca Geraci, Mohammad Motamed
- Abstract summary: We consider the approximation of a large class of bounded functions, with minimal regularity assumptions, by ReLU neural networks.
We show that the approximation error can be bounded from above by a quantity proportional to the uniform norm of the target function and inversely proportional to the product of network width and depth.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we consider the approximation of a large class of bounded functions, with minimal regularity assumptions, by ReLU neural networks. We show that the approximation error can be bounded from above by a quantity proportional to the uniform norm of the target function and inversely proportional to the product of network width and depth. We inherit this approximation error bound from Fourier features residual networks, a type of neural network that uses complex exponential activation functions. Our proof is constructive and proceeds by conducting a careful complexity analysis associated with the approximation of a Fourier features residual network by a ReLU network.
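Read schematically, the bound described in the abstract has the following form; the choice of error norm and the hidden constant are paraphrased from the summary above rather than quoted from the paper, so this is a sketch of the statement, not its exact form.

```latex
% Schematic statement: for a bounded target f and the class of ReLU networks
% of width W and depth L, the best approximation error is proportional to the
% uniform norm of f and inversely proportional to the width-depth product.
\[
  \inf_{\Phi \in \mathcal{N}^{\mathrm{ReLU}}_{W,L}} \| f - \Phi \|
  \;\lesssim\;
  \frac{\| f \|_{L^\infty}}{W \, L}.
\]
```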
Related papers
- ReLU neural network approximation to piecewise constant functions [3.5928501649873326]
We show that a three-layer ReLU NN is sufficient to accurately approximate any piecewise constant function.
If the discontinuity interface is convex, an analytical formula of the ReLU NN approximation with exact weights and biases is provided.
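A one-dimensional piecewise constant function is a sum of scaled indicators, and each indicator can be squeezed by a pair of ReLU units into a ramp of arbitrarily small width. The numpy sketch below illustrates that standard trick; it is not the analytical construction with exact weights and biases for convex interfaces described in the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def step_approx(x, a=0.0, eps=1e-2):
    """Approximate the indicator 1[x > a] with two ReLU units:
    a ramp of width eps, (relu(x - a) - relu(x - a - eps)) / eps."""
    return (relu(x - a) - relu(x - a - eps)) / eps

x = np.linspace(-1.0, 1.0, 5)
print(step_approx(x))  # ~0 left of 0, ~1 right of 0, up to the eps-wide ramp
```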
arXiv Detail & Related papers (2024-10-21T20:58:34Z) - Deep Learning without Global Optimization by Random Fourier Neural Networks [0.0]
We introduce a new training algorithm for a variety of deep neural networks that utilize random complex exponential activation functions.
Our approach employs a Markov Chain Monte Carlo sampling procedure to iteratively train network layers.
It consistently attains the theoretical approximation rate for residual networks with complex exponential activation functions.
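For context, a single layer of random complex exponential (Fourier) features with a least-squares fit of the outer weights already captures the flavor of such networks. The sketch below is only that: the paper's Markov Chain Monte Carlo, layer-by-layer training procedure is not reproduced, and the frequency scale is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression target.
x = np.linspace(-np.pi, np.pi, 200)[:, None]
y = np.sign(np.sin(3 * x)).ravel()

# One random Fourier features layer: complex exponential activations with
# randomly drawn frequencies (the scale 3.0 is an arbitrary choice here).
n_features = 64
omega = rng.normal(scale=3.0, size=(1, n_features))
features = np.exp(1j * x @ omega)              # shape (200, 64), complex

# Fit only the outer linear coefficients by least squares.
coef, *_ = np.linalg.lstsq(features, y, rcond=None)
y_hat = (features @ coef).real

print("max abs error:", np.max(np.abs(y_hat - y)))
```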
arXiv Detail & Related papers (2024-07-16T16:23:40Z) - A Mean-Field Analysis of Neural Stochastic Gradient Descent-Ascent for Functional Minimax Optimization [90.87444114491116]
This paper studies minimax optimization problems defined over infinite-dimensional function classes of overparametricized two-layer neural networks.
We address (i) the convergence of the gradient descent-ascent algorithm and (ii) the representation learning of the neural networks.
Results show that the feature representation induced by the neural networks is allowed to deviate from the initial one by a magnitude of $O(\alpha^{-1})$, measured in terms of the Wasserstein distance.
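As a reminder of the update rule being analyzed, here is a minimal gradient descent-ascent loop on the bilinear toy objective f(x, y) = x * y; the paper's setting (two-layer networks over infinite-dimensional function classes in the mean-field regime) is far richer than this sketch.

```python
# Simultaneous gradient descent-ascent on f(x, y) = x * y:
# descend in x, ascend in y.
x, y = 1.0, 1.0
lr = 0.05
for _ in range(500):
    gx, gy = y, x                      # df/dx = y, df/dy = x
    x, y = x - lr * gx, y + lr * gy

# With a fixed step size, the iterates rotate around the saddle point (0, 0)
# and slowly spiral outward rather than converging.
print(x, y)
```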
arXiv Detail & Related papers (2024-04-18T16:46:08Z) - Generalization of Scaled Deep ResNets in the Mean-Field Regime [55.77054255101667]
We investigate scaled ResNet in the limit of infinitely deep and wide neural networks.
Our results offer new insights into the generalization ability of deep ResNet beyond the lazy training regime.
arXiv Detail & Related papers (2024-03-14T21:48:00Z) - Wide Deep Neural Networks with Gaussian Weights are Very Close to Gaussian Processes [1.0878040851638]
We show that the distance between the network output and the corresponding Gaussian approximation scales inversely with the width of the network, exhibiting faster convergence than naively suggested by the central limit theorem.
We also apply our bounds to obtain theoretical approximations for the exact posterior distribution of the network, when the likelihood is a bounded Lipschitz function of the network output evaluated on a (finite) training set.
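A quick Monte Carlo check conveys the width dependence: the output of a one-hidden-layer network with i.i.d. Gaussian weights, evaluated at a fixed input, looks increasingly Gaussian as the width grows. This is only an illustration of the phenomenon; the paper quantifies the distance and treats deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def network_output(width, n_samples=4000, d=4):
    """Samples of a 1/sqrt(width)-scaled one-hidden-layer ReLU network
    with i.i.d. Gaussian weights, evaluated at a fixed unit-norm input."""
    x = np.ones(d) / np.sqrt(d)
    W1 = rng.normal(size=(n_samples, width, d))   # hidden-layer weights
    w2 = rng.normal(size=(n_samples, width))      # output weights
    hidden = np.maximum(W1 @ x, 0.0)              # ReLU features
    return (w2 * hidden).sum(axis=1) / np.sqrt(width)

for width in (4, 32, 256):
    out = network_output(width)
    # Excess kurtosis is 0 for a Gaussian; it shrinks toward 0 as the width
    # grows (up to sampling noise).
    kurt = np.mean((out - out.mean()) ** 4) / out.var() ** 2 - 3.0
    print(width, round(kurt, 3))
```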
arXiv Detail & Related papers (2023-12-18T22:29:40Z) - Optimal Approximation Complexity of High-Dimensional Functions with Neural Networks [3.222802562733787]
We investigate properties of neural networks that use both ReLU and $x^2$ as activation functions.
We show how to leverage low local dimensionality in some contexts to overcome the curse of dimensionality, obtaining approximation rates that are optimal for unknown lower-dimensional subspaces.
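One standard reason the $x^2$ activation is powerful is that squaring units represent exact products through the polarization identity $xy = ((x+y)^2 - x^2 - y^2)/2$, whereas ReLU networks can only approximate multiplication. The snippet below checks that identity; it illustrates the general idea only, not the specific constructions in the paper.

```python
import numpy as np

def square_unit(z):
    # An x^2 activation unit.
    return z ** 2

def product_via_squares(x, y):
    # Polarization identity: x * y = ((x + y)^2 - x^2 - y^2) / 2.
    return 0.5 * (square_unit(x + y) - square_unit(x) - square_unit(y))

x, y = np.random.default_rng(1).normal(size=(2, 5))
print(np.allclose(product_via_squares(x, y), x * y))  # True
```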
arXiv Detail & Related papers (2023-01-30T17:29:19Z) - Towards Understanding Theoretical Advantages of Complex-Reaction Networks [77.34726150561087]
We show that a class of functions can be approximated by a complex-reaction network using a polynomial number of parameters.
For empirical risk minimization, our theoretical result shows that the critical point set of complex-reaction networks is a proper subset of that of real-valued networks.
arXiv Detail & Related papers (2021-08-15T10:13:49Z) - Sharp Lower Bounds on the Approximation Rate of Shallow Neural Networks [0.0]
We prove sharp lower bounds on the approximation rates for shallow neural networks.
These lower bounds apply to both sigmoidal activation functions with bounded variation and to activation functions which are a power of the ReLU.
arXiv Detail & Related papers (2021-06-28T22:01:42Z) - Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z) - A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks with widths quadratic in the sample size and linear in their depth, at a time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
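The key mechanism is easy to see in isolation: for a given input, a ReLU layer acts as a linear map masked by the pattern of active units, so a layer with that pattern frozen reproduces the ReLU layer's output on that input. The sketch below checks this for a single layer; it is only the basic identity, not the paper's surrogate construction.

```python
import numpy as np

rng = np.random.default_rng(0)

W, b = rng.normal(size=(8, 4)), rng.normal(size=8)
x = rng.normal(size=4)

pre = W @ x + b
relu_out = np.maximum(pre, 0.0)

mask = (pre > 0).astype(float)   # fixed activation pattern for this input
surrogate_out = mask * pre       # masked linear layer, no nonlinearity

print(np.allclose(relu_out, surrogate_out))  # True
```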
arXiv Detail & Related papers (2021-01-12T00:40:45Z) - Measuring Model Complexity of Neural Networks with Curve Activation Functions [100.98319505253797]
We propose the linear approximation neural network (LANN) to approximate a given deep model with curve activation function.
We experimentally explore the training process of neural networks and detect overfitting.
We find that the $L^1$ and $L^2$ regularizations suppress the increase of model complexity.
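The rough idea of replacing a curve activation by a piecewise linear one can be illustrated with a simple interpolant of tanh on a fixed grid of breakpoints; the grid choice here is arbitrary and this is not the authors' LANN construction or their complexity measure.

```python
import numpy as np

# Piecewise-linear stand-in for the tanh activation on a fixed grid.
breakpoints = np.linspace(-3.0, 3.0, 9)
values = np.tanh(breakpoints)

def piecewise_linear_tanh(x):
    # np.interp is linear between breakpoints and constant outside the grid.
    return np.interp(x, breakpoints, values)

x = np.linspace(-4.0, 4.0, 1000)
print("max deviation from tanh:",
      np.max(np.abs(piecewise_linear_tanh(x) - np.tanh(x))))
```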
arXiv Detail & Related papers (2020-06-16T07:38:06Z)