Neural Network Approximation of Continuous Functions in High Dimensions
with Applications to Inverse Problems
- URL: http://arxiv.org/abs/2208.13305v3
- Date: Tue, 10 Oct 2023 04:45:56 GMT
- Title: Neural Network Approximation of Continuous Functions in High Dimensions
with Applications to Inverse Problems
- Authors: Santhosh Karnik, Rongrong Wang, and Mark Iwen
- Abstract summary: Current theory predicts that networks should scale exponentially in the dimension of the problem.
We provide a general method for bounding the complexity required for a neural network to approximate a Hölder (or uniformly) continuous function.
- Score: 6.84380898679299
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The remarkable successes of neural networks in a huge variety of inverse
problems have fueled their adoption in disciplines ranging from medical imaging
to seismic analysis over the past decade. However, the high dimensionality of
such inverse problems has simultaneously left current theory, which predicts
that networks should scale exponentially in the dimension of the problem,
unable to explain why the seemingly small networks used in these settings work
as well as they do in practice. To reduce this gap between theory and practice,
we provide a general method for bounding the complexity required for a neural
network to approximate a H\"older (or uniformly) continuous function defined on
a high-dimensional set with a low-complexity structure. The approach is based
on the observation that the existence of a Johnson-Lindenstrauss embedding
$A\in\mathbb{R}^{d\times D}$ of a given high-dimensional set
$S\subset\mathbb{R}^D$ into a low dimensional cube $[-M,M]^d$ implies that for
any H\"older (or uniformly) continuous function $f:S\to\mathbb{R}^p$, there
exists a H\"older (or uniformly) continuous function
$g:[-M,M]^d\to\mathbb{R}^p$ such that $g(Ax)=f(x)$ for all $x\in S$. Hence, if
one has a neural network which approximates $g:[-M,M]^d\to\mathbb{R}^p$, then a
layer can be added that implements the JL embedding $A$ to obtain a neural
network that approximates $f:S\to\mathbb{R}^p$. By pairing JL embedding results
along with results on approximation of H\"older (or uniformly) continuous
functions by neural networks, one then obtains results which bound the
complexity required for a neural network to approximate H\"older (or uniformly)
continuous functions on high dimensional sets. The end result is a general
theoretical framework which can then be used to better explain the observed
empirical successes of smaller networks in a wider variety of inverse problems
than current theory allows.
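The construction described in the abstract can be illustrated numerically. The sketch below is not the authors' code: it assumes a Gaussian random matrix as the JL embedding $A$, uses points in a random $k$-dimensional subspace as a stand-in for the low-complexity set $S$, and uses an untrained two-layer ReLU network with random weights as a placeholder for a network approximating $g$. All function names and sizes are hypothetical. It shows two things: that $A$ roughly preserves pairwise distances on $S$, and that prepending $A$ as a linear layer (here folded into the first weight matrix) gives a network on $\mathbb{R}^D$ whose output equals $g(Ax)$.

```python
import numpy as np

def make_jl_matrix(d, D, rng):
    # Gaussian matrix scaled so that E||Ax||^2 = ||x||^2 (assumed choice of A).
    return rng.standard_normal((d, D)) / np.sqrt(d)

def sample_low_dim_points(n, D, k, rng):
    # Hypothetical S: n points in a random k-dimensional subspace of R^D.
    basis, _ = np.linalg.qr(rng.standard_normal((D, k)))  # D x k, orthonormal columns
    coeffs = rng.uniform(-1.0, 1.0, size=(n, k))
    return coeffs @ basis.T  # n x D

def pairwise_distortion(A, X):
    # Smallest and largest ratio ||A(x - y)|| / ||x - y|| over all pairs in X.
    i, j = np.triu_indices(len(X), k=1)
    diffs = X[i] - X[j]
    ratios = np.linalg.norm(diffs @ A.T, axis=1) / np.linalg.norm(diffs, axis=1)
    return ratios.min(), ratios.max()

def init_relu_net(d_in, hidden, p, rng):
    # Random two-layer ReLU network; a placeholder for a network approximating g.
    return {
        "W1": rng.standard_normal((hidden, d_in)),
        "b1": rng.standard_normal(hidden),
        "W2": rng.standard_normal((p, hidden)),
        "b2": rng.standard_normal(p),
    }

def relu_net(params, U):
    # Batched forward pass: rows of U are inputs.
    hidden = np.maximum(U @ params["W1"].T + params["b1"], 0.0)
    return hidden @ params["W2"].T + params["b2"]

def fold_embedding(params, A):
    # Prepend the linear layer x -> Ax by folding A into the first weight matrix.
    folded = dict(params)
    folded["W1"] = params["W1"] @ A
    return folded

rng = np.random.default_rng(0)
D, d, k, n = 1000, 128, 5, 40
A = make_jl_matrix(d, D, rng)
X = sample_low_dim_points(n, D, k, rng)
print("distance ratios (min, max):", pairwise_distortion(A, X))

g_net = init_relu_net(d, 32, 3, rng)
f_net = fold_embedding(g_net, A)  # network acting directly on R^D
print("outputs agree:", np.allclose(relu_net(g_net, X @ A.T), relu_net(f_net, X)))
```

The folded network has a first layer of size `hidden x D`, but the part that has to approximate a function only needs to do so on the $d$-dimensional cube, which is the point of the framework.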
Related papers
- New advances in universal approximation with neural networks of minimal width [4.424170214926035]
We show that autoencoders with leaky ReLU activations are universal approximators of $L^p$ functions.
We broaden our results to show that smooth invertible neural networks can approximate $L^p(\mathbb{R}^d,\mathbb{R}^d)$ on compacta.
arXiv Detail & Related papers (2024-11-13T16:17:16Z)
- Sample Complexity of Neural Policy Mirror Descent for Policy Optimization on Low-Dimensional Manifolds [75.51968172401394]
We study the sample complexity of the neural policy mirror descent (NPMD) algorithm with deep convolutional neural networks (CNNs).
In each iteration of NPMD, both the value function and the policy can be well approximated by CNNs.
We show that NPMD can leverage the low-dimensional structure of state space to escape from the curse of dimensionality.
arXiv Detail & Related papers (2023-09-25T07:31:22Z)
- Understanding Deep Neural Function Approximation in Reinforcement Learning via $\epsilon$-Greedy Exploration [53.90873926758026]
This paper provides a theoretical study of deep neural function approximation in reinforcement learning (RL).
We focus on the value-based algorithm with the $\epsilon$-greedy exploration via deep (and two-layer) neural networks endowed by Besov (and Barron) function spaces.
Our analysis reformulates the temporal difference error in an $L^2(\mathrm{d}\mu)$-integrable space over a certain averaged measure $\mu$, and transforms it to a generalization problem under the non-iid setting.
arXiv Detail & Related papers (2022-09-15T15:42:47Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z)
- Large-width functional asymptotics for deep Gaussian neural networks [2.7561479348365734]
We consider fully connected feed-forward deep neural networks where weights and biases are independent and identically distributed according to Gaussian distributions.
Our results contribute to recent theoretical studies on the interplay between infinitely wide deep neural networks and processes.
arXiv Detail & Related papers (2021-02-20T10:14:37Z)
- The universal approximation theorem for complex-valued neural networks [0.0]
We generalize the classical universal approximation theorem for neural networks to the case of complex-valued neural networks.
We consider feedforward networks with a complex activation function $\sigma : \mathbb{C} \to \mathbb{C}$ in which each neuron performs the operation $\mathbb{C}^N \to \mathbb{C}, z \mapsto \sigma(b + w^T z)$ with weights $w \in \mathbb{C}^N$ and a bias $b \in \mathbb{C}$.
arXiv Detail & Related papers (2020-12-06T18:51:10Z)
- On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z)
- On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics [8.160343645537106]
We develop Banach spaces for ReLU neural networks of finite depth $L$ and infinite width.
The spaces contain all finite fully connected $L$-layer networks and their $L^2$-limiting objects under bounds on the natural path-norm.
Under this norm, the unit ball in the space for $L$-layer networks has low Rademacher complexity and thus favorable properties.
arXiv Detail & Related papers (2020-07-30T17:47:05Z)
- Interval Universal Approximation for Neural Networks [47.767793120249095]
We introduce the interval universal approximation (IUA) theorem.
IUA shows that neural networks can approximate any continuous function $f$ as we have known for decades.
We study the computational complexity of constructing neural networks that are amenable to precise interval analysis.
arXiv Detail & Related papers (2020-07-12T20:43:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.