Deep neural networks with ReLU, leaky ReLU, and softplus activation
provably overcome the curse of dimensionality for Kolmogorov partial
differential equations with Lipschitz nonlinearities in the $L^p$-sense
- URL: http://arxiv.org/abs/2309.13722v1
- Date: Sun, 24 Sep 2023 18:58:18 GMT
- Title: Deep neural networks with ReLU, leaky ReLU, and softplus activation
provably overcome the curse of dimensionality for Kolmogorov partial
differential equations with Lipschitz nonlinearities in the $L^p$-sense
- Authors: Julia Ackermann, Arnulf Jentzen, Thomas Kruse, Benno Kuckuck, Joshua
Lee Padgett
- Abstract summary: We show that deep neural networks (DNNs) have the expressive power to approximate PDE solutions without the curse of dimensionality (COD). It is the key contribution of this work to generalize this result by establishing this statement in the $L^p$-sense with $p\in(0,\infty)$.
- Score: 3.3123773366516645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, several deep learning (DL) methods for approximating
high-dimensional partial differential equations (PDEs) have been proposed. The
interest that these methods have generated in the literature is in large part
due to simulations which appear to demonstrate that such DL methods have the
capacity to overcome the curse of dimensionality (COD) for PDEs in the sense
that the number of computational operations they require to achieve a certain
approximation accuracy $\varepsilon\in(0,\infty)$ grows at most polynomially in
the PDE dimension $d\in\mathbb N$ and the reciprocal of $\varepsilon$. While
there is thus far no mathematical result proving that any of these methods is
indeed capable of overcoming the COD, there are now a number of rigorous
results in the literature that show that deep neural networks (DNNs) have the
expressive power to approximate PDE solutions without the COD in the sense that
the number of parameters used to describe the approximating DNN grows at most
polynomially in both the PDE dimension $d\in\mathbb N$ and the reciprocal of
the approximation accuracy $\varepsilon>0$. Roughly speaking, it has been proved
in the literature for every $T>0$ that solutions $u_d\colon
[0,T]\times\mathbb R^d\to \mathbb R$, $d\in\mathbb N$, of semilinear heat PDEs
with Lipschitz continuous nonlinearities can be approximated by DNNs with ReLU
activation at the terminal time in the $L^2$-sense without the COD provided
that the initial value functions $\mathbb R^d\ni x\mapsto u_d(0,x)\in\mathbb
R$, $d\in\mathbb N$, can be approximated by ReLU DNNs without the COD. It is
the key contribution of this work to generalize this result by establishing
this statement in the $L^p$-sense with $p\in(0,\infty)$ and by allowing the
activation function to be more general covering the ReLU, the leaky ReLU, and
the softplus activation functions as special cases.
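To make the approximation notion in the abstract concrete, the following is a minimal numerical sketch (not from the paper): it builds fully connected networks with the three activation functions covered by the main result and estimates an $L^p$ error of the form $(\mathbb{E}[|u_d(T,X)-\mathcal{N}(X)|^p])^{1/p}$ by Monte Carlo sampling. The reference function `u_T`, the network widths, the sampling distribution, and the choice $p=0.5$ are illustrative assumptions; the paper's statement concerns the existence of approximating DNNs whose parameter counts grow at most polynomially in $d$ and $\varepsilon^{-1}$, which the `num_params` helper merely reports for one fixed architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# The three activation functions covered by the paper's main result.
ACTIVATIONS = {
    "relu":       lambda z: np.maximum(z, 0.0),
    "leaky_relu": lambda z: np.where(z > 0.0, z, 0.01 * z),
    "softplus":   lambda z: np.log1p(np.exp(-np.abs(z))) + np.maximum(z, 0.0),  # overflow-safe log(1+e^z)
}

def init_mlp(d, widths, rng):
    """Random fully connected network R^d -> R (illustrative, untrained)."""
    sizes = [d] + list(widths) + [1]
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_apply(params, x, activation):
    """Evaluate the network on a batch x of shape (batch, d)."""
    act = ACTIVATIONS[activation]
    h = x
    for W, b in params[:-1]:
        h = act(h @ W + b)
    W, b = params[-1]
    return (h @ W + b).ravel()

def num_params(params):
    """Number of real parameters describing the DNN."""
    return sum(W.size + b.size for W, b in params)

def lp_error(params, activation, u_T, d, p=0.5, n_samples=200_000):
    """Monte Carlo estimate of (E |u_T(X) - N(X)|^p)^(1/p) with X ~ N(0, I_d)."""
    x = rng.standard_normal((n_samples, d))
    diff = np.abs(u_T(x) - mlp_apply(params, x, activation))
    return float(np.mean(diff ** p) ** (1.0 / p))

if __name__ == "__main__":
    d = 10
    u_T = lambda x: np.log(1.0 + np.sum(x ** 2, axis=1))  # stand-in for u_d(T, .), an assumption
    for name in ACTIVATIONS:
        params = init_mlp(d, widths=(50, 50), rng=rng)
        print(f"{name:>10}: {num_params(params)} parameters, "
              f"L^0.5 error of an untrained network = {lp_error(params, name, u_T, d):.3f}")
```

Any $p\in(0,\infty)$ can be plugged into `lp_error`, which mirrors the generalization beyond the $L^2$-case that the paper establishes.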
Related papers
- Multilevel Picard approximations and deep neural networks with ReLU, leaky ReLU, and softplus activation overcome the curse of dimensionality when approximating semilinear parabolic partial differential equations in $L^p$-sense [5.179504118679301]
We prove that multilevel Picard approximations and deep neural networks with ReLU, leaky ReLU, and softplus activation are capable of approximating solutions of Kolmogorov PDEs in the $L^{\mathfrak{p}}$-sense (see the multilevel Picard sketch after this list).
arXiv Detail & Related papers (2024-09-30T15:53:24Z)
- Deep neural networks with ReLU, leaky ReLU, and softplus activation provably overcome the curse of dimensionality for space-time solutions of semilinear partial differential equations [3.3123773366516645]
It is a challenging topic in applied mathematics to solve high-dimensional nonlinear partial differential equations (PDEs).
Deep learning (DL) based methods, in which deep neural networks (DNNs) are used to approximate solutions of PDEs, are presented.
arXiv Detail & Related papers (2024-06-16T09:59:29Z)
- Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(\boldsymbol{x}) = \sigma_*\left(\langle\boldsymbol{x},\boldsymbol{\theta}\rangle\right)$ under isotropic Gaussian data.
We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ of arbitrary link function with a sample and runtime complexity of $n \asymp T \asymp C(q) \cdot d$
arXiv Detail & Related papers (2024-06-03T17:56:58Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space to model functions by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- the effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z)
- The necessity of depth for artificial neural networks to approximate certain classes of smooth and bounded functions without the curse of dimensionality [4.425982186154401]
We study high-dimensional approximation capacities of shallow and deep artificial neural networks (ANNs) with the rectified linear unit (ReLU) activation.
In particular, it is a key contribution of this work to reveal that for all $a,b\in\mathbb{R}$ with $b-a\geq 7$ we have that the functions $[a,b]^d\ni x=(x_1,\dots,x_d)\mapsto\prod_{i=1}^d x_i\in\mathbb{R}$ for $d$
arXiv Detail & Related papers (2023-01-19T19:52:41Z)
- Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective [40.964402478629495]
We take a step towards studying the representational power of neural networks for approximating solutions to nonlinear PDEs.
Treating a class of PDEs known as nonlinear elliptic variational PDEs, our results show neural networks can evade the curse of dimensionality.
arXiv Detail & Related papers (2022-10-21T16:53:18Z)
- Deep neural network approximation of analytic functions [91.3755431537592]
We provide an entropy bound for the spaces of neural networks with piecewise linear activation functions.
We derive an oracle inequality for the expected error of the considered penalized deep neural network estimators.
arXiv Detail & Related papers (2021-04-05T18:02:04Z)
- On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the function.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z)
- Space-time deep neural network approximations for high-dimensional partial differential equations [3.6185342807265415]
Deep learning approximations might have the capacity to overcome the curse of dimensionality.
This article proves for every $a\in\mathbb{R}$, $b\in(a,\infty)$ that solutions of certain Kolmogorov PDEs can be approximated by DNNs without the curse of dimensionality.
arXiv Detail & Related papers (2020-06-03T12:14:56Z)
- Reinforcement Learning with General Value Function Approximation: Provably Efficient Approach via Bounded Eluder Dimension [124.7752517531109]
We establish a provably efficient reinforcement learning algorithm with general value function approximation.
We show that our algorithm achieves a regret bound of $\widetilde{O}(\mathrm{poly}(dH)\sqrt{T})$ where $d$ is a complexity measure.
Our theory generalizes recent progress on RL with linear value function approximation and does not make explicit assumptions on the model of the environment.
arXiv Detail & Related papers (2020-05-21T17:36:09Z)
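The first related paper above concerns multilevel Picard (MLP) approximations. As a purely illustrative aid, here is a minimal sketch of a generic MLP recursion for the Feynman-Kac fixed-point formulation $u(t,x)=\mathbb{E}[g(x+\sqrt{2}\,W_{T-t})]+\int_t^T\mathbb{E}[f(u(s,x+\sqrt{2}\,W_{s-t}))]\,ds$ of a semilinear heat PDE with terminal condition $g$ and Lipschitz nonlinearity $f$. The specific $f$, $g$, dimension, and sample counts are assumptions chosen for demonstration; this is not the exact scheme or the error analysis of the listed papers.

```python
import numpy as np

rng = np.random.default_rng(1)

def mlp_picard(t, x, n, M, T, f, g):
    """Multilevel Picard estimate U_{n,M}(t, x) of u(t, x) for the fixed point
    u(t,x) = E[g(x + sqrt(2) W_{T-t})] + int_t^T E[f(u(s, x + sqrt(2) W_{s-t}))] ds
    (illustrative sketch; g, f and the sample counts are assumptions)."""
    d = x.shape[0]
    if n == 0:
        return 0.0
    # Monte Carlo term for the terminal condition.
    w = rng.standard_normal((M ** n, d)) * np.sqrt(2.0 * (T - t))
    estimate = np.mean(g(x + w))
    # Telescoping Picard correction terms over levels l = 0, ..., n-1.
    for l in range(n):
        num = M ** (n - l)
        acc = 0.0
        for _ in range(num):
            r = t + (T - t) * rng.uniform()                              # random intermediate time
            xr = x + rng.standard_normal(d) * np.sqrt(2.0 * (r - t))     # Brownian increment to time r
            acc += f(mlp_picard(r, xr, l, M, T, f, g))
            if l > 0:
                acc -= f(mlp_picard(r, xr, l - 1, M, T, f, g))
        estimate += (T - t) * acc / num
    return estimate

if __name__ == "__main__":
    d, T = 10, 1.0
    g = lambda y: np.log(1.0 + np.sum(y ** 2, axis=-1))   # terminal condition u(T, .) = g (assumption)
    f = lambda u: 1.0 / (1.0 + u ** 2)                     # Lipschitz nonlinearity (assumption)
    x0 = np.zeros(d)
    for n in (1, 2, 3):
        print(f"n = M = {n}: U_n(0, x0) ~ {mlp_picard(0.0, x0, n, n, T, f, g):.4f}")
```

Each sample only requires $d$-dimensional Gaussian draws, which is why schemes of this type can have a cost that grows polynomially in the dimension; fixing $n=M$, as in the driver above, is a common choice in the MLP literature.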