Approximation properties of Residual Neural Networks for Kolmogorov PDEs
- URL: http://arxiv.org/abs/2111.00215v1
- Date: Sat, 30 Oct 2021 09:28:49 GMT
- Title: Approximation properties of Residual Neural Networks for Kolmogorov PDEs
- Authors: Jonas Baggenstos and Diyora Salimova
- Abstract summary: We show that ResNets are able to approximate Kolmogorov partial differential equations with constant diffusion and possibly nonlinear drift coefficients.
In contrast to FNNs, the Euler-Maruyama approximation structure of ResNets simplifies the construction of the approximating ResNets substantially.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years residual neural networks (ResNets) as introduced by [He, K.,
Zhang, X., Ren, S., and Sun, J., Proceedings of the IEEE conference on computer
vision and pattern recognition (2016), 770-778] have become very popular in a
large number of applications, including in image classification and
segmentation. They provide a new perspective in training very deep neural
networks without suffering the vanishing gradient problem. In this article we
show that ResNets are able to approximate solutions of Kolmogorov partial
differential equations (PDEs) with constant diffusion and possibly nonlinear
drift coefficients without suffering the curse of dimensionality, which is to
say the number of parameters of the approximating ResNets grows at most
polynomially in the reciprocal of the approximation accuracy $\varepsilon > 0$
and the dimension of the considered PDE $d\in\mathbb{N}$. We adapt a proof in
[Jentzen, A., Salimova, D., and Welti, T., Commun. Math. Sci. 19, 5 (2021),
1167-1205] - who showed a similar result for feedforward neural networks (FNNs)
- to ResNets. In contrast to FNNs, the Euler-Maruyama approximation structure
of ResNets simplifies the construction of the approximating ResNets
substantially. Moreover, contrary to the above work, our proof using ResNets
does not require the existence of an FNN (or a ResNet) representing the
identity map, which enlarges the set of applicable activation functions.
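To make the structural point concrete: the Kolmogorov PDEs considered here admit the stochastic representation $u(T,x) = \mathbb{E}[\varphi(X_T^x)]$ (Feynman-Kac), where $\varphi = u(0,\cdot)$ is the initial condition and $dX_t = \mu(X_t)\,dt + \sigma\,dW_t$ with nonlinear drift $\mu$ and constant diffusion $\sigma$. Each Euler-Maruyama drift step $x \mapsto x + h\,\mu(x)$ already has the "identity plus residual" shape of a ResNet block, which is why the construction of the approximating ResNets simplifies. The sketch below only illustrates this correspondence and is not the authors' construction; the drift, diffusion, initial condition, dimension, and step counts are hypothetical choices.

```python
# Minimal sketch (not the paper's proof construction): Monte-Carlo /
# Euler-Maruyama approximation of u(T, x0) = E[phi(X_T^{x0})] for a
# Kolmogorov PDE with constant diffusion sigma and nonlinear drift mu.
# The deterministic part of each step, X -> X + h * mu(X), is exactly an
# "identity + residual" (ResNet-style) update.
import numpy as np

def euler_maruyama_resnet(phi, mu, sigma, x0, T, num_steps, num_paths, rng=None):
    """Estimate u(T, x0) = E[phi(X_T)] for dX = mu(X) dt + sigma dW, X_0 = x0."""
    rng = np.random.default_rng() if rng is None else rng
    d = x0.shape[0]
    h = T / num_steps
    X = np.tile(x0, (num_paths, 1))          # all Monte-Carlo paths start at x0
    for _ in range(num_steps):
        noise = rng.standard_normal((num_paths, d))
        # residual update: identity + drift increment + diffusion increment
        X = X + h * mu(X) + np.sqrt(h) * noise @ sigma.T
    return phi(X).mean()

# Hypothetical example: d = 10, nonlinear drift mu(x) = -x^3 (componentwise),
# constant diffusion sigma = I, initial condition phi(x) = ||x||^2.
if __name__ == "__main__":
    d = 10
    mu = lambda X: -X**3
    sigma = np.eye(d)
    phi = lambda X: np.sum(X**2, axis=1)
    estimate = euler_maruyama_resnet(phi, mu, sigma, np.ones(d), T=1.0,
                                     num_steps=50, num_paths=20_000)
    print(f"Monte-Carlo estimate of u(T, x0): {estimate:.4f}")
```

If the drift $\mu$ is itself represented by a small sub-network, each Euler-Maruyama step becomes literally a residual block, so the overall approximant is a ResNet without requiring any FNN that emulates the identity map.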
Related papers
- GIT-Net: Generalized Integral Transform for Operator Learning [58.13313857603536]
This article introduces GIT-Net, a deep neural network architecture for approximating Partial Differential Equation (PDE) operators.
GIT-Net harnesses the fact that differential operators commonly used for defining PDEs can often be represented parsimoniously when expressed in specialized functional bases.
Numerical experiments demonstrate that GIT-Net is a competitive neural network operator, exhibiting small test errors and low evaluation costs across a range of PDE problems.
arXiv Detail & Related papers (2023-12-05T03:03:54Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - LordNet: An Efficient Neural Network for Learning to Solve Parametric Partial Differential Equations without Simulated Data [47.49194807524502]
We propose LordNet, a tunable and efficient neural network for modeling entanglements.
The experiments on solving Poisson's equation and (2D and 3D) Navier-Stokes equation demonstrate that the long-range entanglements can be well modeled by the LordNet.
arXiv Detail & Related papers (2022-06-19T14:41:08Z) - Edge Rewiring Goes Neural: Boosting Network Resilience via Policy Gradient [62.660451283548724]
ResiNet is a reinforcement learning framework to discover resilient network topologies against various disasters and attacks.
We show that ResiNet achieves a near-optimal resilience gain on multiple graphs while balancing the utility, with a large margin compared to existing approaches.
arXiv Detail & Related papers (2021-10-18T06:14:28Z) - Overparameterization of deep ResNet: zero loss and mean-field analysis [19.45069138853531]
Finding parameters in a deep neural network (NN) that fit data is a nonconvex optimization problem.
We show that a basic first-order optimization method (gradient descent) finds a global solution with perfect fit in many practical situations.
We give estimates of the depth and width needed to reduce the loss below a given threshold, with high probability.
arXiv Detail & Related papers (2021-05-30T02:46:09Z) - Translating Numerical Concepts for PDEs into Neural Architectures [9.460896836770534]
We investigate what can be learned from translating numerical algorithms into neural networks.
On the numerical side, we consider explicit, accelerated explicit, and implicit schemes for a general higher order nonlinear diffusion equation in 1D.
On the neural network side, we identify corresponding concepts in terms of residual networks (ResNets), recurrent networks, and U-nets.
arXiv Detail & Related papers (2021-03-29T08:31:51Z) - Parametric Complexity Bounds for Approximating PDEs with Neural Networks [41.46028070204925]
We prove that when a PDE's coefficients are representable by small neural networks, the parameters required to approximate its solution scale polynomially with the input dimension $d$ and are proportional to the parameter counts of the coefficient neural networks.
Our proof is based on constructing a neural network which simulates gradient descent in an appropriate space which converges to the solution of the PDE.
arXiv Detail & Related papers (2021-03-03T02:42:57Z) - Doubly infinite residual neural networks: a diffusion process approach [8.642603456626393]
We show that deep ResNets do not suffer from undesirable forward-propagation properties.
We focus on doubly infinite fully-connected ResNets, for which we consider i.i.d. initializations of the network's parameters.
Our results highlight a limited expressive power of doubly infinite ResNets when the unscaled network's parameters are i.i.d. and the residual blocks are shallow.
arXiv Detail & Related papers (2020-07-07T07:45:34Z) - Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z) - On Random Kernels of Residual Architectures [93.94469470368988]
We derive finite width and depth corrections for the Neural Tangent Kernel (NTK) of ResNets and DenseNets.
Our findings show that in ResNets, convergence to the NTK may occur when depth and width simultaneously tend to infinity.
In DenseNets, however, convergence of the NTK to its limit as the width tends to infinity is guaranteed.
arXiv Detail & Related papers (2020-01-28T16:47:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.