Theory-to-Practice Gap for Neural Networks and Neural Operators
- URL: http://arxiv.org/abs/2503.18219v1
- Date: Sun, 23 Mar 2025 21:45:58 GMT
- Title: Theory-to-Practice Gap for Neural Networks and Neural Operators
- Authors: Philipp Grohs, Samuel Lanthaler, Margaret Trautner
- Abstract summary: We study the sampling complexity of learning with ReLU neural networks and neural operators. We show that the best-possible convergence rate in a Bochner $L^p$-norm is bounded by Monte-Carlo rates of order $1/p$.
- Score: 6.267574471145217
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work studies the sampling complexity of learning with ReLU neural networks and neural operators. For mappings belonging to relevant approximation spaces, we derive upper bounds on the best-possible convergence rate of any learning algorithm, with respect to the number of samples. In the finite-dimensional case, these bounds imply a gap between the parametric and sampling complexities of learning, known as the \emph{theory-to-practice gap}. In this work, a unified treatment of the theory-to-practice gap is achieved in a general $L^p$-setting, while at the same time improving available bounds in the literature. Furthermore, based on these results the theory-to-practice gap is extended to the infinite-dimensional setting of operator learning. Our results apply to Deep Operator Networks and integral kernel-based neural operators, including the Fourier neural operator. We show that the best-possible convergence rate in a Bochner $L^p$-norm is bounded by Monte-Carlo rates of order $1/p$.
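As a reading aid, the final claim of the abstract can be written schematically as a minimax lower bound over sampling-based learning algorithms. The display below is only a sketch of the shape of that statement, with illustrative notation ($m$ samples, algorithm $A_m$, unit ball $U$ of the relevant approximation space); the precise sampling model, function classes, and constants are those of the paper.

```latex
% Schematic form of the bound (notation illustrative, not the paper's exact statement).
% A_m ranges over (possibly randomized) learning algorithms using m samples,
% U is the unit ball of the relevant ReLU approximation space, and the error is
% measured in a Bochner L^p-norm.
\[
  \inf_{A_m} \, \sup_{u \in U} \,
    \mathbb{E}\, \bigl\| u - A_m(u) \bigr\|_{L^p}
  \;\gtrsim\; m^{-1/p},
\]
% i.e. no sampling-based learner can beat the Monte-Carlo rate m^{-1/p} on this
% class, regardless of the parametric approximation rates the class admits.
```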
Related papers
- Neural Operators with Localized Integral and Differential Kernels [77.76991758980003]
We present a principled approach to operator learning that can capture local features under two frameworks.
We prove that we obtain differential operators under an appropriate scaling of the kernel values of CNNs (a toy sketch of this scaling idea appears after this list).
To obtain local integral operators, we utilize suitable basis representations for the kernels based on discrete-continuous convolutions.
arXiv Detail & Related papers (2024-02-26T18:59:31Z) - The Parametric Complexity of Operator Learning [5.756283466216181]
The first contribution of this paper is to prove that for general classes of operators which are characterized only by their $C^r$- or Lipschitz-regularity, operator learning suffers from a "curse of parametric complexity".
The second contribution of the paper is to prove that this general curse can be overcome for solution operators defined by the Hamilton-Jacobi equation.
A novel neural operator architecture is introduced, termed HJ-Net, which explicitly takes into account characteristic information of the underlying Hamiltonian system.
arXiv Detail & Related papers (2023-06-28T05:02:03Z) - Neural Operator: Learning Maps Between Function Spaces [75.93843876663128]
We propose a generalization of neural networks to learn operators, termed neural operators, that map between infinite dimensional function spaces.
We prove a universal approximation theorem for our proposed neural operator, showing that it can approximate any given nonlinear continuous operator.
An important application for neural operators is learning surrogate maps for the solution operators of partial differential equations.
arXiv Detail & Related papers (2021-08-19T03:56:49Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Proof of the Theory-to-Practice Gap in Deep Learning via Sampling Complexity bounds for Neural Network Approximation Spaces [5.863264019032882]
We study the computational complexity of (deterministic or randomized) algorithms based on point samples for approximating or integrating functions.
One of the most important problems in this field concerns the question of whether it is possible to realize theoretically provable neural network approximation rates.
arXiv Detail & Related papers (2021-04-06T18:55:20Z) - A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks [56.084798078072396]
We take a step towards closing the gap between theory and practice by significantly improving the known theoretical bounds on both the network width and the convergence time.
We show that convergence to a global minimum is guaranteed for networks whose width is quadratic in the sample size and linear in their depth, within a training time logarithmic in both.
Our analysis and convergence bounds are derived via the construction of a surrogate network with fixed activation patterns that can be transformed at any time to an equivalent ReLU network of a reasonable size.
arXiv Detail & Related papers (2021-01-12T00:40:45Z) - On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces [208.67848059021915]
We study the exploration-exploitation tradeoff at the core of reinforcement learning.
In particular, we prove that the complexity of the function class $\mathcal{F}$ characterizes the complexity of the learning problem.
Our regret bounds are independent of the number of episodes.
arXiv Detail & Related papers (2020-11-09T18:32:22Z) - Neural Networks and Quantum Field Theory [0.0]
We propose a theoretical understanding of neural networks in terms of Wilsonian effective field theory.
The correspondence relies on the fact that many neural networks are drawn from Gaussian processes.
arXiv Detail & Related papers (2020-08-19T18:00:06Z) - Random Vector Functional Link Networks for Function Approximation on Manifolds [8.535815777849786]
Single-layer neural networks with random input-to-hidden layer weights and biases have seen success in practice.
We further adapt this randomized neural network architecture to approximate functions on smooth, compact submanifolds of Euclidean space.
arXiv Detail & Related papers (2020-07-30T23:50:44Z) - Self-Organized Operational Neural Networks with Generative Neurons [87.32169414230822]
Operational Neural Networks (ONNs) are heterogeneous networks with a generalized neuron model that can encapsulate any set of non-linear operators.
We propose Self-organized ONNs (Self-ONNs) with generative neurons that have the ability to adapt (optimize) the nodal operator of each connection.
arXiv Detail & Related papers (2020-04-24T14:37:56Z)
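Returning to the localized-kernel entry above ("Neural Operators with Localized Integral and Differential Kernels"), which notes that differential operators are obtained under an appropriate scaling of CNN kernel values: the snippet below is a minimal, self-contained toy illustration of that general idea, not the paper's construction. A three-tap convolution kernel whose entries are scaled by $1/(2h)$ acts as a central-difference approximation of $d/dx$, and the error shrinks as the grid is refined.

```python
import numpy as np

# Toy illustration (not the paper's construction): a convolution kernel whose
# entries are scaled by 1/(2h) behaves like the differential operator d/dx.
def conv_derivative(u, h):
    stencil = np.array([-1.0, 0.0, 1.0]) / (2.0 * h)   # kernel values scaled by 1/(2h)
    # np.convolve flips its second argument, so pass the stencil reversed to apply
    # the central difference (u[i+1] - u[i-1]) / (2h) at each interior grid point.
    return np.convolve(u, stencil[::-1], mode="valid")

for n in (64, 256, 1024):
    x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    h = x[1] - x[0]
    approx = conv_derivative(np.sin(x), h)              # length n - 2 ("valid" mode)
    exact = np.cos(x)[1:-1]                             # derivative at interior points
    print(n, np.max(np.abs(approx - exact)))            # error decays at rate O(h^2)
```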