Optimal Convergence Rates for Neural Operators
- URL: http://arxiv.org/abs/2412.17518v1
- Date: Mon, 23 Dec 2024 12:31:38 GMT
- Title: Optimal Convergence Rates for Neural Operators
- Authors: Mike Nguyen, Nicole Mücke
- Abstract summary: We provide bounds on the number of hidden neurons and the number of second-stage samples necessary for generalization.
A key application of neural operators is learning surrogate maps for the solution operators of partial differential equations.
- Score: 2.9388890036358104
- Abstract: We introduce the neural tangent kernel (NTK) regime for two-layer neural operators and analyze their generalization properties. For early-stopped gradient descent (GD), we derive fast convergence rates that are known to be minimax optimal within the framework of non-parametric regression in reproducing kernel Hilbert spaces (RKHS). We provide bounds on the number of hidden neurons and the number of second-stage samples necessary for generalization. To justify our NTK regime, we additionally show that any operator approximable by a neural operator can also be approximated by an operator from the RKHS. A key application of neural operators is learning surrogate maps for the solution operators of partial differential equations (PDEs). We consider the standard Poisson equation to illustrate our theoretical findings with simulations.
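A rough sketch of the two main ingredients, the empirical NTK and early-stopped GD, on a hypothetical 1-D toy regression (not the paper's operator setting): the NTK feature map is the parameter gradient of a two-layer ReLU net at initialization, and early-stopped gradient descent on the induced kernel least-squares problem acts as an implicitly regularized estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D regression task standing in for the second-stage samples.
n, m = 40, 2000                        # n samples, m hidden neurons (wide)
x = rng.uniform(-1.0, 1.0, size=n)
y = np.sin(np.pi * x)

# Two-layer ReLU net f(x) = (1/sqrt(m)) * a . relu(w*x + b) at random init.
w = rng.normal(size=m)
b = rng.normal(size=m)
a = rng.choice([-1.0, 1.0], size=m)

def ntk_features(x):
    """Parameter gradients of f at init: the empirical NTK feature map."""
    pre = np.outer(x, w) + b           # (n, m) pre-activations
    act = (pre > 0).astype(float)
    dw = a * act * x[:, None] / np.sqrt(m)   # gradient w.r.t. w
    db = a * act / np.sqrt(m)                # gradient w.r.t. b
    da = np.maximum(pre, 0.0) / np.sqrt(m)   # gradient w.r.t. a
    return np.hstack([dw, db, da])     # (n, 3m)

Phi = ntk_features(x)
K = Phi @ Phi.T                        # empirical NTK Gram matrix

# Early-stopped GD on kernel least squares: the residual contracts as
# (I - lr*K)^t, and stopping at a finite t acts as regularization.
lr = 1.0 / np.linalg.eigvalsh(K)[-1]
alpha = np.zeros(n)
for _ in range(1000):
    alpha += lr * (y - K @ alpha)

mse = np.mean((K @ alpha - y) ** 2)
print(f"train MSE after early-stopped GD: {mse:.5f}")
```

The stopping time plays the role of the regularization parameter: fewer iterations fit only the top NTK eigendirections, more iterations fit progressively finer ones.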
Related papers
- Convergence analysis of wide shallow neural operators within the framework of Neural Tangent Kernel [4.313136216120379]
We conduct a convergence analysis of gradient descent for wide shallow neural operators and physics-informed shallow neural operators within the framework of the Neural Tangent Kernel (NTK).
Under over-parametrization, gradient descent finds the global minimum in both continuous and discrete time.
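A minimal numerical check of this claim, assuming an ordinary over-parametrized shallow ReLU regression rather than the paper's operator setting: plain discrete-time GD on the first-layer weights of a wide network drives the training loss toward zero.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy setup: over-parametrized shallow ReLU regression trained
# by plain discrete-time GD; the training loss is driven toward zero.
n, m, lr, steps = 20, 4000, 0.5, 2000
x = rng.uniform(-1.0, 1.0, size=(n, 1))
X = np.hstack([x, np.ones((n, 1))])          # inputs with a bias feature
y = np.cos(3 * x[:, 0])

# Symmetric (paired) initialization so the network outputs 0 at init.
W0 = rng.normal(size=(m // 2, 2))
W = np.vstack([W0, W0.copy()])
a = np.concatenate([np.ones(m // 2), -np.ones(m // 2)]) / np.sqrt(m)

def forward(W):
    return np.maximum(X @ W.T, 0.0) @ a      # (n,) network outputs

def loss(W):
    return 0.5 * np.mean((forward(W) - y) ** 2)

loss0 = loss(W)
for _ in range(steps):
    pre = X @ W.T                            # (n, m) pre-activations
    r = forward(W) - y                       # residuals
    # Gradient of the loss w.r.t. first-layer weights (second layer frozen).
    W -= lr * ((r[:, None] * a) * (pre > 0)).T @ X / n

print(f"train loss: {loss0:.4f} -> {loss(W):.6f}")
```

Freezing the second layer and scaling it by 1/sqrt(m) keeps the dynamics in the lazy (near-kernel) regime the NTK analysis relies on.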
arXiv Detail & Related papers (2024-12-07T05:47:28Z)
- Score-based Neural Ordinary Differential Equations for Computing Mean Field Control Problems [13.285775352653546]
This paper proposes a system of neural differential equations representing first- and second-order score functions along trajectories based on deep neural networks.
We reformulate the mean field control (MFC) problem with individual noises into an unconstrained optimization problem framed by the proposed neural ODE system.
arXiv Detail & Related papers (2024-09-24T21:45:55Z)
- Neural Operators with Localized Integral and Differential Kernels [77.76991758980003]
We present a principled approach to operator learning that can capture local features under two frameworks.
We prove that we obtain differential operators under an appropriate scaling of the kernel values of CNNs.
To obtain local integral operators, we utilize suitable basis representations for the kernels based on discrete-continuous convolutions.
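Both constructions can be sketched in one dimension, under the simplifying assumptions of a uniform grid and a three-point stencil: a compactly supported kernel discretized by quadrature yields a local integral operator (a convolution, i.e. what a CNN layer computes), while the same stencil rescaled as [1, -2, 1] / h^2 converges to a differential operator.

```python
import numpy as np

# Uniform grid on [0, 1) and a smooth test function.
h = 1.0 / 256
xs = np.arange(0.0, 1.0, h)
u = np.sin(2 * np.pi * xs)

# Local integral operator (Ku)(x) = ∫ k(x - y) u(y) dy with k supported on
# {-h, 0, h}: the Riemann-sum discretization is a 3-tap convolution.
stencil = np.array([1.0, 2.0, 1.0])     # samples of a compactly supported k
Ku = h * np.convolve(u, stencil, mode="same")

# The same stencil family, rescaled as [1, -2, 1] / h^2, converges to the
# second derivative d^2u/dx^2 as h -> 0 (a differential operator).
lap = np.convolve(u, np.array([1.0, -2.0, 1.0]), mode="same") / h**2

interior = slice(1, -1)                 # drop zero-padded boundary points
err = np.max(np.abs(lap[interior] + (2 * np.pi) ** 2 * u[interior]))
print(f"max interior error of discrete Laplacian: {err:.2e}")
```

For u = sin(2*pi*x) the exact second derivative is -(2*pi)^2 * u, so the printed error reflects the O(h^2) truncation of the stencil.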
arXiv Detail & Related papers (2024-02-26T18:59:31Z)
- How many Neurons do we need? A refined Analysis for Shallow Networks trained with Gradient Descent [0.0]
We analyze the generalization properties of two-layer neural networks in the neural tangent kernel regime.
We derive fast rates of convergence that are known to be minimax optimal in the framework of non-parametric regression.
arXiv Detail & Related papers (2023-09-14T22:10:28Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- Neural Operator: Learning Maps Between Function Spaces [75.93843876663128]
We propose a generalization of neural networks to learn operators, termed neural operators, that map between infinite dimensional function spaces.
We prove a universal approximation theorem for our proposed neural operator, showing that it can approximate any given nonlinear continuous operator.
An important application for neural operators is learning surrogate maps for the solution operators of partial differential equations.
arXiv Detail & Related papers (2021-08-19T03:56:49Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
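The min-max formulation can be sketched with linear players standing in for the neural networks (a simplifying assumption): for a linear operator equation A theta = b, the inner maximum of u . (A theta - b) - ||u||^2 / 2 over u equals ||A theta - b||^2 / 2, so gradient descent-ascent on the two players solves the equation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical linear instance: solve A @ theta = b through the min-max game
#   min_theta max_u  u . (A @ theta - b) - 0.5 * ||u||^2,
# whose inner maximum equals 0.5 * ||A @ theta - b||^2.
d = 5
A = np.eye(d) + 0.1 * rng.normal(size=(d, d))   # well-conditioned operator
b = rng.normal(size=d)

theta, u = np.zeros(d), np.zeros(d)
lr_theta, lr_u = 0.1, 0.5
for _ in range(2000):
    r = A @ theta - b
    u += lr_u * (r - u)              # ascent step for the max player
    theta -= lr_theta * (A.T @ u)    # descent step for the min player

print("residual norm:", np.linalg.norm(A @ theta - b))
```

The faster step size for the max player keeps u tracking the current residual, so the min player effectively descends the squared-residual objective.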
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime [50.510421854168065]
We show that the averaged gradient descent can achieve the minimax optimal convergence rate.
We show that the target function specified by the NTK of a ReLU network can be learned at the optimal convergence rate.
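A toy illustration of iterate averaging (Polyak-Ruppert), assuming a plain streaming least-squares problem rather than the paper's NTK setting: the running average of constant-step SGD iterates is typically much closer to the target than the noisy last iterate.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical streaming least-squares task: y_t = x_t . theta_star + noise.
d, T, step, noise = 5, 20000, 0.05, 0.1
theta_star = rng.normal(size=d)

theta = np.zeros(d)
theta_sum = np.zeros(d)
for t in range(T):
    x_t = rng.normal(size=d)
    y_t = x_t @ theta_star + noise * rng.normal()
    theta -= step * (x_t @ theta - y_t) * x_t   # single-sample gradient step
    theta_sum += theta                          # accumulate for averaging

theta_avg = theta_sum / T                       # Polyak-Ruppert average
print("last-iterate error:", np.linalg.norm(theta - theta_star))
print("averaged error:    ", np.linalg.norm(theta_avg - theta_star))
```

With a constant step size the last iterate keeps fluctuating around theta_star, while averaging cancels those fluctuations, which is the mechanism behind the optimal rates for averaged SGD.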
arXiv Detail & Related papers (2020-06-22T14:31:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.