Out-of-distributional risk bounds for neural operators with applications
to the Helmholtz equation
- URL: http://arxiv.org/abs/2301.11509v3
- Date: Tue, 4 Jul 2023 22:42:47 GMT
- Title: Out-of-distributional risk bounds for neural operators with applications
to the Helmholtz equation
- Authors: J. Antonio Lara Benitez, Takashi Furuya, Florian Faucher, Anastasis
Kratsios, Xavier Tricoche, Maarten V. de Hoop
- Abstract summary: Existing neural operators (NOs) do not necessarily perform well for all physics problems.
We propose a subfamily of NOs enabling an enhanced empirical approximation of the nonlinear operator mapping wave speed to solution.
Our experiments reveal certain surprises in the generalization and the relevance of introducing depth.
We conclude by proposing a hypernetwork version of the subfamily of NOs as a surrogate model for the mentioned forward operator.
- Score: 6.296104145657063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite their remarkable success in approximating a wide range of operators
defined by PDEs, existing neural operators (NOs) do not necessarily perform
well for all physics problems. We focus here on high-frequency waves to
highlight possible shortcomings. To resolve these, we propose a subfamily of
NOs enabling an enhanced empirical approximation of the nonlinear operator
mapping wave speed to solution, or boundary values for the Helmholtz equation
on a bounded domain. The latter operator is commonly referred to as the
"forward" operator in the study of inverse problems. Our methodology draws
inspiration from transformers and techniques such as stochastic depth. Our
experiments reveal certain surprises in the generalization and the relevance of
introducing stochastic depth. Our NOs show superior performance as compared
with standard NOs, not only for testing within the training distribution but
also for out-of-distribution scenarios. To delve into this observation, we
offer an in-depth analysis of the Rademacher complexity associated with our
modified models and prove an upper bound tied to their stochastic depth that
existing NOs do not satisfy. Furthermore, we obtain a novel out-of-distribution
risk bound tailored to Gaussian measures on Banach spaces, again relating
stochastic depth with the bound. We conclude by proposing a hypernetwork
version of the subfamily of NOs as a surrogate model for the mentioned forward
operator.
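The sketch below is a minimal, hypothetical illustration (not the authors' released code) of the key architectural ingredient named in the abstract: a residual, transformer-style neural operator block trained with stochastic depth, where the residual branch is dropped at random during training and rescaled by its survival probability at inference. The layer choices and dimensions are illustrative assumptions; a real neural operator would use a spectral or attention layer in place of the placeholder mixer.

```python
import torch
import torch.nn as nn


class StochasticDepthNOBlock(nn.Module):
    """One residual block of an illustrative neural operator with stochastic depth."""

    def __init__(self, width: int, survival_prob: float = 0.9):
        super().__init__()
        self.survival_prob = survival_prob
        # Placeholder channel-mixing layer; a real neural operator would use
        # a spectral (e.g. Fourier) layer or attention here.
        self.mixer = nn.Sequential(
            nn.Linear(width, width), nn.GELU(), nn.Linear(width, width)
        )
        self.norm = nn.LayerNorm(width)

    def forward(self, v: torch.Tensor) -> torch.Tensor:
        # v: (batch, n_points, width) -- a discretized input function.
        if self.training:
            # Stochastic depth: skip the residual branch with prob. 1 - survival_prob.
            if torch.rand(1).item() > self.survival_prob:
                return v
            return v + self.mixer(self.norm(v))
        # At inference, keep the branch but rescale by the survival probability.
        return v + self.survival_prob * self.mixer(self.norm(v))


# Illustrative usage: a stack of blocks applied to a batch of functions
# sampled on 128 grid points with channel width 32.
blocks = nn.Sequential(*[StochasticDepthNOBlock(32) for _ in range(6)])
u = torch.randn(4, 128, 32)
out = blocks(u)   # same shape as the input: (4, 128, 32)
```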
Related papers
- Optimal Convergence Rates for Neural Operators [2.9388890036358104]
We provide bounds on the number of hidden neurons and the number of second-stage samples necessary for generalization.
A key application of neural operators is learning surrogate maps for the solution operators of partial differential equations.
arXiv Detail & Related papers (2024-12-23T12:31:38Z) - Taming Nonconvex Stochastic Mirror Descent with General Bregman
Divergence [25.717501580080846]
This paper revisits the convergence of stochastic mirror descent (SMD) in the contemporary nonconvex optimization setting.
For the problem of training linear neural networks, we develop provably convergent algorithms.
arXiv Detail & Related papers (2024-02-27T17:56:49Z) - Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Model-Based Uncertainty in Value Functions [89.31922008981735]
We focus on characterizing the variance over values induced by a distribution over MDPs.
Previous work upper bounds the posterior variance over values by solving a so-called uncertainty Bellman equation.
We propose a new uncertainty Bellman equation whose solution converges to the true posterior variance over values.
arXiv Detail & Related papers (2023-02-24T09:18:27Z) - Nonlinear Reconstruction for Operator Learning of PDEs with
Discontinuities [5.735035463793008]
A large class of hyperbolic and advection-dominated PDEs can have solutions with discontinuities.
We rigorously prove, in terms of lower approximation bounds, that methods which entail a linear reconstruction step fail to efficiently approximate the solution operator of such PDEs.
We show that certain methods employing a non-linear reconstruction mechanism can overcome these fundamental lower bounds and approximate the underlying operator efficiently.
arXiv Detail & Related papers (2022-10-03T16:47:56Z) - Semi-supervised Invertible DeepONets for Bayesian Inverse Problems [8.594140167290098]
DeepONets offer a powerful, data-driven tool for solving parametric PDEs by learning operators.
In this work, we employ physics-informed DeepONets in the context of high-dimensional, Bayesian inverse problems; a generic branch-trunk DeepONet sketch appears after this list.
arXiv Detail & Related papers (2022-09-06T18:55:06Z) - Approximate Bayesian Neural Operators: Uncertainty Quantification for
Parametric PDEs [34.179984253109346]
We provide a mathematically detailed Bayesian formulation of the "shallow" (linear) version of neural operators.
We then extend this analytic treatment to general deep neural operators using approximate methods from Bayesian deep learning.
As a result, our approach is able to identify cases, and provide structured uncertainty estimates, where the neural operator fails to predict well.
arXiv Detail & Related papers (2022-08-02T16:10:27Z) - Learning Dynamical Systems via Koopman Operator Regression in
Reproducing Kernel Hilbert Spaces [52.35063796758121]
We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical system.
We link the risk with the estimation of the spectral decomposition of the Koopman operator.
Our results suggest RRR might be beneficial over other widely used estimators.
arXiv Detail & Related papers (2022-05-27T14:57:48Z) - A Unifying Theory of Thompson Sampling for Continuous Risk-Averse
Bandits [91.3755431537592]
This paper unifies the analysis of risk-averse Thompson sampling algorithms for the multi-armed bandit problem.
Using the contraction principle in the theory of large deviations, we prove novel concentration bounds for continuous risk functionals.
We show that a wide class of risk functionals as well as "nice" functions of them satisfy the continuity condition.
arXiv Detail & Related papers (2021-08-25T17:09:01Z)
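As referenced in the Semi-supervised Invertible DeepONets entry above, a DeepONet learns an operator via a branch network acting on the input function sampled at fixed sensors and a trunk network acting on query coordinates. The following is a generic, assumed baseline DeepONet in plain PyTorch with illustrative layer sizes, not the invertible or physics-informed variant from that paper.

```python
import torch
import torch.nn as nn


class DeepONet(nn.Module):
    """Generic branch-trunk DeepONet (illustrative sizes, not the cited paper's model)."""

    def __init__(self, n_sensors: int, coord_dim: int, latent_dim: int = 64):
        super().__init__()
        # Branch net: encodes the input function u sampled at fixed sensor locations.
        self.branch = nn.Sequential(
            nn.Linear(n_sensors, 128), nn.Tanh(), nn.Linear(128, latent_dim)
        )
        # Trunk net: encodes the coordinates y at which the output is queried.
        self.trunk = nn.Sequential(
            nn.Linear(coord_dim, 128), nn.Tanh(), nn.Linear(128, latent_dim)
        )
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, u_sensors: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        # u_sensors: (batch, n_sensors) function values at the sensors
        # y:         (batch, n_query, coord_dim) query coordinates
        b = self.branch(u_sensors)        # (batch, latent_dim)
        t = self.trunk(y)                 # (batch, n_query, latent_dim)
        # Operator output G(u)(y) approximated by <branch(u), trunk(y)> + bias.
        return torch.einsum("bd,bqd->bq", b, t) + self.bias


# Illustrative usage: 100 sensors, 1-D query coordinates, 50 query points.
model = DeepONet(n_sensors=100, coord_dim=1)
u = torch.randn(8, 100)
y = torch.rand(8, 50, 1)
pred = model(u, y)   # shape: (8, 50)
```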