Accelerated replica exchange stochastic gradient Langevin diffusion
enhanced Bayesian DeepONet for solving noisy parametric PDEs
- URL: http://arxiv.org/abs/2111.02484v1
- Date: Wed, 3 Nov 2021 19:23:59 GMT
- Title: Accelerated replica exchange stochastic gradient Langevin diffusion
enhanced Bayesian DeepONet for solving noisy parametric PDEs
- Authors: Guang Lin, Christian Moya, Zecheng Zhang
- Abstract summary: We propose a training framework for replica-exchange Langevin diffusion that exploits the neural network architecture of DeepONets.
We show that the proposed framework's exploration and exploitation capabilities enable improved training convergence for DeepONets in noisy scenarios.
We also show that replica-exchange Langevin diffusion improves the DeepONet's mean prediction accuracy in noisy scenarios.
- Score: 7.337247167823921
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Deep Operator Network (DeepONet) is a fundamentally different class of
neural networks that we train to approximate nonlinear operators, including the
solution operator of parametric partial differential equations (PDEs). DeepONets
have shown remarkable approximation and generalization capabilities even when
trained with relatively small datasets. However, the performance of DeepONets
deteriorates when the training data is polluted with noise, a scenario that
occurs very often in practice. To enable DeepONets training with noisy data, we
propose using the Bayesian framework of replica-exchange Langevin diffusion.
Such a framework uses two particles, one for exploring and another for
exploiting the loss function landscape of DeepONets. We show that the proposed
framework's exploration and exploitation capabilities enable (1) improved
training convergence for DeepONets in noisy scenarios and (2) attaching an
uncertainty estimate for the predicted solutions of parametric PDEs. In
addition, we show that replica-exchange Langevin diffusion (remarkably) also
improves the DeepONet's mean prediction accuracy in noisy scenarios compared
with vanilla DeepONets trained with state-of-the-art gradient-based
optimization algorithms (e.g., Adam). To reduce the potentially high
computational cost of replica exchange, in this work we propose an accelerated
training framework for replica-exchange Langevin diffusion that exploits the
neural network architecture of DeepONets to reduce its computational cost by up to 25%
without compromising the proposed framework's performance. Finally, we
illustrate the effectiveness of the proposed Bayesian framework using a series
of experiments on four parametric PDE problems.
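To make the two-particle idea concrete, below is a minimal, hypothetical PyTorch sketch of replica-exchange stochastic gradient Langevin diffusion applied to a small DeepONet: a low-temperature replica exploits the loss landscape, a high-temperature replica explores it, and the two occasionally swap states via a Metropolis-style test. The architecture sizes, temperatures, learning rate, swap schedule, and synthetic data are illustrative assumptions, not the authors' exact setup, and the sketch omits the proposed acceleration.
```python
# Hypothetical two-particle replica-exchange SGLD for a small DeepONet.
# Sizes, temperatures, learning rate, swap rule, and data are illustrative
# assumptions, not the authors' exact implementation.
import copy
import math
import torch
import torch.nn as nn


class DeepONet(nn.Module):
    """Branch net encodes the input function sampled at m sensors;
    trunk net encodes a query location y; the output is their dot product."""

    def __init__(self, m_sensors=100, p=40, width=64):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Linear(m_sensors, width), nn.Tanh(), nn.Linear(width, p))
        self.trunk = nn.Sequential(
            nn.Linear(1, width), nn.Tanh(), nn.Linear(width, p))

    def forward(self, u, y):
        return (self.branch(u) * self.trunk(y)).sum(dim=-1, keepdim=True)


def sgld_step(model, loss, lr, temperature):
    """One SGLD update: p <- p - lr * grad + sqrt(2 * lr * T) * Gaussian noise."""
    grads = torch.autograd.grad(loss, list(model.parameters()))
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p.add_(-lr * g + math.sqrt(2.0 * lr * temperature) * torch.randn_like(p))


# Two replicas: the low-temperature particle exploits, the high-temperature one explores.
low, high = DeepONet(), DeepONet()
tau_low, tau_high, lr = 1e-8, 1e-6, 1e-3

# Synthetic noisy training data standing in for real PDE solution data.
u_train = torch.randn(256, 100)   # input functions sampled at 100 sensors
y_train = torch.rand(256, 1)      # query locations
s_train = torch.randn(256, 1)     # noisy solution values

mse = nn.MSELoss()
for step in range(1000):
    loss_low = mse(low(u_train, y_train), s_train)
    loss_high = mse(high(u_train, y_train), s_train)
    sgld_step(low, loss_low, lr, tau_low)
    sgld_step(high, loss_high, lr, tau_high)

    # Occasionally attempt a Metropolis-style swap of the two replicas' states.
    if step % 50 == 0:
        delta = (1.0 / tau_low - 1.0 / tau_high) * (loss_low.item() - loss_high.item())
        if delta >= 0 or torch.rand(1).item() < math.exp(delta):
            low_state = copy.deepcopy(low.state_dict())
            low.load_state_dict(high.state_dict())
            high.load_state_dict(low_state)
```
In the Bayesian reading of the paper, post burn-in samples of the low-temperature replica can be averaged for a mean prediction, and their spread used as an uncertainty estimate for the predicted PDE solution.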
Related papers
- Uncertainty-guided Optimal Transport in Depth Supervised Sparse-View 3D Gaussian [49.21866794516328]
3D Gaussian splatting has demonstrated impressive performance in real-time novel view synthesis.
Previous approaches have incorporated depth supervision into the training of 3D Gaussians to mitigate overfitting.
We introduce a novel method to supervise the depth distribution of 3D Gaussians, utilizing depth priors with integrated uncertainty estimates.
arXiv Detail & Related papers (2024-05-30T03:18:30Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective at solving forward and inverse differential equation problems.
However, PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs and improve the stability of the training process.
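For intuition, here is a hypothetical sketch of the implicit (backward-Euler style) update behind ISGD, theta_{k+1} = theta_k - lr * grad L(theta_{k+1}), approximated by a few fixed-point iterations; the closure interface, learning rate, and iteration count are illustrative assumptions, not the cited paper's algorithm.
```python
# Hypothetical implicit SGD step: theta_{k+1} = theta_k - lr * grad L(theta_{k+1}),
# approximated by a short fixed-point iteration. Illustrative only.
import torch

def implicit_sgd_step(params, loss_closure, lr=1e-2, inner_iters=3):
    params = list(params)
    theta_k = [p.detach().clone() for p in params]   # the anchor point theta_k
    for _ in range(inner_iters):
        loss = loss_closure()                         # loss evaluated at the current iterate
        grads = torch.autograd.grad(loss, params)
        with torch.no_grad():
            for p, p_old, g in zip(params, theta_k, grads):
                p.copy_(p_old - lr * g)               # next fixed-point iterate
```
It would be used in place of an explicit optimizer step, e.g. `implicit_sgd_step(model.parameters(), lambda: mse(model(x), y))`, at the cost of a few extra gradient evaluations per step.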
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Reliable extrapolation of deep neural operators informed by physics or sparse observations [2.887258133992338]
Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks.
DeepONets provide a new simulation paradigm in science and engineering.
We propose five reliable learning methods that guarantee a safe prediction under extrapolation.
arXiv Detail & Related papers (2022-12-13T03:02:46Z)
- Semi-supervised Invertible DeepONets for Bayesian Inverse Problems [8.594140167290098]
DeepONets offer a powerful, data-driven tool for solving parametric PDEs by learning operators.
In this work, we employ physics-informed DeepONets in the context of high-dimensional, Bayesian inverse problems.
arXiv Detail & Related papers (2022-09-06T18:55:06Z)
- Neural Basis Functions for Accelerating Solutions to High Mach Euler Equations [63.8376359764052]
We propose an approach to solving partial differential equations (PDEs) using a set of neural networks.
We regress a set of neural networks onto a reduced order Proper Orthogonal Decomposition (POD) basis.
These networks are then used in combination with a branch network that ingests the parameters of the prescribed PDE to compute a reduced order approximation to the PDE.
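Read literally, that recipe resembles a generic POD-plus-network reduced-order surrogate; the following hypothetical sketch shows that pattern (snapshot SVD for the basis, a branch network mapping PDE parameters to mode coefficients), with sizes and placeholder data as assumptions rather than the cited paper's implementation.
```python
# Hypothetical POD + branch-network reduced-order surrogate, sketched only from
# the summary above; basis size, networks, and data are illustrative assumptions.
import torch
import torch.nn as nn

# Snapshot matrix: n_snapshots solutions evaluated on n_grid points (placeholder data).
snapshots = torch.randn(200, 1024)
U, S, Vh = torch.linalg.svd(snapshots, full_matrices=False)
pod_basis = Vh[:8]                        # keep the 8 leading spatial POD modes (n_modes x n_grid)

# Branch network: PDE parameters -> coefficients of the POD modes
# (training, e.g. regressing coefficients of the snapshots, is omitted here).
branch = nn.Sequential(nn.Linear(3, 64), nn.Tanh(), nn.Linear(64, 8))

def reduced_order_solution(pde_params):
    """Reduced-order approximation as a linear combination of POD modes."""
    coeffs = branch(pde_params)           # (batch, n_modes)
    return coeffs @ pod_basis             # (batch, n_grid)
```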
arXiv Detail & Related papers (2022-08-02T18:27:13Z)
- Variational Bayes Deep Operator Network: A data-driven Bayesian solver for parametric differential equations [0.0]
We propose Variational Bayes DeepONet (VB-DeepONet) for operator learning.
VB-DeepONet uses variational inference to take into account high dimensional posterior distributions.
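As a rough illustration of the variational-inference ingredient, a mean-field Gaussian weight layer of the kind such a model could use is sketched below; the prior, parameterization, and ELBO weighting are assumptions, not the cited paper's construction.
```python
# Hypothetical mean-field variational linear layer for a Bayesian operator network.
# Prior, initialization, and ELBO weighting are illustrative assumptions.
import torch
import torch.nn as nn


class BayesLinear(nn.Module):
    """Linear layer (no bias) with a factorized Gaussian posterior over its weights."""

    def __init__(self, n_in, n_out):
        super().__init__()
        self.mu = nn.Parameter(0.01 * torch.randn(n_out, n_in))
        self.log_sigma = nn.Parameter(torch.full((n_out, n_in), -4.0))

    def forward(self, x):
        sigma = self.log_sigma.exp()
        w = self.mu + sigma * torch.randn_like(sigma)   # reparameterization trick
        return x @ w.t()

    def kl_to_standard_normal(self):
        # KL( N(mu, sigma^2) || N(0, 1) ), summed over all weights.
        sigma = self.log_sigma.exp()
        return 0.5 * (sigma**2 + self.mu**2 - 1.0 - 2.0 * self.log_sigma).sum()

# ELBO-style loss: data misfit plus a (weighted) KL of the weight posterior to the prior,
# e.g. loss = mse(pred, target) + beta * sum(layer.kl_to_standard_normal() for layer in bayes_layers)
```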
arXiv Detail & Related papers (2022-06-12T04:20:11Z)
- Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z)
- Improved architectures and training algorithms for deep operator networks [0.0]
Operator learning techniques have emerged as a powerful tool for learning maps between infinite-dimensional Banach spaces.
We analyze the training dynamics of deep operator networks (DeepONets) through the lens of Neural Tangent Kernel (NTK) theory.
arXiv Detail & Related papers (2021-10-04T18:34:41Z)
- Physics-aware deep neural networks for surrogate modeling of turbulent natural convection [0.0]
We investigate the use of PINN surrogate modeling for turbulent Rayleigh-Bénard convection flows.
We show how it comes into play as a regularization close to the training boundaries, which are zones of poor accuracy for standard PINNs.
The predictive accuracy of the surrogate over the entire half-billion DNS coordinates yields errors for all flow variables ranging between 0.3% and 4% in the relative L2 norm.
arXiv Detail & Related papers (2021-03-05T09:48:57Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Learnable Bernoulli Dropout for Bayesian Deep Learning [53.79615543862426]
Learnable Bernoulli dropout (LBD) is a new model-agnostic dropout scheme that considers the dropout rates as parameters jointly optimized with other model parameters.
LBD leads to improved accuracy and uncertainty estimates in image classification and semantic segmentation.
arXiv Detail & Related papers (2020-02-12T18:57:14Z)
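For the Learnable Bernoulli Dropout entry above, a hypothetical sketch of a dropout layer with a learnable drop rate (using a relaxed Bernoulli mask so the rate receives gradients) is shown below; the relaxation, temperature, and rescaling are illustrative assumptions, not the paper's exact scheme.
```python
# Hypothetical dropout with a learnable drop probability, relaxed to be differentiable.
# Temperature, parameterization, and rescaling are illustrative assumptions.
import torch
import torch.nn as nn


class LearnableDropout(nn.Module):
    def __init__(self, p_init=0.5, temperature=0.1):
        super().__init__()
        # Logit of the drop probability, optimized jointly with the other model weights.
        self.logit_p = nn.Parameter(torch.logit(torch.tensor(p_init)))
        self.temperature = temperature

    def forward(self, x):
        p = torch.sigmoid(self.logit_p)
        if self.training:
            u = torch.rand_like(x).clamp(1e-6, 1 - 1e-6)
            # Relaxed Bernoulli sample of a *keep* mask (keep probability is 1 - p).
            logits = torch.log(1 - p) - torch.log(p) + torch.log(u) - torch.log(1 - u)
            keep = torch.sigmoid(logits / self.temperature)
            return x * keep / (1 - p)     # inverted-dropout style rescaling
        return x
```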
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.