Multi-fidelity Bayesian Neural Networks: Algorithms and Applications
- URL: http://arxiv.org/abs/2012.13294v1
- Date: Sat, 19 Dec 2020 02:03:53 GMT
- Title: Multi-fidelity Bayesian Neural Networks: Algorithms and Applications
- Authors: Xuhui Meng, Hessam Babaee, and George Em Karniadakis
- Abstract summary: We propose a new class of Bayesian neural networks (BNNs) that can be trained using noisy data of variable fidelity.
We apply them to learn function approximations as well as to solve inverse problems based on partial differential equations (PDEs).
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new class of Bayesian neural networks (BNNs) that can be trained
using noisy data of variable fidelity, and we apply them to learn function
approximations as well as to solve inverse problems based on partial
differential equations (PDEs). These multi-fidelity BNNs consist of three
neural networks: The first is a fully connected neural network, which is
trained following the maximum a posteriori probability (MAP) method to fit the
low-fidelity data; the second is a Bayesian neural network, employed to capture,
with uncertainty quantification, the cross-correlation between the low- and
high-fidelity data; and the last one is a physics-informed neural network,
which encodes the physical laws described by PDEs. For the training of the last
two neural networks, we use the Hamiltonian Monte Carlo method to estimate
accurately the posterior distributions for the corresponding hyperparameters.
We demonstrate the accuracy of the present method using synthetic data as well
as real measurements. Specifically, we first approximate one- and
four-dimensional functions, and then infer the reaction rates in one- and
two-dimensional diffusion-reaction systems. Moreover, we infer the sea surface
temperature (SST) in the Massachusetts and Cape Cod Bays using satellite images
and in-situ measurements. Taken together, our results demonstrate that the
present method can capture both linear and nonlinear correlations between the
low- and high-fidelity data adaptively, identify unknown parameters in PDEs,
and quantify uncertainties in predictions, given a few scattered noisy
high-fidelity data. Finally, we demonstrate that we can effectively and
efficiently reduce the uncertainties and hence enhance the prediction accuracy
with an active learning approach, using as examples a specific one-dimensional
function approximation and an inverse PDE problem.
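To make the composition concrete, below is a minimal, hypothetical PyTorch sketch of the first two networks (all names, shapes, and data are illustrative, not the authors' code): a deterministic network fit to the low-fidelity data by MAP, where weight decay plays the role of a Gaussian prior, and a cross-correlation network whose negative log-posterior would serve as the potential energy for the HMC sampler.

```python
import torch
import torch.nn as nn

def mlp(d_in, d_out, width=32):
    return nn.Sequential(nn.Linear(d_in, width), nn.Tanh(),
                         nn.Linear(width, width), nn.Tanh(),
                         nn.Linear(width, d_out))

# Step 1: MAP fit of the deterministic low-fidelity network. With a Gaussian
# prior on the weights, MAP reduces to L2-regularized least squares, so
# weight decay plays the role of the prior.
net_L = mlp(1, 1)
opt = torch.optim.Adam(net_L.parameters(), lr=1e-3, weight_decay=1e-4)
x_L = torch.linspace(0.0, 1.0, 51).unsqueeze(-1)            # synthetic low-fidelity inputs
y_L = torch.sin(8.0 * x_L) + 0.05 * torch.randn_like(x_L)   # noisy low-fidelity observations
for _ in range(2000):
    opt.zero_grad()
    ((net_L(x_L) - y_L) ** 2).mean().backward()
    opt.step()

# Step 2: the cross-correlation network maps (x, y_L(x)) to the high-fidelity
# output. Its posterior is sampled rather than optimized, so we only define
# the negative log-posterior an HMC sampler would use as potential energy.
net_F = mlp(2, 1)

def potential_energy(x_H, y_H, noise_std=0.05, prior_std=1.0):
    yl = net_L(x_H).detach()                     # frozen low-fidelity prediction
    pred = net_F(torch.cat([x_H, yl], dim=-1))
    nll = ((pred - y_H) ** 2).sum() / (2.0 * noise_std ** 2)    # Gaussian likelihood
    neg_log_prior = sum((p ** 2).sum() for p in net_F.parameters()) / (2.0 * prior_std ** 2)
    return nll + neg_log_prior    # a PDE-residual term would enter the same potential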
Related papers
- Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks [0.5827521884806072]
Large neural networks trained on large datasets have become the dominant paradigm in machine learning.
This thesis develops scalable methods to equip neural networks with model uncertainty.
arXiv Detail & Related papers (2024-04-29T23:38:58Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom suggests that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Sparsifying Bayesian neural networks with latent binary variables and normalizing flows [10.865434331546126]
We will consider two extensions to the latent binary Bayesian neural networks (LBBNN) method.
Firstly, by using the local reparametrization trick (LRT) to sample the hidden units directly, we get a more computationally efficient algorithm.
More importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, the network learns a more flexible variational posterior distribution than the mean field Gaussian.
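For intuition, here is a minimal sketch of the local reparametrization trick for a single mean-field Gaussian linear layer; the LBBNN's latent binary inclusion variables and the normalizing-flow posterior are omitted, and all names and shapes are illustrative.

```python
import torch

def lrt_linear(x, w_mu, w_logvar):
    """Sample the pre-activations directly: one Gaussian draw per output unit
    instead of one per weight. x: (batch, d_in); w_mu, w_logvar: (d_in, d_out)."""
    act_mu = x @ w_mu                       # mean of the pre-activations
    act_var = (x ** 2) @ w_logvar.exp()     # variance of the pre-activations
    return act_mu + act_var.sqrt() * torch.randn_like(act_mu)

x = torch.randn(8, 16)
w_mu = torch.zeros(16, 4, requires_grad=True)
w_logvar = torch.full((16, 4), -4.0, requires_grad=True)
h = lrt_linear(x, w_mu, w_logvar)           # (8, 4), differentiable w.r.t. mu/logvar
```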
arXiv Detail & Related papers (2023-05-05T09:40:28Z)
- Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
- Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
arXiv Detail & Related papers (2021-12-04T18:07:47Z)
- Assessments of model-form uncertainty using Gaussian stochastic weight averaging for fluid-flow regression [0.0]
We use Gaussian stochastic weight averaging (SWAG) to assess the model-form uncertainty associated with neural-network-based function approximation relevant to fluid flows.
SWAG approximates the posterior distribution of each weight with a Gaussian, given training data and a constant learning rate.
We demonstrate the applicability of the method for two types of neural networks.
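A bare-bones, diagonal-only SWAG sketch follows (the full method also keeps a low-rank covariance term; the model, data, and hyperparameters below are placeholders):

```python
import torch

def swag_moments(model, data, loss_fn, lr=1e-2, epochs=20):
    """Run SGD with a constant learning rate and return the running mean and
    diagonal variance of the flattened weight iterates (one snapshot per epoch)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)   # constant learning rate
    n, mean, sq_mean = 0, None, None
    for _ in range(epochs):
        for x, y in data:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        flat = torch.cat([p.detach().flatten() for p in model.parameters()])
        n += 1
        mean = flat.clone() if mean is None else mean + (flat - mean) / n
        sq_mean = flat ** 2 if sq_mean is None else sq_mean + (flat ** 2 - sq_mean) / n
    return mean, (sq_mean - mean ** 2).clamp_min(1e-10)

model = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.Tanh(), torch.nn.Linear(16, 1))
data = [(torch.randn(32, 2), torch.randn(32, 1)) for _ in range(10)]
mean, var = swag_moments(model, data, torch.nn.functional.mse_loss)
theta = mean + var.sqrt() * torch.randn_like(mean)     # one posterior weight sample
```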
arXiv Detail & Related papers (2021-09-16T23:13:26Z)
- Learning Functional Priors and Posteriors from Data and Physics [3.537267195871802]
We develop a new framework based on deep neural networks that can extrapolate in space-time using historical data.
At the first stage, we employ physics-informed generative adversarial networks (PI-GANs) to learn a functional prior.
At the second stage, we employ the Hamiltonian Monte Carlo (HMC) method to estimate the posterior in the latent space of PI-GANs.
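For reference, a generic HMC step of the kind that would be run over the PI-GAN latent variables looks roughly like the following sketch; `neg_log_post` and `grad` are stand-ins for the actual posterior and its gradient:

```python
import numpy as np

def hmc_step(z, neg_log_post, grad, step=0.01, n_leapfrog=20, rng=np.random):
    """One HMC transition targeting exp(-neg_log_post(z))."""
    p = rng.standard_normal(z.shape)            # resample the momentum
    z_new = z.copy()
    p_new = p - 0.5 * step * grad(z_new)        # initial half step
    for _ in range(n_leapfrog):
        z_new = z_new + step * p_new            # leapfrog position update
        p_new = p_new - step * grad(z_new)      # leapfrog momentum update
    p_new = p_new + 0.5 * step * grad(z_new)    # undo half of the last momentum step
    h_old = neg_log_post(z) + 0.5 * (p ** 2).sum()
    h_new = neg_log_post(z_new) + 0.5 * (p_new ** 2).sum()
    return z_new if rng.uniform() < np.exp(h_old - h_new) else z   # Metropolis test

# toy target: a standard normal "latent space"
z = np.zeros(2)
for _ in range(100):
    z = hmc_step(z, lambda z: 0.5 * (z ** 2).sum(), lambda z: z)
```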
arXiv Detail & Related papers (2021-06-08T03:03:24Z)
- Physics-aware deep neural networks for surrogate modeling of turbulent natural convection [0.0]
We investigate the use of PINN surrogate modeling for turbulent Rayleigh-Bénard convection flows.
We show how it acts as a regularization close to the training boundaries, which are zones of poor accuracy for standard PINNs.
The predictive accuracy of the surrogate over the entire half-billion DNS coordinates yields errors between 0.3% and 4% in the relative L2 norm for all flow variables.
arXiv Detail & Related papers (2021-03-05T09:48:57Z)
- Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized structural equation models (SEMs).
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
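Schematically, such a min-max problem can be attacked with alternating gradient descent-ascent on two networks; the toy objective below (a quadratically regularized residual game on synthetic data) only illustrates the training pattern, not the paper's estimator:

```python
import torch
import torch.nn as nn

f = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))   # primal player (the model)
u = nn.Sequential(nn.Linear(1, 16), nn.Tanh(), nn.Linear(16, 1))   # adversarial test function
opt_f = torch.optim.SGD(f.parameters(), lr=1e-2)
opt_u = torch.optim.SGD(u.parameters(), lr=1e-2)

x = torch.rand(256, 1)
y = 2.0 * x + 0.1 * torch.randn_like(x)   # toy data with E[y - f*(x) | x] = 0

def game_value():
    # u probes the residual; the quadratic term keeps the inner max bounded
    return ((y - f(x)) * u(x)).mean() - 0.5 * (u(x) ** 2).mean()

for _ in range(1000):
    opt_u.zero_grad()
    (-game_value()).backward()   # ascent step for the adversary
    opt_u.step()
    opt_f.zero_grad()
    game_value().backward()      # descent step for the model
    opt_f.step()
```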
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed stochastic AUC maximization for large-scale problems with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds and still achieves a linear speedup in theory.
Our experiments on several datasets demonstrate the effectiveness of our algorithm and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.