Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators
- URL: http://arxiv.org/abs/2507.14652v1
- Date: Sat, 19 Jul 2025 14:57:54 GMT
- Title: Accelerating Hamiltonian Monte Carlo for Bayesian Inference in Neural Networks and Neural Operators
- Authors: Ponkrshnan Thiagarajan, Tamer A. Zaki, Michael D. Shields
- Abstract summary: Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. We propose a hybrid approach that combines inexpensive VI and accurate HMC methods to efficiently and accurately quantify uncertainties in neural networks.
- Score: 1.0923877073891446
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hamiltonian Monte Carlo (HMC) is a powerful and accurate method to sample from the posterior distribution in Bayesian inference. However, HMC techniques are computationally demanding for Bayesian neural networks due to the high dimensionality of the network's parameter space and the non-convexity of their posterior distributions. Therefore, various approximation techniques, such as variational inference (VI) or stochastic gradient MCMC, are often employed to infer the posterior distribution of the network parameters. Such approximations introduce inaccuracies in the inferred distributions, resulting in unreliable uncertainty estimates. In this work, we propose a hybrid approach that combines inexpensive VI and accurate HMC methods to efficiently and accurately quantify uncertainties in neural networks and neural operators. The proposed approach leverages an initial VI training on the full network. We examine the influence of individual parameters on the prediction uncertainty, which shows that a large proportion of the parameters do not contribute substantially to uncertainty in the network predictions. This information is then used to significantly reduce the dimension of the parameter space, and HMC is performed only for the subset of network parameters that strongly influence prediction uncertainties. This yields a framework for accelerating the full batch HMC for posterior inference in neural networks. We demonstrate the efficiency and accuracy of the proposed framework on deep neural networks and operator networks, showing that inference can be performed for large networks with tens to hundreds of thousands of parameters. We show that this method can effectively learn surrogates for complex physical systems by modeling the operator that maps from upstream conditions to wall-pressure data on a cone in hypersonic flow.
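The workflow described in the abstract (fit VI on everything, rank parameters by their influence on prediction uncertainty, run HMC only on the influential subset) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a toy linear-Gaussian model in place of a neural network, uses the closed-form mean-field Gaussian optimum as a stand-in for VI training, scores parameters by their share of predictive variance, and freezes the remaining parameters at the VI mean. The names `score`, `active`, `k`, `log_post_and_grad`, and `hmc`, as well as the step size and trajectory length, are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed setup): two influential features, four nearly inactive ones.
n, d = 200, 6
scales = np.array([1.0, 1.0, 1e-3, 1e-3, 1e-3, 1e-3])
X = rng.normal(size=(n, d)) * scales
w_true = np.array([1.5, -2.0, 0.5, 0.5, 0.5, 0.5])
noise_std, prior_std = 0.5, 1.0
y = X @ w_true + noise_std * rng.normal(size=n)

# Step 1: inexpensive mean-field Gaussian approximation of the full posterior.
# For a linear-Gaussian model the mean-field optimum is known in closed form:
# mean = exact posterior mean, variance_i = 1 / (posterior precision)_ii.
A = X.T @ X / noise_std**2 + np.eye(d) / prior_std**2
vi_mean = np.linalg.solve(A, X.T @ y / noise_std**2)
vi_var = 1.0 / np.diag(A)

# Step 2: score each parameter by its share of the predictive variance under
# the factorized approximation, Var[f(x)] = sum_i x_i^2 * var_i (assumed criterion).
score = (X**2).mean(axis=0) * vi_var
k = 2  # number of parameters kept for HMC (hypothetical choice)
active = np.sort(np.argsort(score)[-k:])

# Step 3: HMC over the active subset; all other parameters stay at the VI mean.
def log_post_and_grad(w_sub):
    w = vi_mean.copy()
    w[active] = w_sub
    resid = y - X @ w
    logp = -0.5 * resid @ resid / noise_std**2 - 0.5 * w_sub @ w_sub / prior_std**2
    grad = X[:, active].T @ resid / noise_std**2 - w_sub / prior_std**2
    return logp, grad

def hmc(q0, n_samples=1000, step=0.01, n_leapfrog=5):
    q, samples, accepted = q0.copy(), [], 0
    logp, grad = log_post_and_grad(q)
    for _ in range(n_samples):
        p = rng.normal(size=q.shape)          # resample momentum
        h_old = -logp + 0.5 * p @ p
        q_new = q.copy()
        p_new = p + 0.5 * step * grad         # initial half step in momentum
        for i in range(n_leapfrog):
            q_new = q_new + step * p_new      # full step in position
            logp_new, grad_new = log_post_and_grad(q_new)
            if i < n_leapfrog - 1:
                p_new = p_new + step * grad_new
        p_new = p_new + 0.5 * step * grad_new  # final half step in momentum
        h_new = -logp_new + 0.5 * p_new @ p_new
        if np.log(rng.uniform()) < h_old - h_new:  # Metropolis correction
            q, logp, grad = q_new, logp_new, grad_new
            accepted += 1
        samples.append(q.copy())
    return np.array(samples), accepted / n_samples

samples, accept_rate = hmc(vi_mean[active])
print("active parameters:", active, "acceptance rate:", accept_rate)
```

In this toy setting the gradient cost per leapfrog step scales with the size of the active subset rather than the full parameter count, which is the source of the speed-up the abstract describes. How the paper itself measures parameter influence and treats the excluded parameters is not stated in the abstract, so both choices above should be read as placeholders.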
Related papers
- Understanding the Trade-offs in Accuracy and Uncertainty Quantification: Architecture and Inference Choices in Bayesian Neural Networks [0.276240219662896]
Despite promising theoretical results, the properties of even the most commonly used posterior approximations are often questioned.
The dimensions of modern deep models, coupled with the lack of identifiability, make Markov chain Monte Carlo (MCMC) unable to fully explore the posterior.
Variational inference benefits from improved computational complexity but lacks the multimodal guarantees of sampling-based inference.
Stacking and ensembles of variational approximations provided comparable accuracy to MCMC at a much-reduced cost.
arXiv Detail & Related papers (2025-03-14T18:55:48Z) - On the Convergence of Locally Adaptive and Scalable Diffusion-Based Sampling Methods for Deep Bayesian Neural Network Posteriors [2.3265565167163906]
Bayesian neural networks are a promising approach for modeling uncertainties in deep neural networks.
Generating samples from the posterior distribution of neural networks is a major challenge.
One advance in that direction would be the incorporation of adaptive step sizes into Markov chain Monte Carlo sampling algorithms.
In this paper, we demonstrate that these methods can have a substantial bias in the distribution they sample, even in the limit of vanishing step sizes and at full batch size.
arXiv Detail & Related papers (2024-03-13T15:21:14Z) - A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry [22.229664343428055]
We show that the role of permutations can be meaningfully quantified by a number-of-transpositions metric.
We then show that the recently proposed rebasin method allows us to summarize HMC samples into a compact representation.
We show that this compact representation allows us to compare trained BNNs directly in weight space across sampling methods and variational inference.
arXiv Detail & Related papers (2023-12-31T23:57:05Z) - Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z) - Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z) - Bayesian deep learning framework for uncertainty quantification in high dimensions [6.282068591820945]
We develop a novel deep learning method for uncertainty quantification in partial differential equations based on a Bayesian neural network (BNN) and Hamiltonian Monte Carlo (HMC).
A BNN efficiently learns the posterior distribution of the parameters in deep neural networks by performing Bayesian inference on the network parameters.
The posterior distribution is efficiently sampled using HMC to quantify uncertainties in the system.
arXiv Detail & Related papers (2022-10-21T05:20:06Z) - Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
arXiv Detail & Related papers (2022-07-04T15:41:02Z) - NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show a principled way to measure the uncertainty of predictions for a classifier based on the Nadaraya-Watson nonparametric estimate of the conditional label distribution.
We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z) - Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Multi-fidelity Bayesian Neural Networks: Algorithms and Applications [0.0]
We propose a new class of Bayesian neural networks (BNNs) that can be trained using noisy data of variable fidelity.
We apply them to learn function approximations as well as to solve inverse problems based on partial differential equations (PDEs).
arXiv Detail & Related papers (2020-12-19T02:03:53Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.