Related papers: Hessian-Free Laplace in Bayesian Deep Learning

Hessian-Free Laplace in Bayesian Deep Learning

URL: http://arxiv.org/abs/2403.10671v1
Date: Fri, 15 Mar 2024 20:47:39 GMT
Title: Hessian-Free Laplace in Bayesian Deep Learning
Authors: James McInerney, Nathan Kallus,
Abstract summary: Hessian-free Laplace (HFL) approximation uses curvature of both the log posterior and network prediction to estimate its variance. We show that, under standard assumptions of LA in Bayesian deep learning, HFL targets the same variance as LA, and can be efficiently amortized in a pre-trained network.
Score: 44.16006844888796
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The Laplace approximation (LA) of the Bayesian posterior is a Gaussian distribution centered at the maximum a posteriori estimate. Its appeal in Bayesian deep learning stems from the ability to quantify uncertainty post-hoc (i.e., after standard network parameter optimization), the ease of sampling from the approximate posterior, and the analytic form of model evidence. However, an important computational bottleneck of LA is the necessary step of calculating and inverting the Hessian matrix of the log posterior. The Hessian may be approximated in a variety of ways, with quality varying with a number of factors including the network, dataset, and inference task. In this paper, we propose an alternative framework that sidesteps Hessian calculation and inversion. The Hessian-free Laplace (HFL) approximation uses curvature of both the log posterior and network prediction to estimate its variance. Only two point estimates are needed: the standard maximum a posteriori parameter and the optimal parameter under a loss regularized by the network prediction. We show that, under standard assumptions of LA in Bayesian deep learning, HFL targets the same variance as LA, and can be efficiently amortized in a pre-trained network. Experiments demonstrate comparable performance to that of exact and approximate Hessians, with excellent coverage for in-between uncertainty.

Related papers

Enhancing Uncertainty Estimation and Interpretability via Bayesian Non-negative Decision Layer [55.66973223528494]
We develop a Bayesian Non-negative Decision Layer (BNDL), which reformulates deep neural networks as a conditional Bayesian non-negative factor analysis.<n>BNDL can model complex dependencies and provide robust uncertainty estimation.<n>We also offer theoretical guarantees that BNDL can achieve effective disentangled learning.
arXiv Detail & Related papers (2025-05-28T10:23:34Z)
Cooperative Bayesian and variance networks disentangle aleatoric and epistemic uncertainties [0.0]
Real-world data contains aleatoric uncertainty - irreducible noise arising from imperfect measurements or from incomplete knowledge about the data generation process.<n>Mean variance estimation (MVE) networks can learn this type of uncertainty but require ad-hoc regularization strategies to avoid overfitting.<n>We propose to train a variance network with a Bayesian neural network and demonstrate that the resulting model disentangles aleatoric and epistemic uncertainties while improving the mean estimation.
arXiv Detail & Related papers (2025-05-05T15:50:52Z)
Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP) For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z)
Confidence Intervals and Simultaneous Confidence Bands Based on Deep Learning [0.36832029288386137]
We provide a valid non-parametric bootstrap method that correctly disentangles data uncertainty from the noise inherent in the adopted optimization algorithm. The proposed ad-hoc method can be easily integrated into any deep neural network without interfering with the training process.
arXiv Detail & Related papers (2024-06-20T05:51:37Z)
Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model. By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation. It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z)
Uncertainty Estimation by Fisher Information-based Evidential Deep Learning [61.94125052118442]
Uncertainty estimation is a key factor that makes deep learning reliable in practical applications. We propose a novel method, Fisher Information-based Evidential Deep Learning ($mathcalI$-EDL) In particular, we introduce Fisher Information Matrix (FIM) to measure the informativeness of evidence carried by each sample, according to which we can dynamically reweight the objective loss terms to make the network more focused on the representation learning of uncertain classes.
arXiv Detail & Related papers (2023-03-03T16:12:59Z)
Model-Based Uncertainty in Value Functions [89.31922008981735]
We focus on characterizing the variance over values induced by a distribution over MDPs. Previous work upper bounds the posterior variance over values by solving a so-called uncertainty Bellman equation. We propose a new uncertainty Bellman equation whose solution converges to the true posterior variance over values.
arXiv Detail & Related papers (2023-02-24T09:18:27Z)
Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning. We present a reformulation of the log-marginal likelihood of a NN with BLL which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
NUQ: Nonparametric Uncertainty Quantification for Deterministic Neural Networks [151.03112356092575]
We show the principled way to measure the uncertainty of predictions for a classifier based on Nadaraya-Watson's nonparametric estimate of the conditional label distribution. We demonstrate the strong performance of the method in uncertainty estimation tasks on a variety of real-world image datasets.
arXiv Detail & Related papers (2022-02-07T12:30:45Z)
Gradient-Based Quantification of Epistemic Uncertainty for Deep Object Detectors [8.029049649310213]
We introduce novel gradient-based uncertainty metrics and investigate them for different object detection architectures. Experiments show significant improvements in true positive / false positive discrimination and prediction of intersection over union. We also find improvement over Monte-Carlo dropout uncertainty metrics and further significant boosts by aggregating different sources of uncertainty metrics.
arXiv Detail & Related papers (2021-07-09T16:04:11Z)
Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference. Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures. We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z)
Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference. Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
Variational Laplace for Bayesian neural networks [25.055754094939527]
Variational Laplace exploits a local approximation of the likelihood to estimate the ELBO without the need for sampling the neural-network weights. We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
arXiv Detail & Related papers (2021-02-27T14:06:29Z)
The Bayesian Method of Tensor Networks [1.7894377200944511]
We study the Bayesian framework of the Network from two perspective. We study the Bayesian properties of the Network by visualizing the parameters of the model and the decision boundaries in the two dimensional synthetic data set.
arXiv Detail & Related papers (2021-01-01T14:59:15Z)
Uncertainty-Aware Deep Calibrated Salient Object Detection [74.58153220370527]
Existing deep neural network based salient object detection (SOD) methods mainly focus on pursuing high network accuracy. These methods overlook the gap between network accuracy and prediction confidence, known as the confidence uncalibration problem. We introduce an uncertaintyaware deep SOD network, and propose two strategies to prevent deep SOD networks from being overconfident.
arXiv Detail & Related papers (2020-12-10T23:28:36Z)
Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs [29.84563789289183]
Uncertainty quantification is crucial for building reliable and trustable machine learning systems. We propose to estimate uncertainty in recurrent neural networks (RNNs) via discrete state transitions over recurrent timesteps. The uncertainty of the model can be quantified by running a prediction several times, each time sampling from the recurrent state transition distribution.
arXiv Detail & Related papers (2020-11-24T10:35:28Z)
Variational Laplace for Bayesian neural networks [33.46810568687292]
We develop variational Laplace for Bayesian neural networks (BNNs) We exploit a local approximation of the curvature of the likelihood to estimate the ELBO without the need for sampling the neural-network weights. We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
arXiv Detail & Related papers (2020-11-20T15:16:18Z)
Bayesian Deep Learning via Subnetwork Inference [2.2835610890984164]
We show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets.
arXiv Detail & Related papers (2020-10-28T01:10:11Z)
Uncertainty Quantification in Deep Residual Neural Networks [0.0]
Uncertainty quantification is an important and challenging problem in deep learning. Previous methods rely on dropout layers which are not present in modern deep architectures or batch normalization which is sensitive to batch sizes. We show that training residual networks using depth can be interpreted as a variational approximation to the posterior weights in neural networks.
arXiv Detail & Related papers (2020-07-09T16:05:37Z)
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for a uncertainty on a ReLU network is "to be a bit Bayesian calibrated" We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
arXiv Detail & Related papers (2020-02-24T08:52:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.