Variational Laplace for Bayesian neural networks
- URL: http://arxiv.org/abs/2103.00222v1
- Date: Sat, 27 Feb 2021 14:06:29 GMT
- Title: Variational Laplace for Bayesian neural networks
- Authors: Ali Unlu, Laurence Aitchison
- Abstract summary: Variational Laplace exploits a local approximation of the likelihood to estimate the ELBO without the need for sampling the neural-network weights.
We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
- Score: 25.055754094939527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop variational Laplace for Bayesian neural networks (BNNs) which
exploits a local approximation of the curvature of the likelihood to estimate
the ELBO without the need for stochastic sampling of the neural-network
weights. Variational Laplace performs better on image classification tasks than
MAP inference and far better than standard variational inference with
stochastic sampling despite using the same mean-field Gaussian approximate
posterior. The Variational Laplace objective is simple to evaluate, as it is
(in essence) the log-likelihood, plus weight-decay, plus a squared-gradient
regularizer. Finally, we emphasise care needed in benchmarking standard VI as
there is a risk of stopping before the variance parameters have converged. We
show that early-stopping can be avoided by increasing the learning rate for the
variance parameters.
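The abstract characterises the objective as, in essence, the log-likelihood plus weight-decay plus a squared-gradient regularizer, trained with a larger learning rate on the variance parameters. The sketch below is a minimal illustration of that recipe, not the authors' implementation: the model architecture, the log_sigma2 parameters, prior_var, and the exact weighting of the terms are assumptions for a mean-field Gaussian posterior in PyTorch.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed setup: a mean-field Gaussian posterior with one log-variance per weight;
# the mean weights live inside `model` itself.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
log_sigma2 = {name: nn.Parameter(torch.full_like(p, -6.0))
              for name, p in model.named_parameters()}
prior_var = 1.0  # assumed prior variance; weight-decay strength is 1 / (2 * prior_var)

def neg_elbo(x, y):
    """Negative-ELBO-style loss: NLL + weight decay + variance-weighted squared gradients."""
    nll = F.cross_entropy(model(x), y, reduction="sum")
    # Squared gradients of the NLL at the mean weights stand in for the local curvature,
    # so no weight samples are drawn (the "squared-gradient regularizer" of the abstract).
    grads = torch.autograd.grad(nll, list(model.parameters()), create_graph=True)
    curvature = sum((ls2.exp() * g.pow(2)).sum()
                    for ls2, g in zip(log_sigma2.values(), grads))
    weight_decay = sum(p.pow(2).sum() for p in model.parameters()) / (2 * prior_var)
    # Schematic penalty tying the variances to the prior (stands in for the KL's variance terms).
    var_penalty = 0.5 * sum((ls2.exp() / prior_var - ls2).sum() for ls2 in log_sigma2.values())
    return nll + weight_decay + 0.5 * curvature + var_penalty

# The abstract's remedy for slow variance convergence: give the variance parameters a
# larger learning rate than the means (both values are placeholders).
optimizer = torch.optim.Adam([
    {"params": model.parameters(), "lr": 1e-3},
    {"params": list(log_sigma2.values()), "lr": 1e-2},
])
```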
Related papers
- Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - Improved uncertainty quantification for neural networks with Bayesian
last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a NN with BLL which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Differentially private training of neural networks with Langevin
dynamics for calibrated predictive uncertainty [58.730520380312676]
We show that differentially private stochastic gradient descent (DP-SGD) can yield poorly calibrated, overconfident deep learning models.
This represents a serious issue for safety-critical applications, e.g. in medical diagnosis.
arXiv Detail & Related papers (2021-07-09T08:14:45Z) - Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Variational Laplace for Bayesian neural networks [33.46810568687292]
We develop variational Laplace for Bayesian neural networks (BNNs).
We exploit a local approximation of the curvature of the likelihood to estimate the ELBO without the need for sampling the neural-network weights.
We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
arXiv Detail & Related papers (2020-11-20T15:16:18Z) - Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation (a sketch of the linearized predictive follows after this list).
arXiv Detail & Related papers (2020-08-19T12:35:55Z) - Bayesian Neural Network via Stochastic Gradient Descent [0.0]
We show how gradient estimation techniques can be applied to Bayesian neural networks.
Our approach considerably outperforms previous state-of-the-art approaches for regression with Bayesian neural networks.
arXiv Detail & Related papers (2020-06-04T18:33:59Z) - The k-tied Normal Distribution: A Compact Parameterization of Gaussian
Mean Field Posteriors in Bayesian Neural Networks [46.677567663908185]
Variational Bayesian inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights.
Recent work has explored ever richer parameterizations of the approximate posterior in the hope of improving performance.
We find that by decomposing these variational parameters into a low-rank factorization, we can make our variational approximation more compact without decreasing the models' performance (a sketch of this low-rank tying follows after this list).
arXiv Detail & Related papers (2020-02-07T07:33:15Z)
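For the local-linearization entry above ("Improving predictions of Bayesian neural nets via local linearization"), the "GLM predictive" can be sketched roughly as follows: predictions use the network linearized around the posterior mean, so posterior weight samples enter only through a Jacobian-vector product. The sketch assumes PyTorch 2.x (torch.func) and a diagonal Gaussian posterior; glm_predictive, mean_params, and std_params are illustrative names, not the paper's code.
```python
import torch
from torch.func import functional_call, jvp

def glm_predictive(model, x, mean_params, std_params, n_samples=10):
    """Monte-Carlo 'GLM' predictive: outputs of the model linearized at the posterior mean."""
    samples = []
    for _ in range(n_samples):
        # Draw a weight-space perturbation (w - mu) from the diagonal Gaussian posterior.
        delta = {name: std_params[name] * torch.randn_like(mu)
                 for name, mu in mean_params.items()}
        # jvp returns f(x, mu) and the directional derivative J(x) @ (w - mu) in one pass.
        f_mu, jf = jvp(lambda params: functional_call(model, params, (x,)),
                       (mean_params,), (delta,))
        samples.append(f_mu + jf)
    return torch.stack(samples)  # shape: [n_samples, batch, num_outputs]

# Illustrative usage (names assumed):
#   mean_params = {n: p.detach() for n, p in model.named_parameters()}
#   std_params  = {n: 0.05 * torch.ones_like(p) for n, p in model.named_parameters()}
```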
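Similarly, for "The k-tied Normal Distribution" entry, one plausible reading of the low-rank factorization of the variational parameters is a linear layer whose matrix of per-weight standard deviations has rank k. This is a hedged sketch under that reading; KTiedGaussianLinear and its initialization constants are assumptions rather than the paper's exact parameterization.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KTiedGaussianLinear(nn.Module):
    """Mean-field Gaussian linear layer whose matrix of standard deviations has rank k."""
    def __init__(self, in_features, out_features, k=2):
        super().__init__()
        self.mean = nn.Parameter(0.05 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Rank-k factors tying the per-weight standard deviations together:
        # std[i, j] = sum_l softplus(u)[i, l] * softplus(v)[j, l].
        self.u = nn.Parameter(0.1 * torch.randn(out_features, k))
        self.v = nn.Parameter(0.1 * torch.randn(in_features, k))

    def forward(self, x):
        std = F.softplus(self.u) @ F.softplus(self.v).T          # (out, in), rank <= k
        weight = self.mean + std * torch.randn_like(self.mean)   # reparameterised sample
        return F.linear(x, weight, self.bias)

# With k much smaller than min(in_features, out_features), each layer stores
# (in + out) * k variance parameters instead of in * out, which is the compactness
# claimed in the summary above.
```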