Improving predictions of Bayesian neural nets via local linearization
- URL: http://arxiv.org/abs/2008.08400v3
- Date: Thu, 25 Feb 2021 17:59:47 GMT
- Title: Improving predictions of Bayesian neural nets via local linearization
- Authors: Alexander Immer, Maciej Korzepa, Matthias Bauer
- Abstract summary: We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
- Score: 79.21517734364093
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The generalized Gauss-Newton (GGN) approximation is often used to make
practical Bayesian deep learning approaches scalable by replacing a second
order derivative with a product of first order derivatives. In this paper we
argue that the GGN approximation should be understood as a local linearization
of the underlying Bayesian neural network (BNN), which turns the BNN into a
generalized linear model (GLM). Because we use this linearized model for
posterior inference, we should also predict using this modified model instead
of the original one. We refer to this modified predictive as "GLM predictive"
and show that it effectively resolves common underfitting problems of the
Laplace approximation. It extends previous results in this vein to general
likelihoods and has an equivalent Gaussian process formulation, which enables
alternative inference schemes for BNNs in function space. We demonstrate the
effectiveness of our approach on several standard classification datasets as
well as on out-of-distribution detection. We provide an implementation at
https://github.com/AlexImmer/BNN-predictions.
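The linearization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see their repository for that). It assumes a Gaussian approximate posterior N(theta_star, cov) over the weights is already available, and it predicts with the linearized model f_lin(x, theta) = f(x, theta_star) + J(x)(theta - theta_star) instead of the original network. The toy network `mlp`, the finite-difference `jacobian` helper, and the Monte Carlo averaging in `glm_predictive` are all illustrative assumptions.

```python
import numpy as np


def mlp(x, theta, hidden=4, n_out=3):
    """Toy network f(x, theta): one tanh hidden layer, returns logits (hypothetical)."""
    d = x.shape[0]
    w1 = theta[: d * hidden].reshape(hidden, d)
    w2 = theta[d * hidden:].reshape(n_out, hidden)
    return w2 @ np.tanh(w1 @ x)


def jacobian(f, x, theta, eps=1e-6):
    """Central finite-difference Jacobian of f(x, .) at theta, shape (n_out, n_params).

    A stand-in for automatic differentiation, used here to keep the sketch NumPy-only.
    """
    cols = []
    for i in range(theta.size):
        step = np.zeros_like(theta)
        step[i] = eps
        cols.append((f(x, theta + step) - f(x, theta - step)) / (2 * eps))
    return np.stack(cols, axis=1)


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def glm_predictive(f, x, theta_star, cov, n_samples=1000, seed=0):
    """Monte Carlo class probabilities under the linearized (GLM) model.

    Weights are sampled from the assumed Gaussian posterior N(theta_star, cov)
    and pushed through the linearization of f around theta_star.
    """
    rng = np.random.default_rng(seed)
    thetas = rng.multivariate_normal(theta_star, cov, size=n_samples)
    J = jacobian(f, x, theta_star)                            # (n_out, n_params)
    logits = f(x, theta_star) + (thetas - theta_star) @ J.T   # (n_samples, n_out)
    return softmax(logits).mean(axis=0)


# Usage with made-up weights and an isotropic posterior covariance (assumptions).
rng = np.random.default_rng(1)
x = rng.normal(size=2)
theta_star = rng.normal(size=2 * 4 + 3 * 4)
cov = 0.1 * np.eye(theta_star.size)
probs = glm_predictive(mlp, x, theta_star, cov)
```

Replacing the linearized logits with `f(x, theta)` evaluated at each sampled `theta` would instead give the predictive of the original network, which is the variant the abstract argues against when the posterior was obtained with the GGN approximation.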
Related papers
- Sparsifying Bayesian neural networks with latent binary variables and normalizing flows [10.865434331546126]
We will consider two extensions to the latent binary Bayesian neural networks (LBBNN) method.
Firstly, by using the local reparametrization trick (LRT) to sample the hidden units directly, we get a more computationally efficient algorithm.
More importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, the network learns a more flexible variational posterior distribution than the mean field Gaussian.
arXiv Detail & Related papers (2023-05-05T09:40:28Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP).
Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN.
It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
- Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a neural network (NN) with a Bayesian last layer (BLL), which allows for efficient training using backpropagation.
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
- Variational Laplace for Bayesian neural networks [25.055754094939527]
Variational Laplace exploits a local approximation of the likelihood to estimate the ELBO without the need for sampling the neural-network weights.
We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
arXiv Detail & Related papers (2021-02-27T14:06:29Z)
- Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit [47.324627920761685]
We use recent theoretical advances that characterize the function-space prior of an ensemble of infinitely-wide NNs as a Gaussian process.
This gives us a better understanding of the implicit prior NNs place on function space.
We also examine the calibration of previous approaches to classification with the NNGP.
arXiv Detail & Related papers (2020-10-14T18:41:54Z)
- Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
arXiv Detail & Related papers (2020-10-01T17:51:19Z)
- Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks [96.87076679064499]
We disentangle the generalized Gauss-Newton and approximate inference for Bayesian deep learning.
We find that the Gauss-Newton method simplifies the underlying probabilistic model significantly.
The connection to Gaussian processes enables new function-space inference algorithms.
arXiv Detail & Related papers (2020-07-21T17:42:58Z)
- Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case [93.37576644429578]
Graph neural networks (GNNs) have made great progress recently on learning from graph-structured data in practice.
We provide a theoretically-grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems.
arXiv Detail & Related papers (2020-06-25T00:45:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.