Bayesian Deep Learning via Subnetwork Inference
- URL: http://arxiv.org/abs/2010.14689v4
- Date: Mon, 14 Mar 2022 13:46:12 GMT
- Title: Bayesian Deep Learning via Subnetwork Inference
- Authors: Erik Daxberger, Eric Nalisnick, James Urquhart Allingham, Javier
Antorán, José Miguel Hernández-Lobato
- Abstract summary: We show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors.
This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets.
- Score: 2.2835610890984164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Bayesian paradigm has the potential to solve core issues of deep neural
networks such as poor calibration and data inefficiency. Alas, scaling Bayesian
inference to large weight spaces often requires restrictive approximations. In
this work, we show that it suffices to perform inference over a small subset of
model weights in order to obtain accurate predictive posteriors. The other
weights are kept as point estimates. This subnetwork inference framework
enables us to use expressive, otherwise intractable, posterior approximations
over such subsets. In particular, we implement subnetwork linearized Laplace as
a simple, scalable Bayesian deep learning method: We first obtain a MAP
estimate of all weights and then infer a full-covariance Gaussian posterior
over a subnetwork using the linearized Laplace approximation. We propose a
subnetwork selection strategy that aims to maximally preserve the model's
predictive uncertainty. Empirically, our approach compares favorably to
ensembles and less expressive posterior approximations over full networks. Our
proposed subnetwork (linearized) Laplace method is implemented within the
laplace PyTorch library at https://github.com/AlexImmer/Laplace.
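For concreteness, below is a minimal sketch of how the subnetwork Laplace variant can be invoked through the laplace library linked above. The class and argument names (Laplace, LargestMagnitudeSubnetMask, subset_of_weights='subnetwork', hessian_structure='full') are paraphrased from the library's documented interface and may differ across versions; the largest-magnitude mask used here is only one of the library's selection options, not necessarily the paper's uncertainty-preserving strategy, and the toy model and data are placeholders.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from laplace import Laplace
from laplace.utils import LargestMagnitudeSubnetMask

# Toy stand-ins for a real MAP-trained network and dataset.
torch.manual_seed(0)
X = torch.randn(256, 20)
y = torch.randint(0, 3, (256,))
train_loader = DataLoader(TensorDataset(X, y), batch_size=32)

model = nn.Sequential(nn.Linear(20, 50), nn.ReLU(), nn.Linear(50, 3))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(50):                                   # quick MAP fit
    for xb, yb in train_loader:
        opt.zero_grad()
        nn.functional.cross_entropy(model(xb), yb).backward()
        opt.step()

# 1. Select a small subnetwork (here: the 128 largest-magnitude weights;
#    the paper's own strategy instead targets predictive-uncertainty preservation).
subnetwork_indices = LargestMagnitudeSubnetMask(model, n_params_subnet=128).select()

# 2. Fit a full-covariance Laplace approximation over the subnetwork only;
#    all remaining weights stay at their MAP values.
la = Laplace(model, 'classification',
             subset_of_weights='subnetwork',
             hessian_structure='full',
             subnetwork_indices=subnetwork_indices)
la.fit(train_loader)

# 3. Approximate predictive class probabilities for new inputs.
probs = la(torch.randn(8, 20))
print(probs.shape)   # expected: torch.Size([8, 3])
```

Because only the selected weights receive a full covariance, the quadratic cost of the full-covariance Laplace applies to the subnetwork rather than to the whole weight space.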
Related papers
- Posterior and variational inference for deep neural networks with heavy-tailed weights [0.0]
We consider deep neural networks in a Bayesian framework with a prior distribution sampling the network weights at random.
We show that the corresponding posterior distribution achieves near-optimal minimax contraction rates.
We also provide variational Bayes counterparts of the results, that show that mean-field variational approximations still benefit from near-optimal theoretical support.
arXiv Detail & Related papers (2024-06-05T15:24:20Z)
- Hessian-Free Laplace in Bayesian Deep Learning [44.16006844888796]
The Hessian-free Laplace (HFL) approximation uses the curvature of both the log posterior and the network prediction to estimate the variance of the prediction.
We show that, under standard assumptions of LA in Bayesian deep learning, HFL targets the same variance as LA, and can be efficiently amortized in a pre-trained network.
arXiv Detail & Related papers (2024-03-15T20:47:39Z)
- Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of a neural network (NN) with a Bayesian last layer (BLL), which allows for efficient training using backpropagation.
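As a generic illustration of the Bayesian-last-layer idea (not the paper's specific log-marginal-likelihood reformulation), the sketch below fixes the network's penultimate features and computes the closed-form Gaussian posterior over the last-layer weights, i.e. Bayesian linear regression on learned features; the synthetic features and the fixed noise/prior variances are assumptions for illustration.

```python
import torch

def bll_posterior(features, targets, noise_var=0.1, prior_var=1.0):
    """Closed-form Gaussian posterior over last-layer weights
    (Bayesian linear regression on fixed NN features); illustrative only."""
    d = features.shape[1]
    precision = features.T @ features / noise_var + torch.eye(d) / prior_var
    cov = torch.linalg.inv(precision)
    mean = cov @ features.T @ targets / noise_var
    return mean, cov

# Toy example: "features" from a frozen feature extractor and scalar targets.
torch.manual_seed(0)
phi = torch.randn(100, 16)            # penultimate-layer features
y = torch.randn(100, 1)               # regression targets
mean, cov = bll_posterior(phi, y)

# Predictive mean and variance at a new input's features.
phi_star = torch.randn(1, 16)
pred_mean = phi_star @ mean
pred_var = phi_star @ cov @ phi_star.T + 0.1   # plus observation noise
```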
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
- Content Popularity Prediction Based on Quantized Federated Bayesian Learning in Fog Radio Access Networks [76.16527095195893]
We investigate the content popularity prediction problem in cache-enabled fog radio access networks (F-RANs).
In order to predict the content popularity with high accuracy and low complexity, we propose a Gaussian process based regressor to model the content request pattern.
We utilize Bayesian learning to train the model parameters, which is robust to overfitting.
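As a generic illustration of the Gaussian-process regression component only (the quantized federated learning aspect is not reproduced here), a minimal scikit-learn sketch on assumed synthetic request data might look as follows; the kernel hyperparameters are fit by marginal-likelihood maximization, which is the Bayesian training step that helps guard against overfitting.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Synthetic stand-in for a content-request pattern over time (assumed data).
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 50).reshape(-1, 1)
requests = np.sin(t).ravel() + 0.1 * rng.standard_normal(50)

# GP regressor with an RBF + noise kernel; hyperparameters are optimized
# via the log marginal likelihood during fit().
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(t, requests)

# Predictive mean and uncertainty for future time steps.
t_new = np.linspace(0, 12, 100).reshape(-1, 1)
mean, std = gp.predict(t_new, return_std=True)
```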
arXiv Detail & Related papers (2022-06-23T03:05:12Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning in large-scale machine learning to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- The Bayesian Method of Tensor Networks [1.7894377200944511]
We study the Bayesian framework of the Tensor Network from two perspectives.
We study the Bayesian properties of the Tensor Network by visualizing the parameters of the model and the decision boundaries on a two-dimensional synthetic data set.
arXiv Detail & Related papers (2021-01-01T14:59:15Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
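In symbols, the local linearization behind the GLM predictive replaces the network f(x, θ) by its first-order expansion around the MAP estimate (standard form of the linearized Laplace approximation, stated here for orientation):

```latex
f_{\mathrm{lin}}(x, \theta)
  = f(x, \theta_{\mathrm{MAP}})
  + J_{\theta_{\mathrm{MAP}}}(x)\,(\theta - \theta_{\mathrm{MAP}}),
\qquad
J_{\theta_{\mathrm{MAP}}}(x)
  = \left.\frac{\partial f(x, \theta)}{\partial \theta}\right|_{\theta = \theta_{\mathrm{MAP}}}.
```

With a Gaussian posterior N(θ_MAP, Σ) over the weights, the linearized outputs are Gaussian with mean f(x, θ_MAP) and covariance J(x) Σ J(x)^T, which is also the predictive used by the subnetwork Laplace method above (with Σ restricted to the chosen subnetwork).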
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Disentangling the Gauss-Newton Method and Approximate Inference for Neural Networks [96.87076679064499]
We disentangle the generalized Gauss-Newton and approximate inference for Bayesian deep learning.
We find that the Gauss-Newton method simplifies the underlying probabilistic model significantly.
The connection to Gaussian processes enables new function-space inference algorithms.
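For orientation, the generalized Gauss-Newton (GGN) matrix referred to here is the standard curvature approximation that drops second-order network terms; J_n is the network Jacobian at the n-th input and Λ_n is the Hessian of the loss with respect to the network output:

```latex
\mathrm{GGN}(\theta) = \sum_{n=1}^{N} J_n^{\top} \Lambda_n\, J_n,
\qquad
J_n = \frac{\partial f(x_n, \theta)}{\partial \theta},
\qquad
\Lambda_n = \nabla^2_{f}\, \ell\big(y_n, f(x_n, \theta)\big).
```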
arXiv Detail & Related papers (2020-07-21T17:42:58Z)
- Fast Predictive Uncertainty for Classification with Bayesian Deep Networks [25.821401066200504]
In Bayesian Deep Learning, distributions over the output of classification neural networks are approximated by first constructing a Gaussian distribution over the weights, then sampling from it to receive a distribution over the softmax outputs.
We construct a Dirichlet approximation of this softmax output distribution, which yields an analytic map between Gaussian distributions in logit space and Dirichlet distributions in the output space.
We demonstrate that the resulting Dirichlet distribution has multiple advantages, in particular more efficient computation of the uncertainty estimate and scaling to large datasets and networks like ImageNet and DenseNet.
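For contrast, the sampling-based construction described in the first sentence of this summary can be sketched as a Monte-Carlo average in logit space (a generic baseline, not the paper's analytic Dirichlet map; the Gaussian over the logits is assumed to be given, e.g. from a linearized weight posterior):

```python
import torch

# Assumed: a Gaussian over the logits of a single input (mean and covariance),
# e.g. obtained by pushing a Gaussian weight posterior through the network.
num_classes = 5
logit_mean = torch.randn(num_classes)
logit_cov = 0.5 * torch.eye(num_classes)

# Monte-Carlo predictive: sample logits, apply softmax, average the samples.
dist = torch.distributions.MultivariateNormal(logit_mean, logit_cov)
logit_samples = dist.sample((10_000,))            # (S, K) samples
probs = torch.softmax(logit_samples, dim=-1)      # per-sample class probabilities
predictive = probs.mean(dim=0)                    # MC estimate the Dirichlet map replaces
```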
arXiv Detail & Related papers (2020-03-02T22:29:03Z)