Sparsifying Bayesian neural networks with latent binary variables and
normalizing flows
- URL: http://arxiv.org/abs/2305.03395v1
- Date: Fri, 5 May 2023 09:40:28 GMT
- Title: Sparsifying Bayesian neural networks with latent binary variables and
normalizing flows
- Authors: Lars Skaaret-Lund, Geir Storvik, Aliaksandr Hubin
- Abstract summary: We will consider two extensions to the latent binary Bayesian neural networks (LBBNN) method.
Firstly, by using the local reparametrization trick (LRT) to sample the hidden units directly, we get a more computationally efficient algorithm.
More importantly, by using normalizing flows on the variational posterior distribution of the LBBNN parameters, the network learns a more flexible variational posterior distribution than the mean field Gaussian.
- Score: 10.865434331546126
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Artificial neural networks (ANNs) are powerful machine learning methods used
in many modern applications such as facial recognition, machine translation,
and cancer diagnostics. A common issue with ANNs is that they usually have
millions or billions of trainable parameters, and therefore tend to overfit to
the training data. This is especially problematic in applications where it is
important to have reliable uncertainty estimates. Bayesian neural networks
(BNN) can improve on this, since they incorporate parameter uncertainty. In
addition, latent binary Bayesian neural networks (LBBNN) also take into account
structural uncertainty by allowing the weights to be turned on or off, enabling
inference in the joint space of weights and structures. In this paper, we
consider two extensions to the LBBNN method. Firstly, by using the local
reparametrization trick (LRT) to sample the hidden units directly, we get a
more computationally efficient algorithm. More importantly, by using
normalizing flows on the variational posterior distribution of the LBBNN
parameters, the network learns a more flexible variational posterior
distribution than the mean-field Gaussian. Experimental results show that this
improves predictive power compared to the LBBNN method, while also obtaining
sparser networks. We perform two simulation studies. In the first study, we
consider variable selection in a logistic regression setting, where the more
flexible variational distribution leads to improved results. In the second
study, we compare predictive uncertainty based on data generated from
two-dimensional Gaussian distributions. Here, we argue that our Bayesian
methods lead to more realistic estimates of predictive uncertainty.
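
To make the first extension concrete, the sketch below shows what a single LBBNN linear layer using the LRT could look like in PyTorch. It is a minimal illustration based only on the abstract's description, not the authors' implementation; the class and parameter names are hypothetical. Each weight has a Gaussian variational posterior and a Bernoulli inclusion probability, and the LRT samples the pre-activations (hidden units) directly from the resulting mean and variance.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LBBNNLinearLRT(nn.Module):
    """Hypothetical LBBNN linear layer using the local reparametrization trick (LRT)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        # Gaussian variational posterior N(mu, sigma^2) for each weight
        self.mu = nn.Parameter(0.1 * torch.randn(in_features, out_features))
        self.rho = nn.Parameter(-5.0 * torch.ones(in_features, out_features))  # sigma = softplus(rho)
        # Logit of the Bernoulli inclusion probability alpha for each weight (structural uncertainty)
        self.alpha_logit = nn.Parameter(torch.zeros(in_features, out_features))

    def forward(self, x):
        sigma2 = F.softplus(self.rho) ** 2
        alpha = torch.sigmoid(self.alpha_logit)
        # Effective weight is gamma * w with gamma ~ Bernoulli(alpha), w ~ N(mu, sigma^2), so
        #   E[gamma * w]   = alpha * mu
        #   Var[gamma * w] = alpha * sigma^2 + alpha * (1 - alpha) * mu^2
        mean = x @ (alpha * self.mu)
        var = (x ** 2) @ (alpha * sigma2 + alpha * (1 - alpha) * self.mu ** 2)
        # LRT: sample the pre-activations (hidden units) directly instead of the weights
        return mean + torch.sqrt(var + 1e-8) * torch.randn_like(mean)
```

A complete method would also add the KL divergence between the variational posterior and the prior to the objective. The paper's second extension enriches this mean-field posterior with normalizing flows; a generic flow step of the kind that could be used is sketched further down, next to the entry on multiplicative normalizing flows.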
Related papers
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom suggests that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Improved uncertainty quantification for neural networks with Bayesian last layer [0.0]
Uncertainty quantification is an important task in machine learning.
We present a reformulation of the log-marginal likelihood of an NN with a Bayesian last layer (BLL), which allows for efficient training using backpropagation.
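
The Bayesian-last-layer idea lends itself to a compact sketch: keep the network deterministic, place a Gaussian prior on the last-layer weights, and train by maximizing the log-marginal likelihood (evidence) of that linear layer with backpropagation. The function below is a generic, textbook version of this evidence for regression, given only as an illustration of the general setup and not as the paper's specific reformulation; `phi` stands for the features produced by the network.

```python
import math
import torch

def bll_log_marginal_likelihood(phi, y, log_alpha, log_beta):
    """Log-evidence of a Bayesian linear last layer with prior w ~ N(0, alpha^-1 I)
    and Gaussian noise precision beta, given features phi (N x D) and targets y (N)."""
    alpha, beta = log_alpha.exp(), log_beta.exp()
    n, d = phi.shape
    A = alpha * torch.eye(d, dtype=phi.dtype) + beta * phi.T @ phi  # posterior precision
    m = beta * torch.linalg.solve(A, phi.T @ y)                     # posterior mean
    resid = y - phi @ m
    e_m = 0.5 * beta * resid @ resid + 0.5 * alpha * m @ m
    return (0.5 * d * torch.log(alpha) + 0.5 * n * torch.log(beta)
            - e_m - 0.5 * torch.logdet(A) - 0.5 * n * math.log(2 * math.pi))
```

Because the expression is differentiable in `phi`, `log_alpha`, and `log_beta`, it can be maximized jointly with the network parameters by standard backpropagation.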
arXiv Detail & Related papers (2023-02-21T20:23:56Z)
- Variational Inference on the Final-Layer Output of Neural Networks [3.146069168382982]
This paper proposes to combine the advantages of both approaches by performing Variational Inference in the Final layer Output space (VIFO).
We use neural networks to learn the mean and the variance of the probabilistic output.
Experiments show that VIFO provides a good tradeoff in terms of run time and uncertainty quantification, especially for out-of-distribution data.
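
Read literally, that description can be sketched as a network head that outputs both a mean and a (log-)variance for its final-layer output and draws a reparameterized Gaussian sample from them. The sketch below is only that reading, with hypothetical names, not the VIFO authors' code.

```python
import torch
import torch.nn as nn

class VariationalOutputHead(nn.Module):
    """Sketch: the network predicts mean and variance of its final-layer output."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.mean = nn.Linear(in_features, out_features)
        self.log_var = nn.Linear(in_features, out_features)

    def forward(self, h):
        mu, log_var = self.mean(h), self.log_var(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)  # reparameterized sample
        return z, mu, log_var  # (mu, log_var) can enter a KL / regularization term
```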
arXiv Detail & Related papers (2023-02-05T16:19:01Z)
- Constraining cosmological parameters from N-body simulations with Variational Bayesian Neural Networks [0.0]
Multiplicative normalizing flows (MNFs) are a family of approximate posteriors for the parameters of BNNs.
We compare MNFs with standard BNNs and the flipout estimator.
MNFs provide a more realistic predictive distribution that is closer to the true posterior, mitigating the bias introduced by the variational approximation.
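
Both this entry and the main paper above rely on normalizing flows to move the variational posterior beyond a mean-field Gaussian. As a hedged illustration of the basic building block (not the MNF or LBBNN implementation), the sketch below shows a single planar flow step that transforms a batch of posterior samples and returns the log-determinant term needed in the ELBO.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PlanarFlow(nn.Module):
    """One planar flow step f(z) = z + u * tanh(w^T z + b) (Rezende & Mohamed, 2015).
    Stacking such steps on samples from a mean-field posterior yields a more
    flexible variational distribution; the log-det-Jacobian enters the ELBO."""
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(0.01 * torch.randn(dim))
        self.w = nn.Parameter(0.01 * torch.randn(dim))
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):                       # z: (batch, dim)
        wu = self.w @ self.u
        # Reparameterize u so that w^T u >= -1, which keeps the flow invertible
        u_hat = self.u + ((-1 + F.softplus(wu)) - wu) * self.w / (self.w @ self.w)
        a = torch.tanh(z @ self.w + self.b)     # (batch,)
        f_z = z + a.unsqueeze(-1) * u_hat
        psi = (1 - a ** 2).unsqueeze(-1) * self.w
        log_det = torch.log((1 + psi @ u_hat).abs() + 1e-8)
        return f_z, log_det
```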
arXiv Detail & Related papers (2023-01-09T16:07:48Z)
- Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models.
Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which helps scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z)
- Kalman Bayesian Neural Networks for Closed-form Online Learning [5.220940151628734]
We propose a novel approach for BNN learning via closed-form Bayesian inference.
The calculation of the predictive distribution of the output and the update of the weight distribution are treated as Bayesian filtering and smoothing problems.
This allows closed-form expressions for training the network's parameters in a sequential/online fashion without gradient descent.
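
The closed-form update described here can be pictured as a standard Kalman measurement step on a Gaussian belief over (a subset of) the network weights. The sketch below is a generic update under a locally linear observation model, not the authors' full filtering-and-smoothing scheme; `H` and `R` are assumed to come from that linearization.

```python
import numpy as np

def kalman_weight_update(m, P, H, y, R):
    """One closed-form Bayesian (Kalman) measurement update of a Gaussian belief
    N(m, P) over weights, for an observation y = H w + noise with noise covariance R."""
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    m_new = m + K @ (y - H @ m)         # updated mean
    P_new = P - K @ S @ K.T             # updated covariance
    return m_new, P_new
```

Applying such updates sequentially over mini-batches is what allows online training without gradient descent.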
arXiv Detail & Related papers (2021-10-03T07:29:57Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- Improving Uncertainty Calibration via Prior Augmented Data [56.88185136509654]
Neural networks have proven successful at learning from complex data distributions by acting as universal function approximators.
They are often overconfident in their predictions, which leads to inaccurate and miscalibrated probabilistic predictions.
We propose a solution by seeking out regions of feature space where the model is unjustifiably overconfident, and conditionally raising the entropy of those predictions towards that of the prior distribution of the labels.
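
One way to picture the mechanism in the last sentence is a penalty that, wherever the model is judged overconfident, pulls its predictive distribution toward the label prior, thereby raising its entropy. The sketch below is a simplified stand-in with a hypothetical confidence threshold; the paper's actual procedure for locating the overconfident regions of feature space is not reproduced here.

```python
import torch
import torch.nn.functional as F

def overconfidence_penalty(logits, label_prior, conf_threshold=0.9):
    """KL(p || prior) on predictions whose max probability exceeds a threshold,
    pushing those predictions toward the label prior (a hedged sketch only)."""
    probs = F.softmax(logits, dim=-1)
    overconfident = probs.max(dim=-1).values > conf_threshold
    kl = (probs * (probs.clamp_min(1e-12).log() - label_prior.log())).sum(dim=-1)
    return (kl * overconfident.float()).mean()
```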
arXiv Detail & Related papers (2021-02-22T07:02:37Z)
- Multi-fidelity Bayesian Neural Networks: Algorithms and Applications [0.0]
We propose a new class of Bayesian neural networks (BNNs) that can be trained using noisy data of variable fidelity.
We apply them to learn function approximations as well as to solve inverse problems based on partial differential equations (PDEs).
arXiv Detail & Related papers (2020-12-19T02:03:53Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
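
The "GLM predictive" amounts to linearizing the network around the MAP estimate and propagating the (Laplace) posterior covariance through the Jacobian. The sketch below illustrates this for a toy regression network with a flat parameter vector; `theta_map` and `sigma` are assumed to come from a Laplace approximation that is not shown, and the architecture is purely illustrative.

```python
import torch
from torch.autograd.functional import jacobian

def f(theta, x):
    """Toy 2-10-1 regression net with all parameters in one flat vector (illustrative only)."""
    w1, b1 = theta[:20].view(10, 2), theta[20:30]
    w2, b2 = theta[30:40].view(1, 10), theta[40:41]
    return torch.tanh(x @ w1.T + b1) @ w2.T + b2        # shape (batch, 1)

def glm_predictive(theta_map, sigma, x):
    """Linearized predictive: mean f(x; theta_MAP), variance diag(J Sigma J^T)."""
    mean = f(theta_map, x).squeeze(-1)
    J = jacobian(lambda th: f(th, x), theta_map)        # (batch, 1, n_params)
    J = J.reshape(x.shape[0], -1)
    var = ((J @ sigma) * J).sum(dim=1)                  # diagonal of J Sigma J^T
    return mean, var
```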
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.