BNNpriors: A library for Bayesian neural network inference with
different prior distributions
- URL: http://arxiv.org/abs/2105.06964v1
- Date: Fri, 14 May 2021 17:11:04 GMT
- Title: BNNpriors: A library for Bayesian neural network inference with
different prior distributions
- Authors: Vincent Fortuin, Adrià Garriga-Alonso, Mark van der Wilk, Laurence
Aitchison
- Abstract summary: BNNpriors enables state-of-the-art Markov Chain Monte Carlo inference on Bayesian neural networks.
It follows a modular approach that eases the design and implementation of new custom priors.
It has facilitated foundational discoveries on the nature of the cold posterior effect in Bayesian neural networks.
- Score: 32.944046414823916
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian neural networks have shown great promise in many applications where
calibrated uncertainty estimates are crucial and can often also lead to a
higher predictive performance. However, it remains challenging to choose a good
prior distribution over their weights. While isotropic Gaussian priors are
often chosen in practice due to their simplicity, they do not reflect our true
prior beliefs well and can lead to suboptimal performance. Our new library,
BNNpriors, enables state-of-the-art Markov Chain Monte Carlo inference on
Bayesian neural networks with a wide range of predefined priors, including
heavy-tailed ones, hierarchical ones, and mixture priors. Moreover, it follows
a modular approach that eases the design and implementation of new custom
priors. It has facilitated foundational discoveries on the nature of the cold
posterior effect in Bayesian neural networks and will hopefully catalyze future
research as well as practical applications in this area.
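To make the workflow the abstract describes concrete (a modular, swappable prior over network weights combined with an SG-MCMC sampler), here is a minimal PyTorch sketch. It is not the actual BNNpriors API: the StudentTPrior class, the sgld_step function, and all hyperparameters are hypothetical placeholders, and a plain SGLD update stands in for the library's state-of-the-art samplers.

```python
# Illustrative sketch only -- NOT the actual BNNpriors/bnn_priors API.
# A modular prior object exposing log_prob() is combined with a simple
# SG-MCMC sampler (plain SGLD) over the weights of a small regression MLP.
import math
import torch
import torch.nn as nn

class StudentTPrior:
    """Modular prior: anything exposing log_prob(params) can be swapped in."""
    def __init__(self, df=3.0, scale=1.0):
        self.dist = torch.distributions.StudentT(df=df, scale=scale)

    def log_prob(self, params):
        # Sum of elementwise log-densities over all parameter tensors.
        return sum(self.dist.log_prob(p).sum() for p in params)

def sgld_step(model, prior, x, y, lr=1e-4, n_data=None):
    """One SGLD update: half a gradient step on the log posterior plus noise."""
    n_data = n_data or x.shape[0]
    model.zero_grad()
    # Gaussian likelihood with unit noise, rescaled to the full dataset size.
    log_lik = -0.5 * ((model(x) - y) ** 2).sum() * (n_data / x.shape[0])
    log_post = log_lik + prior.log_prob(model.parameters())
    (-log_post).backward()
    with torch.no_grad():
        for p in model.parameters():
            noise = torch.randn_like(p) * math.sqrt(lr)
            p.add_(-0.5 * lr * p.grad + noise)

# Toy usage: sample weights of a small MLP under a heavy-tailed prior.
torch.manual_seed(0)
x = torch.linspace(-2, 2, 64).unsqueeze(-1)
y = torch.sin(3 * x) + 0.1 * torch.randn_like(x)
model = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))
prior = StudentTPrior(df=3.0, scale=1.0)
for step in range(2000):
    sgld_step(model, prior, x, y, lr=1e-4, n_data=x.shape[0])
```

Swapping in, say, a hierarchical or mixture prior would only require a different log_prob implementation, which is the kind of modularity the abstract refers to.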
Related papers
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP).
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z)
- Bayesian Neural Networks with Domain Knowledge Priors [52.80929437592308]
We propose a framework for integrating general forms of domain knowledge into a BNN prior.
We show that BNNs using our proposed domain knowledge priors outperform those with standard priors.
arXiv Detail & Related papers (2024-02-20T22:34:53Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom holds that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- Hybrid Bayesian Neural Networks with Functional Probabilistic Layers [0.6091702876917281]
We propose hybrid Bayesian neural networks with functional probabilistic layers that encode function uncertainty.
We perform a few proof-of-concept experiments using GPflux, a new library that provides Gaussian process layers.
arXiv Detail & Related papers (2021-07-14T21:25:53Z)
- Precise characterization of the prior predictive distribution of deep ReLU networks [45.46732383818331]
We derive a precise characterization of the prior predictive distribution of finite-width ReLU networks with Gaussian weights.
Our results provide valuable guidance on prior design, for instance, controlling the predictive variance with depth- and width-informed priors on the weights of the network.
arXiv Detail & Related papers (2021-06-11T21:21:52Z)
- Bayesian Neural Network Priors Revisited [29.949163519715952]
We study summary statistics of neural network weights in different networks trained using SGD.
We find that fully connected networks (FCNNs) display heavy-tailed weight distributions, while convolutional neural network (CNN) weights display strong spatial correlations.
arXiv Detail & Related papers (2021-02-12T15:18:06Z)
- All You Need is a Good Functional Prior for Bayesian Deep Learning [15.10662960548448]
We argue that choosing weight priors for computational convenience, rather than to reflect functional beliefs, is a hugely limiting aspect of Bayesian deep learning.
We propose a novel and robust framework to match the functional prior of neural networks to a target Gaussian process prior.
We provide vast experimental evidence that coupling these priors with scalable Markov chain Monte Carlo sampling offers systematically large performance improvements.
arXiv Detail & Related papers (2020-11-25T15:36:16Z)
- The Ridgelet Prior: A Covariance Function Approach to Prior Specification for Bayesian Neural Networks [4.307812758854161]
We construct a prior distribution for the parameters of a network that approximates the posited Gaussian process in the output space of the network.
This establishes the property that a Bayesian neural network can approximate any Gaussian process whose covariance function is sufficiently regular.
arXiv Detail & Related papers (2020-10-16T16:39:45Z)
- Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit [47.324627920761685]
We use recent theoretical advances that characterize the function-space prior of an ensemble of infinitely-wide NNs as a Gaussian process (a minimal sketch of the ReLU NNGP kernel recursion is given after this list).
This gives us a better understanding of the implicit prior NNs place on function space.
We also examine the calibration of previous approaches to classification with the NNGP.
arXiv Detail & Related papers (2020-10-14T18:41:54Z)
- Revisiting Initialization of Neural Networks [72.24615341588846]
We propose a rigorous estimation of the global curvature of weights across layers by approximating and controlling the norm of their Hessian matrix.
Our experiments on Word2Vec and the MNIST/CIFAR image classification tasks confirm that tracking the Hessian norm is a useful diagnostic tool.
arXiv Detail & Related papers (2020-04-20T18:12:56Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
However, the use of stochastic gradient descent combined with the nonconvexity of the underlying optimization problem renders learning susceptible to initialization issues.
We propose fusing neighboring layers of deeper networks that are initialized with random weights.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
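As referenced in the infinite-width entry above, the function-space prior of a wide fully connected ReLU network is a Gaussian process whose covariance follows a simple layer-wise recursion. The sketch below is an illustration written for this summary, not code from the cited paper; sigma_w and sigma_b are assumed prior weight and bias scales, and the depth is arbitrary.

```python
# Minimal sketch of the standard ReLU NNGP kernel recursion (arc-cosine kernel),
# which defines the infinite-width Gaussian-process prior over functions.
import numpy as np

def nngp_kernel(x1, x2, depth=3, sigma_w=1.4, sigma_b=0.1):
    """Covariance between two inputs under a depth-layer ReLU NNGP prior."""
    d = x1.shape[0]
    # Layer-0 (input) covariances.
    k12 = sigma_b**2 + sigma_w**2 * np.dot(x1, x2) / d
    k11 = sigma_b**2 + sigma_w**2 * np.dot(x1, x1) / d
    k22 = sigma_b**2 + sigma_w**2 * np.dot(x2, x2) / d
    for _ in range(depth):
        # Arc-cosine (ReLU) recursion for the cross- and self-covariances.
        theta = np.arccos(np.clip(k12 / np.sqrt(k11 * k22), -1.0, 1.0))
        k12 = sigma_b**2 + (sigma_w**2 / (2 * np.pi)) * np.sqrt(k11 * k22) * (
            np.sin(theta) + (np.pi - theta) * np.cos(theta))
        k11 = sigma_b**2 + (sigma_w**2 / 2) * k11
        k22 = sigma_b**2 + (sigma_w**2 / 2) * k22
    return k12

# The implicit prior over functions is a zero-mean GP with this covariance.
x, y = np.array([1.0, 0.0]), np.array([0.5, 0.5])
print(nngp_kernel(x, x), nngp_kernel(x, y))
```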
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.