All You Need is a Good Functional Prior for Bayesian Deep Learning
- URL: http://arxiv.org/abs/2011.12829v2
- Date: Mon, 25 Apr 2022 16:25:51 GMT
- Title: All You Need is a Good Functional Prior for Bayesian Deep Learning
- Authors: Ba-Hien Tran and Simone Rossi and Dimitrios Milios and Maurizio Filippone
- Abstract summary: We argue that the uncontrolled effect of weight-space priors on the induced functional prior is a hugely limiting aspect of Bayesian deep learning.
We propose a novel and robust framework to match the functional prior of neural networks with a Gaussian process prior.
We provide vast experimental evidence that coupling these priors with scalable Markov chain Monte Carlo sampling offers systematically large performance improvements.
- Score: 15.10662960548448
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Bayesian treatment of neural networks dictates that a prior distribution
is specified over their weight and bias parameters. This poses a challenge
because modern neural networks are characterized by a large number of
parameters, and the choice of these priors has an uncontrolled effect on the
induced functional prior, which is the distribution of the functions obtained
by sampling the parameters from their prior distribution. We argue that this is
a hugely limiting aspect of Bayesian deep learning, and this work tackles this
limitation in a practical and effective way. Our proposal is to reason in terms
of functional priors, which are easier to elicit, and to "tune" the priors of
neural network parameters in a way that they reflect such functional priors.
Gaussian processes offer a rigorous framework to define prior distributions
over functions, and we propose a novel and robust framework to match their
prior with the functional prior of neural networks based on the minimization of
their Wasserstein distance. We provide vast experimental evidence that coupling
these priors with scalable Markov chain Monte Carlo sampling offers
systematically large performance improvements over alternative choices of
priors and state-of-the-art approximate Bayesian deep learning approaches. We
consider this work a considerable step in the direction of making the
long-standing challenge of carrying out a fully Bayesian treatment of neural
networks, including convolutional neural networks, a concrete possibility.
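To make the prior-matching recipe above concrete, here is a minimal, hypothetical sketch in PyTorch. The measurement grid, the RBF kernel hyperparameters, the tanh architecture, the per-layer scale parameters (log_sw, log_sb), and the helper functions (sample_bnn_functions, sliced_w2) are all illustrative assumptions; in particular, the sliced-Wasserstein estimate is used only as a simple differentiable stand-in for the Wasserstein objective described in the abstract, not as the authors' actual estimator or code.

```python
# Hypothetical sketch: tune a BNN's weight/bias prior scales so that its
# functional prior resembles a GP prior, via a sliced-Wasserstein surrogate.
import math
import torch

torch.manual_seed(0)

# --- Target functional prior: zero-mean GP with an RBF kernel on a 1D grid ---
x = torch.linspace(-3.0, 3.0, 64).unsqueeze(-1)          # measurement points

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    d2 = (a - b.T) ** 2
    return variance * torch.exp(-0.5 * d2 / lengthscale ** 2)

K = rbf_kernel(x, x) + 1e-6 * torch.eye(x.shape[0])       # jitter for stability
gp_prior = torch.distributions.MultivariateNormal(torch.zeros(x.shape[0]), K)

# --- BNN prior with tunable (log) scales for weights and biases ---
widths = [1, 100, 100, 1]
log_sw = torch.zeros(len(widths) - 1, requires_grad=True)  # weight prior scales
log_sb = torch.zeros(len(widths) - 1, requires_grad=True)  # bias prior scales

def sample_bnn_functions(n_draws):
    """Draw functions from the BNN prior at the grid x (reparameterized)."""
    h = x.expand(n_draws, -1, -1)                          # (n_draws, 64, 1)
    for l, (d_in, d_out) in enumerate(zip(widths[:-1], widths[1:])):
        sw = torch.exp(log_sw[l]) / math.sqrt(d_in)        # fan-in scaling
        sb = torch.exp(log_sb[l])
        W = sw * torch.randn(n_draws, d_in, d_out)
        b = sb * torch.randn(n_draws, 1, d_out)
        h = h @ W + b
        if l < len(widths) - 2:
            h = torch.tanh(h)
    return h.squeeze(-1)                                   # (n_draws, 64)

def sliced_w2(samples_p, samples_q, n_proj=64):
    """Monte Carlo sliced 2-Wasserstein distance between two sample clouds."""
    dim = samples_p.shape[1]
    theta = torch.randn(dim, n_proj)
    theta = theta / theta.norm(dim=0, keepdim=True)        # random unit directions
    p_proj, _ = torch.sort(samples_p @ theta, dim=0)
    q_proj, _ = torch.sort(samples_q @ theta, dim=0)
    return ((p_proj - q_proj) ** 2).mean()

# --- Match the BNN functional prior to the GP prior ---
opt = torch.optim.Adam([log_sw, log_sb], lr=5e-2)
for step in range(500):
    opt.zero_grad()
    f_bnn = sample_bnn_functions(256)                      # draws from BNN prior
    f_gp = gp_prior.sample((256,))                         # draws from GP prior
    loss = sliced_w2(f_bnn, f_gp)
    loss.backward()
    opt.step()

print("tuned weight scales:", torch.exp(log_sw).tolist())
print("tuned bias scales:", torch.exp(log_sb).tolist())
```

In this sketch, the tuned scales torch.exp(log_sw) and torch.exp(log_sb) would then define the Gaussian prior over weights and biases used by a scalable MCMC sampler (e.g., stochastic-gradient HMC) for posterior inference, which is the pipeline the abstract describes.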
Related papers
- High-Fidelity Transfer of Functional Priors for Wide Bayesian Neural Networks by Learning Activations [1.0468715529145969]
We show how trainable activations can accommodate complex function-space priors on BNNs.
We discuss critical learning challenges, including identifiability, loss construction, and symmetries.
Our empirical findings demonstrate that even BNNs with a single wide hidden layer can effectively achieve high-fidelity function-space priors.
arXiv Detail & Related papers (2024-10-21T08:42:10Z)
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP).
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z)
- Continual Learning via Sequential Function-Space Variational Inference [65.96686740015902]
We propose an objective derived by formulating continual learning as sequential function-space variational inference.
Compared to objectives that directly regularize neural network predictions, the proposed objective allows for more flexible variational distributions.
We demonstrate that, across a range of task sequences, neural networks trained via sequential function-space variational inference achieve better predictive accuracy than networks trained with related methods.
arXiv Detail & Related papers (2023-12-28T18:44:32Z)
- Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z)
- Hybrid Bayesian Neural Networks with Functional Probabilistic Layers [0.6091702876917281]
We propose hybrid Bayesian neural networks with functional probabilistic layers that encode function uncertainty.
We perform a few proof-of-concept experiments using GPflux, a new library that provides Gaussian process layers.
arXiv Detail & Related papers (2021-07-14T21:25:53Z)
- BNNpriors: A library for Bayesian neural network inference with different prior distributions [32.944046414823916]
BNNpriors enables state-of-the-art Markov Chain Monte Carlo inference on Bayesian neural networks.
It follows a modular approach that eases the design and implementation of new custom priors.
It has facilitated foundational discoveries on the nature of the cold posterior effect in Bayesian neural networks.
arXiv Detail & Related papers (2021-05-14T17:11:04Z)
- Dimension-robust Function Space MCMC With Neural Network Priors [0.0]
This paper introduces a new prior on function spaces which scales more favourably in the dimension of the function's domain.
We show that our resulting posterior of the unknown function is amenable to sampling using Hilbert space Markov chain Monte Carlo methods.
We show that our priors are competitive and have distinct advantages over other function space priors.
arXiv Detail & Related papers (2020-12-20T14:52:57Z)
- The Ridgelet Prior: A Covariance Function Approach to Prior Specification for Bayesian Neural Networks [4.307812758854161]
We construct a prior distribution for the parameters of a network that approximates the posited Gaussian process in the output space of the network.
This establishes the property that a Bayesian neural network can approximate any Gaussian process whose covariance function is sufficiently regular.
arXiv Detail & Related papers (2020-10-16T16:39:45Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with nonconvexity renders learning sensitive to the choice of initialization.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.