Hybrid Bayesian Neural Networks with Functional Probabilistic Layers
- URL: http://arxiv.org/abs/2107.07014v1
- Date: Wed, 14 Jul 2021 21:25:53 GMT
- Title: Hybrid Bayesian Neural Networks with Functional Probabilistic Layers
- Authors: Daniel T. Chang
- Abstract summary: We propose hybrid Bayesian neural networks with functional probabilistic layers that encode function uncertainty.
We perform a few proof-of-concept experiments using GPflux, a new library that provides Gaussian process layers.
- Score: 0.6091702876917281
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian neural networks provide a direct and natural way to extend standard
deep neural networks to support probabilistic deep learning through the use of
probabilistic layers that, traditionally, encode weight (and bias) uncertainty.
In particular, hybrid Bayesian neural networks utilize standard deterministic
layers together with a few probabilistic layers judiciously positioned in the
network for uncertainty estimation. A major aspect and benefit of Bayesian
inference is that priors, in principle, provide the means to encode prior
knowledge for use in inference and prediction. However, it is difficult to
specify priors on weights since the weights have no intuitive interpretation.
Further, the relationships of priors on weights to the functions computed by
networks are difficult to characterize. In contrast, functions are intuitive to
interpret, since they map inputs directly to outputs. Therefore, it is
natural to specify priors on functions to encode prior knowledge, and to use
them in inference and prediction based on functions. To support this, we
propose hybrid Bayesian neural networks with functional probabilistic layers
that encode function (and activation) uncertainty. We discuss their foundations
in functional Bayesian inference, functional variational inference, sparse
Gaussian processes, and sparse variational Gaussian processes. We further
perform a few proof-of-concept experiments using GPflux, a new library that
provides Gaussian process layers and supports their use with deterministic
Keras layers to form hybrid neural network and Gaussian process models.
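As background for the abstract above, a sparse variational Gaussian process layer is typically trained by maximizing the standard sparse variational GP evidence lower bound over inducing variables $\mathbf{u}$ (standard background, not a formula taken from this paper):

$$\mathcal{L} = \sum_{n=1}^{N} \mathbb{E}_{q(f(\mathbf{x}_n))}\big[\log p(y_n \mid f(\mathbf{x}_n))\big] - \mathrm{KL}\big[q(\mathbf{u}) \,\|\, p(\mathbf{u})\big].$$

The following is a minimal sketch of the kind of hybrid model the abstract describes: deterministic Keras layers feeding a GPflux Gaussian process layer. It follows the pattern of the GPflux documentation rather than the paper's own code; the dataset, layer sizes, and hyperparameters are placeholder assumptions, and exact GPflux signatures may differ across versions.

```python
import numpy as np
import tensorflow as tf
import gpflow
import gpflux

# Toy 1D regression data (placeholder for a real dataset).
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 200).reshape(-1, 1)
Y = np.sin(10.0 * X) + 0.1 * rng.standard_normal(X.shape)
num_data, output_dim = X.shape[0], Y.shape[1]

# Deterministic part: standard Keras layers acting as a feature extractor.
feature_dim = 2
dense_layers = [
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(feature_dim, activation="relu"),
]

# Functional probabilistic layer: a sparse variational GP on the learned features.
kernel = gpflux.helpers.construct_basic_kernel(
    gpflow.kernels.SquaredExponential(), output_dim=output_dim
)
inducing_variable = gpflux.helpers.construct_basic_inducing_variables(
    num_inducing=32, input_dim=feature_dim, output_dim=output_dim, share_variables=True
)
gp_layer = gpflux.layers.GPLayer(
    kernel, inducing_variable, num_data=num_data, num_latent_gps=output_dim
)

# Gaussian observation model; its noise variance is left at its default here
# (training it may require an extra wrapper layer in some GPflux versions).
likelihood = gpflow.likelihoods.Gaussian()

model = tf.keras.Sequential(dense_layers + [gp_layer])
# LikelihoodLoss supplies the expected log-likelihood term of the ELBO;
# the GP layer itself contributes the KL term via Keras layer losses.
model.compile(loss=gpflux.losses.LikelihoodLoss(likelihood), optimizer="adam")
model.fit(X, Y, batch_size=32, epochs=200, verbose=0)

# The GP layer outputs a (TFP) distribution over the latent function values.
f_dist = model(X)
print(f_dist.mean().numpy()[:3], f_dist.variance().numpy()[:3])
```

In this design, the dense layers learn a feature representation while the GP layer places a sparse variational posterior over the function from features to outputs, so predictive uncertainty comes from function space rather than from distributions over weights.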
Related papers
- Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
- Permutation Equivariant Neural Functionals [92.0667671999604]
This work studies the design of neural networks that can process the weights or gradients of other neural networks.
We focus on the permutation symmetries that arise in the weights of deep feedforward networks because hidden layer neurons have no inherent order.
In our experiments, we find that permutation equivariant neural functionals are effective on a diverse set of tasks.
arXiv Detail & Related papers (2023-02-27T18:52:38Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
They exploit higher-order statistics only later during training.
We discuss the relation of this distributional simplicity bias (DSB) to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Bayesian Neural Networks: Essentials [0.6091702876917281]
It is nontrivial to understand, design and train Bayesian neural networks due to their complexities.
The depth of deep neural networks makes it redundant, and costly, to account for uncertainty across a large number of successive layers.
Hybrid Bayesian neural networks, which use a few probabilistic layers judiciously positioned in the network, provide a practical solution; a weight-uncertainty sketch in this spirit appears after this list.
arXiv Detail & Related papers (2021-06-22T13:54:17Z)
- BNNpriors: A library for Bayesian neural network inference with different prior distributions [32.944046414823916]
BNNpriors enables state-of-the-art Markov Chain Monte Carlo inference on Bayesian neural networks.
It follows a modular approach that eases the design and implementation of new custom priors.
It has facilitated foundational discoveries on the nature of the cold posterior effect in Bayesian neural networks.
arXiv Detail & Related papers (2021-05-14T17:11:04Z)
- All You Need is a Good Functional Prior for Bayesian Deep Learning [15.10662960548448]
We argue that the difficulty of specifying meaningful priors over network weights is a hugely limiting aspect of Bayesian deep learning.
We propose a novel and robust framework to match the functional prior of neural networks with a target Gaussian process prior.
We provide vast experimental evidence that coupling these priors with scalable Markov chain Monte Carlo sampling offers systematically large performance improvements.
arXiv Detail & Related papers (2020-11-25T15:36:16Z)
- The Ridgelet Prior: A Covariance Function Approach to Prior Specification for Bayesian Neural Networks [4.307812758854161]
We construct a prior distribution for the parameters of a network that approximates the posited Gaussian process in the output space of the network.
This establishes the property that a Bayesian neural network can approximate any Gaussian process whose covariance function is sufficiently regular.
arXiv Detail & Related papers (2020-10-16T16:39:45Z)
- Consistent feature selection for neural networks via Adaptive Group Lasso [3.42658286826597]
We propose and establish a theoretical guarantee for the use of the Adaptive Group Lasso for selecting important features of neural networks.
Specifically, we show that our feature selection method is consistent for single-output feed-forward neural networks with one hidden layer and hyperbolic tangent activation function.
arXiv Detail & Related papers (2020-05-30T18:50:56Z)
- MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The combination of gradient-based learning and nonconvexity renders learning susceptible to novel problems.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)
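For contrast with the functional probabilistic layers above, and as referenced in the "Bayesian Neural Networks: Essentials" entry, here is a minimal sketch of a hybrid Bayesian neural network in the traditional weight-uncertainty sense: a single probabilistic layer placed just before the output of an otherwise deterministic Keras model. It assumes TensorFlow Probability's `DenseFlipout` layer; the data, layer sizes, and the use of mean squared error in place of a full negative log-likelihood are simplifying assumptions, not details from the listed papers.

```python
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

# Placeholder regression data.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 8)).astype(np.float32)
y = rng.standard_normal((500, 1)).astype(np.float32)
num_data = X.shape[0]

# Scale the weight-space KL term by the dataset size so fit() minimizes a scaled negative ELBO.
kl_fn = lambda q, p, _: tfp.distributions.kl_divergence(q, p) / num_data

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),              # deterministic layers
    tf.keras.layers.Dense(64, activation="relu"),
    tfp.layers.DenseFlipout(1, kernel_divergence_fn=kl_fn),    # single probabilistic layer
])

# The KL contribution of the probabilistic layer is added automatically through Keras layer
# losses; MSE stands in here for the expected negative log-likelihood term.
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=50, batch_size=32, verbose=0)

# Weight uncertainty induces predictive uncertainty: average several stochastic forward passes.
samples = np.stack([model(X[:5]).numpy() for _ in range(30)])
print(samples.mean(axis=0).squeeze(), samples.std(axis=0).squeeze())
```

Placing the stochastic layer only at the output keeps the number of variational parameters small while still providing predictive uncertainty, which is the "judiciously positioned" design that the entry describes.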