The Ridgelet Prior: A Covariance Function Approach to Prior
Specification for Bayesian Neural Networks
- URL: http://arxiv.org/abs/2010.08488v4
- Date: Tue, 11 Jan 2022 13:56:34 GMT
- Title: The Ridgelet Prior: A Covariance Function Approach to Prior
Specification for Bayesian Neural Networks
- Authors: Takuo Matsubara and Chris J. Oates and François-Xavier Briol
- Abstract summary: We construct a prior distribution for the parameters of a network that approximates the posited Gaussian process in the output space of the network.
This establishes the property that a Bayesian neural network can approximate any Gaussian process whose covariance function is sufficiently regular.
- Score: 4.307812758854161
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian neural networks attempt to combine the strong predictive performance
of neural networks with formal quantification of uncertainty associated with
the predictive output in the Bayesian framework. However, it remains unclear
how to endow the parameters of the network with a prior distribution that is
meaningful when lifted into the output space of the network. A possible
solution is proposed that enables the user to posit an appropriate Gaussian
process covariance function for the task at hand. Our approach constructs a
prior distribution for the parameters of the network, called a ridgelet prior,
that approximates the posited Gaussian process in the output space of the
network. In contrast to existing work on the connection between neural networks
and Gaussian processes, our analysis is non-asymptotic, with finite sample-size
error bounds provided. This establishes the universality property that a
Bayesian neural network can approximate any Gaussian process whose covariance
function is sufficiently regular. Our experimental assessment is limited to a
proof-of-concept, where we demonstrate that the ridgelet prior can out-perform
an unstructured prior on regression problems for which a suitable Gaussian
process prior can be provided.
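To make the output-space comparison concrete, the following is a minimal illustrative sketch (not the paper's ridgelet construction): it draws functions from a one-hidden-layer network under an unstructured i.i.d. Gaussian weight prior and compares the resulting empirical output-space covariance to a posited squared-exponential Gaussian process covariance. The activation, widths, length-scale and prior scales are illustrative assumptions.

```python
# Toy sketch (not the paper's ridgelet construction): compare the output-space
# covariance induced by a one-hidden-layer BNN under an unstructured i.i.d.
# Gaussian weight prior against a posited squared-exponential GP covariance.
# The width, number of draws, length-scale and prior scales are illustrative.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)[:, None]      # test inputs, shape (50, 1)

def target_gp_cov(x, lengthscale=1.0, variance=1.0):
    """Posited squared-exponential covariance k(x, x')."""
    d = x - x.T
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def bnn_prior_draws(x, width=1000, n_draws=2000, w_scale=1.0, v_scale=1.0):
    """Draw functions f(x) = V tanh(W x + b) / sqrt(width) from the weight prior."""
    n = x.shape[0]
    draws = np.empty((n_draws, n))
    for i in range(n_draws):
        W = rng.normal(0.0, w_scale, size=(width, 1))
        b = rng.normal(0.0, w_scale, size=(width, 1))
        V = rng.normal(0.0, v_scale, size=(1, width))
        draws[i] = (V @ np.tanh(W @ x.T + b)).ravel() / np.sqrt(width)
    return draws

K_target = target_gp_cov(x)
K_bnn = np.cov(bnn_prior_draws(x), rowvar=False)   # empirical output-space covariance

# A structured prior (such as the ridgelet prior) aims to make this discrepancy
# small; an unstructured i.i.d. prior generally does not match a chosen kernel.
print("max |K_bnn - K_target| =", np.abs(K_bnn - K_target).max())
```

The printed discrepancy is exactly the kind of output-space gap the paper's finite sample-size error bounds control: a prior constructed from the posited covariance function should drive it down as the network grows, whereas the unstructured prior above generally leaves it large.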
Related papers
- Finite Neural Networks as Mixtures of Gaussian Processes: From Provable Error Bounds to Prior Selection [11.729744197698718]
We present an algorithmic framework to approximate a neural network of finite width and depth.
We iteratively approximate the output distribution of each layer of the neural network as a mixture of Gaussian processes.
Our results can represent an important step towards understanding neural network predictions.
arXiv Detail & Related papers (2024-07-26T12:45:53Z)
- Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z)
- Wide Deep Neural Networks with Gaussian Weights are Very Close to Gaussian Processes [1.0878040851638]
We show that the distance between the network output and the corresponding Gaussian approximation scales inversely with the width of the network, exhibiting faster convergence than the rate naively suggested by the central limit theorem (see the sketch after this list).
We also apply our bounds to obtain theoretical approximations for the exact posterior distribution of the network, when the likelihood is a bounded Lipschitz function of the network output evaluated on a (finite) training set.
arXiv Detail & Related papers (2023-12-18T22:29:40Z)
- Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization [73.80101701431103]
The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks.
We study the usefulness of the LLA in Bayesian optimization and highlight its strong performance and flexibility.
arXiv Detail & Related papers (2023-04-17T14:23:43Z)
- Semantic Strengthening of Neuro-Symbolic Learning [85.6195120593625]
Neuro-symbolic approaches typically resort to fuzzy approximations of a probabilistic objective.
We show how to compute this efficiently for tractable circuits.
We test our approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles.
arXiv Detail & Related papers (2023-02-28T00:04:22Z)
- Probabilistic Verification of ReLU Neural Networks via Characteristic Functions [11.489187712465325]
We use ideas from probability theory in the frequency domain to provide probabilistic verification guarantees for ReLU neural networks.
We interpret a (deep) feedforward neural network as a discrete dynamical system over a finite horizon.
We obtain the corresponding cumulative distribution function of the output set, which can be used to check if the network is performing as expected.
arXiv Detail & Related papers (2022-12-03T05:53:57Z)
- Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z)
- Precise characterization of the prior predictive distribution of deep ReLU networks [45.46732383818331]
We derive a precise characterization of the prior predictive distribution of finite-width ReLU networks with Gaussian weights.
Our results provide valuable guidance on prior design, for instance, controlling the predictive variance with depth- and width-informed priors on the weights of the network.
arXiv Detail & Related papers (2021-06-11T21:21:52Z)
- Sampling-free Variational Inference for Neural Networks with Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z)
- All You Need is a Good Functional Prior for Bayesian Deep Learning [15.10662960548448]
We argue that the uncontrolled effect of parameter-space priors on the induced functional prior is a hugely limiting aspect of Bayesian deep learning.
We propose a novel and robust framework to match the functional prior of neural networks with an interpretable target prior, such as a Gaussian process prior.
We provide vast experimental evidence that coupling these priors with scalable Markov chain Monte Carlo sampling offers systematically large performance improvements.
arXiv Detail & Related papers (2020-11-25T15:36:16Z)
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
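As background for the entries above on the neural-network/Gaussian-process correspondence, in particular "Wide Deep Neural Networks with Gaussian Weights are Very Close to Gaussian Processes" and "Finite Neural Networks as Mixtures of Gaussian Processes", the following is a minimal sketch of the standard infinite-width kernel recursion for fully connected ReLU networks, together with a finite-width Monte Carlo check. It is not either paper's specific algorithm or error bound; the depth, weight variance, widths and sample counts are illustrative assumptions.

```python
# Minimal sketch of the standard infinite-width GP ("NNGP") kernel recursion for
# fully connected ReLU networks, offered as background for the entries above on
# the neural-network/Gaussian-process correspondence. It is not the
# mixture-of-GPs algorithm or the finite-width error bounds of those papers;
# depth, sigma_w2, widths and draw counts below are illustrative assumptions.
import numpy as np

def relu_nngp_kernel(X, depth=3, sigma_w2=2.0, sigma_b2=0.0):
    """Covariance of the infinite-width GP limit of a `depth`-hidden-layer ReLU net."""
    n, d = X.shape
    K = sigma_b2 + sigma_w2 * (X @ X.T) / d                    # first-layer covariance
    for _ in range(depth):
        diag = np.sqrt(np.clip(np.diag(K), 1e-12, None))
        corr = np.clip(K / np.outer(diag, diag), -1.0, 1.0)
        theta = np.arccos(corr)
        J = np.sin(theta) + (np.pi - theta) * np.cos(theta)    # ReLU arc-cosine term
        K = sigma_b2 + (sigma_w2 / (2 * np.pi)) * np.outer(diag, diag) * J
    return K

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
K_inf = relu_nngp_kernel(X, depth=3)

def wide_net_cov(X, depth=3, width=512, n_draws=200, sigma_w2=2.0):
    """Empirical output covariance of a finite-width ReLU net with Gaussian weights."""
    outs = []
    for _ in range(n_draws):
        h = X
        for _ in range(depth):
            W = rng.normal(0.0, np.sqrt(sigma_w2 / h.shape[1]), size=(h.shape[1], width))
            h = np.maximum(h @ W, 0.0)
        v = rng.normal(0.0, np.sqrt(sigma_w2 / width), size=(width,))
        outs.append(h @ v)
    return np.cov(np.stack(outs), rowvar=False)

# The finite-width covariance should roughly match K_inf, up to finite-width
# and Monte Carlo error, and the match tightens as the width grows.
print("max |K_width - K_inf| =", np.abs(wide_net_cov(X) - K_inf).max())
```

The recursion propagates a covariance matrix layer by layer, which is the same structural idea the mixture-of-GPs entry applies at finite width; the Monte Carlo check mirrors, in a rough empirical way, the width-dependent convergence discussed in the wide-network entry.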