Posterior concentrations of fully-connected Bayesian neural networks with general priors on the weights
- URL: http://arxiv.org/abs/2403.14225v1
- Date: Thu, 21 Mar 2024 08:31:36 GMT
- Title: Posterior concentrations of fully-connected Bayesian neural networks with general priors on the weights
- Authors: Insung Kong, Yongdai Kim
- Abstract summary: We present a new approximation theory for non-sparse Deep Neural Networks (DNNs) with bounded parameters.
We show that BNNs with non-sparse general priors can achieve near-minimax optimal posterior concentration rates to the true model.
- Score: 3.5865188519566003
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian approaches for training deep neural networks (BNNs) have received significant interest and have been effectively utilized in a wide range of applications. There have been several studies on the properties of posterior concentrations of BNNs. However, most of these studies only demonstrate results in BNN models with sparse or heavy-tailed priors. Surprisingly, no theoretical results currently exist for BNNs using Gaussian priors, which are the most commonly used ones. The lack of theory arises from the absence of approximation results for Deep Neural Networks (DNNs) that are non-sparse and have bounded parameters. In this paper, we present a new approximation theory for non-sparse DNNs with bounded parameters. Additionally, based on the approximation theory, we show that BNNs with non-sparse general priors can achieve near-minimax optimal posterior concentration rates to the true model.
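For context, here is a minimal sketch of what a near-minimax optimal posterior concentration rate means in the standard nonparametric regression setting; the β-Hölder smoothness assumption, the constants M and c, and the exact logarithmic factor are illustrative placeholders rather than the paper's precise statement:
$$
\mathbb{E}_{f_0}\!\left[\,\Pi\!\left(f : \|f - f_0\|_2 > M \epsilon_n \mid \mathcal{D}_n\right)\right] \to 0,
\qquad
\epsilon_n \asymp n^{-\beta/(2\beta + d)} (\log n)^{c},
$$
where f_0 is the true regression function with smoothness β, d is the input dimension, D_n is a sample of size n, and Π(· | D_n) is the BNN posterior. The rate n^{-β/(2β+d)} is the minimax estimation rate for β-smooth functions, so "near-minimax optimal" means matching it up to the logarithmic factor.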
Related papers
- Bayesian Neural Networks with Domain Knowledge Priors [52.80929437592308]
We propose a framework for integrating general forms of domain knowledge into a BNN prior.
We show that BNNs using our proposed domain knowledge priors outperform those with standard priors.
arXiv Detail & Related papers (2024-02-20T22:34:53Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Masked Bayesian Neural Networks: Theoretical Guarantee and its Posterior Inference [1.2722697496405464]
We propose a new node-sparse BNN model which has good theoretical properties and is computationally feasible.
We prove that the posterior concentration rate to the true model is near minimax optimal and adaptive to the smoothness of the true model.
In addition, we develop a novel MCMC algorithm which makes the Bayesian inference of the node-sparse BNN model feasible in practice.
arXiv Detail & Related papers (2023-05-24T06:16:11Z)
- Incorporating Unlabelled Data into Bayesian Neural Networks [48.25555899636015]
We introduce Self-Supervised Bayesian Neural Networks, which use unlabelled data to learn models with suitable prior predictive distributions.
We show that the prior predictive distributions of self-supervised BNNs capture problem semantics better than conventional BNN priors.
Our approach offers improved predictive performance over conventional BNNs, especially in low-budget regimes.
arXiv Detail & Related papers (2023-04-04T12:51:35Z)
- Constraining cosmological parameters from N-body simulations with Variational Bayesian Neural Networks [0.0]
Multiplicative normalizing flows (MNFs) are a family of approximate posteriors for the parameters of BNNs.
We compare MNFs with standard BNNs and the flipout estimator.
MNFs provide a more realistic predictive distribution that is closer to the true posterior, mitigating the bias introduced by the variational approximation.
arXiv Detail & Related papers (2023-01-09T16:07:48Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Wide Mean-Field Bayesian Neural Networks Ignore the Data [29.050507540280922]
We show that mean-field variational inference (sketched briefly after this entry) entirely fails to model the data when the network width is large.
We show that the optimal approximate posterior need not tend to the prior if the activation function is not odd.
arXiv Detail & Related papers (2022-02-23T18:21:50Z)
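For reference, a brief sketch of the mean-field Gaussian variational family discussed in the entry above (standard notation; not taken from that paper):
$$
q(\mathbf{w}) = \prod_i \mathcal{N}(w_i;\, \mu_i, \sigma_i^2),
\qquad
(\mu, \sigma) \in \arg\max_{\mu, \sigma}\; \mathbb{E}_{q}\big[\log p(\mathcal{D} \mid \mathbf{w})\big] - \mathrm{KL}\big(q(\mathbf{w}) \,\|\, p(\mathbf{w})\big),
$$
i.e., every weight receives an independent Gaussian factor, and the variational parameters maximize the evidence lower bound. The entry's claim concerns how this optimized q behaves as the network width grows.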
- An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their Asymptotic Overconfidence [65.24701908364383]
A Bayesian treatment can mitigate overconfidence in ReLU nets around the training data.
But far away from the training data, Bayesian ReLU networks (BNNs) can still underestimate uncertainty and thus be overconfident.
We show that the proposed extension can be applied post-hoc to any pre-trained ReLU BNN at a low cost.
arXiv Detail & Related papers (2020-10-06T13:32:18Z)
- Exact posterior distributions of wide Bayesian neural networks [51.20413322972014]
We show that the exact BNN posterior converges (weakly) to the one induced by the Gaussian process (GP) limit of the prior.
For empirical validation, we show how to generate exact samples from a finite BNN on a small dataset via rejection sampling (a generic sketch of this idea follows the entry).
arXiv Detail & Related papers (2020-06-18T13:57:04Z)
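As a concrete illustration of the rejection-sampling idea in the entry above, here is a minimal NumPy sketch of prior-as-proposal rejection sampling under a Gaussian likelihood; the network architecture, dataset, noise level, and prior below are placeholder assumptions for the demo, not that paper's exact construction:
```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny placeholder dataset (assumption for illustration): 3 points, 1 feature.
X = rng.normal(size=(3, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=3)
sigma = 1.0   # assumed (large) observation noise, keeps the acceptance rate usable
H = 3         # hidden width of a one-hidden-layer tanh network

def forward(params, X):
    """One-hidden-layer tanh network with scalar output."""
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()

def sample_prior():
    """Independent standard-normal (non-sparse Gaussian) prior over all weights."""
    return (rng.normal(size=(1, H)), rng.normal(size=H),
            rng.normal(size=(H, 1)), rng.normal(size=1))

def rejection_sample(n_samples):
    """Prior-as-proposal rejection sampling for the exact BNN posterior.

    Because sup_w p(y | w) <= (2*pi*sigma^2)^(-n/2), accepting a prior draw
    with probability exp(-SSE / (2*sigma^2)) yields exact posterior samples.
    This is only feasible for very small datasets: the acceptance probability
    shrinks rapidly with the number of observations.
    """
    samples = []
    while len(samples) < n_samples:
        params = sample_prior()
        sse = np.sum((y - forward(params, X)) ** 2)
        if rng.uniform() < np.exp(-sse / (2 * sigma ** 2)):
            samples.append(params)
    return samples

draws = rejection_sample(10)
print(f"collected {len(draws)} exact posterior samples")
```
The prior here plays the role of the proposal distribution, so no normalizing constant of the posterior is needed; only the likelihood ratio enters the acceptance step.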
- Prior choice affects ability of Bayesian neural networks to identify unknowns [0.0]
We show that the choice of priors has a substantial impact on the ability of the model to confidently assign data to the correct class.
We also show that testing alternative prior choices can improve the performance of BNNs.
arXiv Detail & Related papers (2020-05-11T10:32:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.