An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their
Asymptotic Overconfidence
- URL: http://arxiv.org/abs/2010.02709v5
- Date: Mon, 24 Jan 2022 14:01:26 GMT
- Title: An Infinite-Feature Extension for Bayesian ReLU Nets That Fixes Their
Asymptotic Overconfidence
- Authors: Agustinus Kristiadi, Matthias Hein, Philipp Hennig
- Abstract summary: A Bayesian treatment can mitigate overconfidence in ReLU nets around the training data.
But far away from them, ReLU Bayesian neural networks (BNNs) can still underestimate uncertainty and thus be overconfident.
We show that it can be applied post-hoc to any pre-trained ReLU BNN at a low cost.
- Score: 65.24701908364383
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Bayesian treatment can mitigate overconfidence in ReLU nets around the
training data. But far away from them, ReLU Bayesian neural networks (BNNs) can
still underestimate uncertainty and thus be asymptotically overconfident. This
issue arises since the output variance of a BNN with finitely many features is
quadratic in the distance from the data region. Meanwhile, Bayesian linear
models with ReLU features converge, in the infinite-width limit, to a
particular Gaussian process (GP) with a variance that grows cubically so that
no asymptotic overconfidence can occur. While this may seem of mostly
theoretical interest, in this work, we show that it can be used in practice to
the benefit of BNNs. We extend finite ReLU BNNs with infinite ReLU features via
the GP and show that the resulting model is asymptotically maximally uncertain
far away from the data while the BNNs' predictive power is unaffected near the
data. Although the resulting model approximates a full GP posterior, thanks to
its structure, it can be applied post-hoc to any pre-trained ReLU BNN at
a low cost.
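To make the growth rates quoted in the abstract concrete, below is a minimal 1-D NumPy sketch (an illustrative toy setup, not the paper's construction): a Bayesian linear model whose finitely many ReLU features are anchored in the data region has prior variance growing quadratically with the distance x, while the GP limit with a continuum of ReLU features spread over the domain has k(x, x) = sigma^2 * x^3 / 3, i.e. cubic growth.

    # Toy illustration of the quadratic-vs-cubic variance growth described in
    # the abstract (illustrative 1-D setup, not the paper's construction).
    import numpy as np

    sigma2 = 1.0                              # assumed prior weight variance
    biases = np.linspace(0.0, 1.0, 50)        # finite ReLU features anchored in the data region [0, 1]

    def var_finite(x):
        """Prior variance of w^T phi(x) for the finite ReLU-feature model."""
        phi = np.maximum(0.0, x - biases)     # phi_i(x) = max(0, x - c_i)
        return sigma2 * np.sum(phi ** 2)      # ~ x^2: each feature grows linearly in x

    def var_gp_limit(x):
        """GP limit: continuum of ReLU features, biases spread with unit density over [0, inf)."""
        return sigma2 * x ** 3 / 3.0          # k(x, x) = sigma2 * int_0^x (x - c)^2 dc

    for x in (1e2, 1e3, 1e4):                 # distances far outside the data region
        r_finite = var_finite(10 * x) / var_finite(x)
        r_gp = var_gp_limit(10 * x) / var_gp_limit(x)
        print(f"x = {x:.0e}: 10x farther -> finite variance x{r_finite:.0f}, GP-limit variance x{r_gp:.0f}")

Increasing the distance tenfold multiplies the finite-feature variance by roughly 100 (quadratic) but the GP-limit variance by exactly 1000 (cubic); that widening gap is the asymptotic behaviour the post-hoc extension exploits to remain maximally uncertain far from the data.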
Related papers
- Approximation Bounds for Recurrent Neural Networks with Application to Regression [7.723218675113336]
We study the approximation capacity of deep ReLU recurrent neural networks (RNNs) and explore the convergence properties of nonparametric least squares regression using RNNs.
We derive upper bounds on the approximation error of RNNs for Hölder smooth functions.
Our results provide statistical guarantees on the performance of RNNs.
arXiv Detail & Related papers (2024-09-09T13:02:50Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- Constraining cosmological parameters from N-body simulations with Variational Bayesian Neural Networks [0.0]
Multiplicative normalizing flows (MNFs) are a family of approximate posteriors for the parameters of BNNs.
We have compared MNFs with respect to the standard BNNs, and the flipout estimator.
MNFs provide a more realistic predictive distribution that is closer to the true posterior, mitigating the bias introduced by the variational approximation.
arXiv Detail & Related papers (2023-01-09T16:07:48Z)
- Wide Mean-Field Bayesian Neural Networks Ignore the Data [29.050507540280922]
We show that mean-field variational inference entirely fails to model the data when the network width is large.
We show that the optimal approximate posterior need not tend to the prior if the activation function is not odd.
arXiv Detail & Related papers (2022-02-23T18:21:50Z)
- A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNNs) have emerged as a vehicle for applying deep network architectures to graph and relational data.
In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem.
We introduce a newly designed reward function that trades some bias for reduced variance and avoids unstable, possibly unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z)
- Online Limited Memory Neural-Linear Bandits with Likelihood Matching [53.18698496031658]
We study neural-linear bandits for solving problems where both exploration and representation learning play an important role.
We propose a likelihood matching algorithm that is resilient to catastrophic forgetting and is completely online.
arXiv Detail & Related papers (2021-02-07T14:19:07Z)
- Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit [47.324627920761685]
We use recent theoretical advances that characterize the function-space prior of an ensemble of infinitely-wide NNs as a Gaussian process.
This gives us a better understanding of the implicit prior NNs place on function space.
We also examine the calibration of previous approaches to classification with the NNGP.
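As a concrete instance of that GP characterization, here is a small sketch of the standard infinite-width (NNGP) kernel recursion for a fully-connected ReLU network, using the well-known arc-cosine moment formula; the depth and the sigma_w^2, sigma_b^2 values are illustrative assumptions, not taken from the paper.

    # Standard NNGP kernel recursion for ReLU layers (illustrative sketch).
    import numpy as np

    def nngp_kernel(x1, x2, depth=3, sigma_w2=2.0, sigma_b2=0.1):
        """Prior covariance K(x1, x2) of an infinitely wide ReLU network."""
        d = x1.shape[0]
        # Layer-0 covariances from the inputs
        k12 = sigma_b2 + sigma_w2 * (x1 @ x2) / d
        k11 = sigma_b2 + sigma_w2 * (x1 @ x1) / d
        k22 = sigma_b2 + sigma_w2 * (x2 @ x2) / d
        for _ in range(depth):
            # E[relu(u1) relu(u2)] for (u1, u2) Gaussian with the current covariances
            theta = np.arccos(np.clip(k12 / np.sqrt(k11 * k22), -1.0, 1.0))
            k12_new = sigma_b2 + sigma_w2 / (2 * np.pi) * np.sqrt(k11 * k22) * (
                np.sin(theta) + (np.pi - theta) * np.cos(theta))
            # Diagonal terms: E[relu(u)^2] = Var(u) / 2
            k11 = sigma_b2 + sigma_w2 * k11 / 2.0
            k22 = sigma_b2 + sigma_w2 * k22 / 2.0
            k12 = k12_new
        return k12

    x = np.array([1.0, -0.5, 2.0])
    print(nngp_kernel(x, x), nngp_kernel(x, -x))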
arXiv Detail & Related papers (2020-10-14T18:41:54Z)
- Exact posterior distributions of wide Bayesian neural networks [51.20413322972014]
We show that the exact BNN posterior converges (weakly) to the one induced by the GP limit of the prior.
For empirical validation, we show how to generate exact samples from a finite BNN on a small dataset via rejection sampling.
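A generic sketch of that rejection-sampling idea (the tiny network, prior, and data below are illustrative assumptions, not the paper's setup): because a classification likelihood is bounded by one, drawing weights from the prior and accepting each draw with probability p(D | w) yields exact posterior samples.

    # Exact posterior samples for a tiny BNN via rejection sampling (illustrative).
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(8, 2))                  # tiny toy dataset
    y = (X[:, 0] + X[:, 1] > 0).astype(float)    # binary labels in {0, 1}

    def likelihood(w):
        """p(D | w) for a one-hidden-layer ReLU net with 9 weights."""
        W1, w2 = w[:6].reshape(2, 3), w[6:]
        logits = np.maximum(0.0, X @ W1) @ w2
        p = 1.0 / (1.0 + np.exp(-logits))
        return np.prod(np.where(y == 1.0, p, 1.0 - p))

    def exact_posterior_sample():
        """Rejection sampling with the N(0, I) prior as the proposal."""
        while True:
            w = rng.normal(size=9)               # draw from the prior
            if rng.uniform() < likelihood(w):    # accept with probability p(D | w)
                return w

    samples = np.stack([exact_posterior_sample() for _ in range(5)])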
arXiv Detail & Related papers (2020-06-18T13:57:04Z)
- Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks [65.24701908364383]
We show that a sufficient condition for calibrated uncertainty on a ReLU network is "to be a bit Bayesian".
We further validate these findings empirically via various standard experiments using common deep ReLU networks and Laplace approximations.
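The "bit of Bayesian" in that line of work is typically realized as a Gaussian (e.g. Laplace) approximation over only the last layer's weights; below is a generic last-layer Laplace sketch for binary classification (a standard construction with assumed inputs such as features, w_map, and prior_prec, not code from the paper).

    # Generic last-layer Laplace approximation for binary classification
    # (standard construction; `features`, `w_map`, `prior_prec` are assumed inputs).
    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def last_layer_laplace(features, w_map, prior_prec=1.0):
        """Gaussian posterior N(w_map, H^{-1}) over the last-layer weights."""
        p = sigmoid(features @ w_map)
        # Hessian of the negative log posterior at the MAP estimate
        H = (features.T @ (features * (p * (1.0 - p))[:, None])
             + prior_prec * np.eye(features.shape[1]))
        return w_map, np.linalg.inv(H)

    def predict(phi_test, w_map, cov):
        """Probit-approximated predictive probability for test features."""
        mean = phi_test @ w_map
        var = np.sum((phi_test @ cov) * phi_test, axis=1)
        kappa = 1.0 / np.sqrt(1.0 + np.pi * var / 8.0)
        return sigmoid(kappa * mean)             # tends to 0.5 as the variance grows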
arXiv Detail & Related papers (2020-02-24T08:52:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.