Fast Predictive Uncertainty for Classification with Bayesian Deep
Networks
- URL: http://arxiv.org/abs/2003.01227v4
- Date: Tue, 31 May 2022 06:50:20 GMT
- Title: Fast Predictive Uncertainty for Classification with Bayesian Deep
Networks
- Authors: Marius Hobbhahn, Agustinus Kristiadi, Philipp Hennig
- Abstract summary: In Bayesian Deep Learning, distributions over the output of classification neural networks are approximated by first constructing a Gaussian distribution over the weights, then sampling from it to obtain a distribution over the softmax outputs.
We construct a Dirichlet approximation of this softmax output distribution, which yields an analytic map between Gaussian distributions in logit space and Dirichlet distributions in the output space.
We demonstrate that the resulting Dirichlet distribution has multiple advantages, in particular, more efficient computation of the uncertainty estimate and scaling to large datasets and networks like ImageNet and DenseNet.
- Score: 25.821401066200504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Bayesian Deep Learning, distributions over the output of classification
neural networks are often approximated by first constructing a Gaussian
distribution over the weights, then sampling from it to obtain a distribution
over the softmax outputs. This is costly. We reconsider old work (Laplace
Bridge) to construct a Dirichlet approximation of this softmax output
distribution, which yields an analytic map between Gaussian distributions in
logit space and Dirichlet distributions (the conjugate prior to the Categorical
distribution) in the output space. Importantly, the vanilla Laplace Bridge
comes with certain limitations. We analyze those and suggest a simple solution
that compares favorably to other commonly used estimates of the
softmax-Gaussian integral. We demonstrate that the resulting Dirichlet
distribution has multiple advantages, in particular, more efficient computation
of the uncertainty estimate and scaling to large datasets and networks like
ImageNet and DenseNet. We further demonstrate the usefulness of this Dirichlet
approximation by using it to construct a lightweight uncertainty-aware output
ranking for ImageNet.
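To make the analytic map concrete, below is a minimal sketch of the forward Laplace Bridge for a diagonal Gaussian over the logits, using the commonly stated closed-form expression for the Dirichlet concentration parameters. The function name, the variable names, and the small usage example are illustrative; the paper's proposed correction for the limitations of the vanilla Bridge is not reproduced here.

```python
import numpy as np

def laplace_bridge(mu, sigma_sq):
    """Forward Laplace Bridge: map a diagonal Gaussian N(mu, diag(sigma_sq))
    over the K logits to a Dirichlet(alpha) over the softmax outputs.

    mu, sigma_sq : arrays of shape (K,) -- logit means and marginal variances.
    Returns the K Dirichlet concentration parameters alpha.
    """
    K = mu.shape[0]
    # Commonly stated closed form:
    # alpha_k = (1 - 2/K + exp(mu_k) * sum_l exp(-mu_l) / K^2) / sigma_kk
    sum_exp_neg = np.sum(np.exp(-mu))
    alpha = (1.0 - 2.0 / K + np.exp(mu) * sum_exp_neg / K**2) / sigma_sq
    return alpha

# Illustrative 3-class example: a logit Gaussian, e.g. from a Laplace approximation.
mu = np.array([2.0, 0.5, -1.0])
sigma_sq = np.array([0.3, 0.5, 0.4])
alpha = laplace_bridge(mu, sigma_sq)
mean_probs = alpha / alpha.sum()   # Dirichlet mean approximates the expected softmax output
print("alpha:", alpha)
print("mean probabilities:", mean_probs)
```

Because the map is a per-class closed-form expression, the uncertainty estimate costs a single forward pass plus O(K) arithmetic instead of many softmax samples, which is what makes the lightweight uncertainty-aware output ranking on ImageNet mentioned above feasible.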
Related papers
- Hessian-Free Laplace in Bayesian Deep Learning [44.16006844888796]
The Hessian-free Laplace (HFL) approximation uses the curvature of both the log posterior and the network prediction to estimate the predictive variance.
We show that, under standard assumptions of LA in Bayesian deep learning, HFL targets the same variance as LA, and can be efficiently amortized in a pre-trained network.
arXiv Detail & Related papers (2024-03-15T20:47:39Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- On diffusion-based generative models and their error bounds: The log-concave case with full convergence estimates [5.13323375365494]
We provide theoretical guarantees for the convergence behaviour of diffusion-based generative models under strongly log-concave data.
Our class of functions used for score estimation is made of Lipschitz continuous functions avoiding any Lipschitzness assumption on the score function.
This approach yields the best known convergence rate for our sampling algorithm.
arXiv Detail & Related papers (2023-11-22T18:40:45Z)
- Generalized Schrödinger Bridge Matching [54.171931505066]
The Generalized Schrödinger Bridge (GSB) problem setup is prevalent in many scientific areas both within and beyond machine learning.
We propose Generalized Schrödinger Bridge Matching (GSBM), a new matching algorithm inspired by recent advances.
We show that such a generalization can be cast as solving conditional optimal control, for which variational approximations can be used.
arXiv Detail & Related papers (2023-10-03T17:42:11Z)
- Compound Batch Normalization for Long-tailed Image Classification [77.42829178064807]
We propose a compound batch normalization method based on a Gaussian mixture.
It can model the feature space more comprehensively and reduce the dominance of head classes.
The proposed method outperforms existing methods on long-tailed image classification.
arXiv Detail & Related papers (2022-12-02T07:31:39Z)
- Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z)
- On the capacity of deep generative networks for approximating distributions [8.798333793391544]
We prove that neural networks can transform a one-dimensional source distribution to a distribution arbitrarily close to a high-dimensional target distribution in Wasserstein distances.
It is shown that the approximation error grows at most linearly with the ambient dimension.
$f$-divergences are less adequate than Wasserstein distances as metrics of distributions for generating samples.
arXiv Detail & Related papers (2021-01-29T01:45:02Z)
- Bayesian Deep Learning via Subnetwork Inference [2.2835610890984164]
We show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors.
This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets.
arXiv Detail & Related papers (2020-10-28T01:10:11Z)
- Mean-Field Approximation to Gaussian-Softmax Integral with Application to Uncertainty Estimation [23.38076756988258]
We propose a new single-model based approach to quantify uncertainty in deep neural networks.
We use a mean-field approximation formula to compute an analytically intractable integral.
Empirically, the proposed approach performs competitively when compared to state-of-the-art methods; a generic sketch of such a mean-field estimate is given after this list.
arXiv Detail & Related papers (2020-06-13T07:32:38Z)
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
- Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO), for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)
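As referenced in the mean-field entry above, here is a minimal sketch of a probit-style mean-field estimate of the Gaussian-softmax integral, one of the commonly used estimates that the Laplace Bridge paper compares against. The scaling constant pi/8 and the function name are assumptions for illustration and may differ from the exact formula derived in that paper.

```python
import numpy as np

def mean_field_softmax(mu, sigma_sq, lam=np.pi / 8):
    """Probit-style mean-field estimate of E[softmax(z)] for z ~ N(mu, diag(sigma_sq)).

    Each logit mean is shrunk by its own variance before the softmax:
        E[softmax(z)]_k ~= softmax(mu / sqrt(1 + lam * sigma_sq))_k
    The constant lam = pi/8 comes from the classic probit approximation to the
    logistic sigmoid; it is an assumption here, not necessarily the paper's choice.
    """
    scaled = mu / np.sqrt(1.0 + lam * sigma_sq)
    scaled = scaled - scaled.max()   # subtract max for numerical stability
    exps = np.exp(scaled)
    return exps / exps.sum()

# Illustrative example with the same 3-class logit Gaussian as in the earlier sketch.
mu = np.array([2.0, 0.5, -1.0])
sigma_sq = np.array([0.3, 0.5, 0.4])
print(mean_field_softmax(mu, sigma_sq))
```

Like the Laplace Bridge, this gives a sampling-free estimate of the predictive class probabilities, but it returns only a point estimate rather than a full Dirichlet distribution over them.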
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.