Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise
- URL: http://arxiv.org/abs/2103.08497v2
- Date: Tue, 16 Mar 2021 07:46:18 GMT
- Title: Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise
- Authors: Jannik Schmitt and Stefan Roth
- Abstract summary: We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
- Score: 51.080620762639434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To adopt neural networks in safety-critical domains, knowing whether we can
trust their predictions is crucial. Bayesian neural networks (BNNs) provide
uncertainty estimates by averaging predictions with respect to the posterior
weight distribution. Variational inference methods for BNNs approximate the
intractable weight posterior with a tractable distribution, yet mostly rely on
sampling from the variational distribution during training and inference.
Recent sampling-free approaches offer an alternative, but incur a significant
parameter overhead. We here propose a more efficient parameterization of the
posterior approximation for sampling-free variational inference that relies on
the distribution induced by multiplicative Gaussian activation noise. This
allows us to combine parameter efficiency with the benefits of sampling-free
variational inference. Our approach yields competitive results for standard
regression problems and scales well to large-scale image classification tasks
including ImageNet.
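To make the abstract's idea concrete, the following is a minimal sketch of sampling-free moment propagation through a linear layer with multiplicative Gaussian activation noise. It is an illustration under stated assumptions, not the authors' implementation; the names MNVILinear and log_alpha are my own.

```python
# Minimal sketch (not the authors' code) of sampling-free moment propagation through
# a linear layer whose weight uncertainty is induced by multiplicative Gaussian noise
# eps_j ~ N(1, alpha_j) on the input activations. For z_i = sum_j W_ij * a_j * eps_j,
# with a_j having mean m_j and variance v_j:
#   E[z_i]   = sum_j W_ij * m_j
#   Var[z_i] = sum_j W_ij^2 * ((m_j^2 + v_j) * alpha_j + v_j)
import torch
import torch.nn as nn


class MNVILinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # One learnable noise variance per input activation: alpha_j = exp(log_alpha_j).
        self.log_alpha = nn.Parameter(torch.full((in_features,), -3.0))

    def forward(self, mean, var=None):
        if var is None:  # first layer: deterministic inputs
            var = torch.zeros_like(mean)
        alpha = self.log_alpha.exp()
        w_sq = self.weight.pow(2)
        out_mean = mean @ self.weight.t() + self.bias
        out_var = ((mean.pow(2) + var) * alpha + var) @ w_sq.t()
        return out_mean, out_var


# Usage: propagate mean and variance analytically instead of sampling weights.
m, v = MNVILinear(4, 2)(torch.randn(8, 4))
m2, v2 = MNVILinear(2, 3)(m, v)
```

A full sampling-free pipeline would alternate such layers with an approximate moment propagation through the nonlinearities and add a KL term between the induced weight posterior and the prior to the training objective.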
Related papers
- A Framework for Variational Inference of Lightweight Bayesian Neural
Networks with Heteroscedastic Uncertainties [0.31457219084519006]
We show that both the heteroscedastic aleatoric and epistemic variance can be embedded into the variances of learned BNN parameters.
We introduce a relatively simple framework for sampling-free variational inference suitable for lightweight BNNs.
arXiv Detail & Related papers (2024-02-22T13:24:43Z)
- Uncertainty Quantification via Stable Distribution Propagation [60.065272548502]
We propose a new approach for propagating stable probability distributions through neural networks.
Our method is based on local linearization, which we show to be an optimal approximation in terms of total variation distance for the ReLU non-linearity (a minimal sketch of this idea follows the list below).
arXiv Detail & Related papers (2024-02-13T09:40:19Z)
- A Compact Representation for Bayesian Neural Networks By Removing Permutation Symmetry [22.229664343428055]
We show that the role of permutations can be meaningfully quantified by a number-of-transpositions metric.
We then show that the recently proposed rebasin method allows us to summarize HMC samples into a compact representation.
We show that this compact representation allows us to compare trained BNNs directly in weight space across sampling methods and variational inference.
arXiv Detail & Related papers (2023-12-31T23:57:05Z)
- Tractable Function-Space Variational Inference in Bayesian Neural Networks [72.97620734290139]
A popular approach for estimating the predictive uncertainty of neural networks is to define a prior distribution over the network parameters.
We propose a scalable function-space variational inference method that allows incorporating prior information.
We show that the proposed method leads to state-of-the-art uncertainty estimation and predictive performance on a range of prediction tasks.
arXiv Detail & Related papers (2023-12-28T18:33:26Z)
- Implicit Variational Inference for High-Dimensional Posteriors [7.924706533725115]
In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution.
We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors.
Our approach introduces novel bounds for approximate inference using implicit distributions by locally linearising the neural sampler.
arXiv Detail & Related papers (2023-10-10T14:06:56Z)
- Collapsed Inference for Bayesian Deep Learning [36.1725075097107]
We introduce a novel collapsed inference scheme that performs Bayesian model averaging using collapsed samples.
A collapsed sample represents uncountably many models drawn from the approximate posterior.
Our proposed use of collapsed samples achieves a balance between scalability and accuracy.
arXiv Detail & Related papers (2023-06-16T08:34:42Z)
- Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called the Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
arXiv Detail & Related papers (2022-07-04T15:41:02Z)
- Kalman Bayesian Neural Networks for Closed-form Online Learning [5.220940151628734]
We propose a novel approach for BNN learning via closed-form Bayesian inference.
The calculation of the predictive distribution of the output and the update of the weight distribution are treated as Bayesian filtering and smoothing problems.
This allows closed-form expressions for training the network's parameters in a sequential/online fashion without gradient descent.
arXiv Detail & Related papers (2021-10-03T07:29:57Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolution Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z)
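The local-linearization idea mentioned in the Uncertainty Quantification via Stable Distribution Propagation entry above can be sketched in a few lines. This is my own illustration, not code from that paper: a Gaussian N(mean, var) is pushed through ReLU by linearizing around the mean, which is the kind of step a sampling-free forward pass needs between linear layers.

```python
# Minimal illustration (not from the papers): moment propagation through ReLU via
# local linearization around the input mean,
#   relu(x) ~= relu(mean) + 1[mean > 0] * (x - mean),
# so an input N(mean, var) maps approximately to N(relu(mean), 1[mean > 0] * var).
import torch


def relu_moments_linearized(mean: torch.Tensor, var: torch.Tensor):
    gate = (mean > 0).to(mean.dtype)  # derivative of ReLU evaluated at the mean
    return torch.relu(mean), gate * var


# Example: combine with the linear-layer sketch above for a sampling-free forward pass.
m, v = relu_moments_linearized(torch.randn(8, 2), torch.rand(8, 2))
```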