On Batch Normalisation for Approximate Bayesian Inference
- URL: http://arxiv.org/abs/2012.13220v1
- Date: Thu, 24 Dec 2020 12:40:11 GMT
- Title: On Batch Normalisation for Approximate Bayesian Inference
- Authors: Jishnu Mukhoti, Puneet K. Dokania, Philip H.S. Torr, Yarin Gal
- Abstract summary: We show that batch-normalisation does not affect the optimum of the evidence lower bound (ELBO).
We also study the Monte Carlo Batch Normalisation (MCBN) algorithm, proposed as an approximate inference technique parallel to MC Dropout.
- Score: 102.94525205971873
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study batch normalisation in the context of variational inference methods
in Bayesian neural networks, such as mean-field or MC Dropout. We show that
batch-normalisation does not affect the optimum of the evidence lower bound
(ELBO). Furthermore, we study the Monte Carlo Batch Normalisation (MCBN)
algorithm, proposed as an approximate inference technique parallel to MC
Dropout, and show that for larger batch sizes, MCBN fails to capture epistemic
uncertainty. Finally, we provide insights into what is required to fix this
failure, namely having to view the mini-batch size as a variational parameter
in MCBN. We comment on the asymptotics of the ELBO with respect to this
variational parameter, showing that as dataset size increases towards infinity,
the batch-size must increase towards infinity as well for MCBN to be a valid
approximate inference technique.
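Since the abstract frames MCBN as an approximate-inference counterpart of MC Dropout, a minimal sketch of the two test-time procedures may help make the comparison concrete. This is not the paper's code: the toy network, the training data, the mini-batch size, and the number of stochastic forward passes are assumptions made here for illustration, and the training loop is omitted.
```python
# Illustrative sketch of the test-time procedures for MC Dropout and MCBN.
# Not the paper's implementation: the toy network, training data, mini-batch
# size, and number of stochastic forward passes are assumptions made here.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy regression data standing in for a real training set.
X_train = torch.randn(512, 8)
y_train = X_train.sum(dim=1, keepdim=True) + 0.1 * torch.randn(512, 1)

net = nn.Sequential(
    nn.Linear(8, 64),
    nn.BatchNorm1d(64),   # source of stochasticity for MCBN
    nn.ReLU(),
    nn.Dropout(p=0.1),    # source of stochasticity for MC Dropout
    nn.Linear(64, 1),
)
# (Assume `net` has been trained on X_train, y_train; training loop omitted.)


def mc_dropout_predict(net, x_test, n_samples=50):
    """Keep only the dropout layers stochastic and average the forward passes."""
    net.eval()
    for m in net.modules():
        if isinstance(m, nn.Dropout):
            m.train()  # resample dropout masks at test time
    with torch.no_grad():
        preds = torch.stack([net(x_test) for _ in range(n_samples)])
    return preds.mean(0), preds.std(0)


def mcbn_predict(net, x_test, X_train, batch_size=32, n_samples=50):
    """MCBN-style prediction: on every forward pass, batch norm uses the
    statistics of a freshly sampled training mini-batch, so repeating the
    pass over different mini-batches yields a predictive distribution."""
    net.eval()
    for m in net.modules():
        if isinstance(m, nn.BatchNorm1d):
            m.train()  # normalise with current mini-batch statistics
    preds = []
    with torch.no_grad():
        for _ in range(n_samples):
            idx = torch.randint(0, X_train.shape[0], (batch_size,))
            out = net(torch.cat([X_train[idx], x_test], dim=0))
            preds.append(out[batch_size:])  # keep only the test outputs
    preds = torch.stack(preds)
    return preds.mean(0), preds.std(0)


x_test = torch.randn(4, 8)
print(mc_dropout_predict(net, x_test))
print(mcbn_predict(net, x_test, X_train))
```
The asymptotic argument in the abstract is visible in `mcbn_predict`: as `batch_size` approaches the size of the training set, the sampled mini-batch statistics concentrate around the full-data statistics, the forward passes become effectively deterministic, and the reported spread no longer reflects epistemic uncertainty, which is why the mini-batch size must be treated as a variational parameter and grown with the dataset.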
Related papers
- Adaptive Bayesian Multivariate Spline Knot Inference with Prior Specifications on Model Complexity [7.142818102750932]
In this article, we propose a fully Bayesian approach for knot inference in multivariate spline regression.
Experiments demonstrate the strong performance of the algorithm, especially in fitting functions with jump discontinuities.
arXiv Detail & Related papers (2024-05-22T05:14:52Z)
- Towards Practical Preferential Bayesian Optimization with Skew Gaussian Processes [8.198195852439946]
We study preferential Bayesian optimization (BO) where reliable feedback is limited to pairwise comparison called duels.
An important challenge in preferential BO, which uses the preferential Gaussian process (GP) model to represent flexible preference structure, is that the posterior distribution is a computationally intractable skew GP.
We develop a new method that achieves both high computational efficiency and low sample complexity, and then demonstrate its effectiveness through extensive numerical experiments.
arXiv Detail & Related papers (2023-02-03T03:02:38Z)
- GFlowOut: Dropout with Generative Flow Networks [76.59535235717631]
Monte Carlo Dropout has been widely used as a relatively cheap way for approximate inference.
Recent works show that the dropout mask can be viewed as a latent variable, which can be inferred with variational inference.
GFlowOut leverages the recently proposed probabilistic framework of Generative Flow Networks (GFlowNets) to learn the posterior distribution over dropout masks.
arXiv Detail & Related papers (2022-10-24T03:00:01Z)
- Data Subsampling for Bayesian Neural Networks [0.0]
Penalty Bayesian Neural Networks (PBNNs) are a new algorithm that allows the likelihood to be evaluated using subsampled batch data.
We show that PBNN achieves good predictive performance even for small mini-batch sizes of data.
arXiv Detail & Related papers (2022-10-17T14:43:35Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
- Double Forward Propagation for Memorized Batch Normalization [68.34268180871416]
Batch Normalization (BN) has been a standard component in designing deep neural networks (DNNs).
We propose a memorized batch normalization (MBN) which considers multiple recent batches to obtain more accurate and robust statistics.
Compared to related methods, the proposed MBN exhibits consistent behaviors in both training and inference; a schematic sketch of the shared idea of pooling statistics across recent batches is given after this list.
arXiv Detail & Related papers (2020-10-10T08:48:41Z)
- Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization [126.6252371899064]
Moving Average Batch Normalization (MABN) is a novel normalization method.
We show that MABN can completely restore the performance of vanilla BN in small batch cases.
Our experiments demonstrate the effectiveness of MABN in multiple computer vision tasks including ImageNet and COCO.
arXiv Detail & Related papers (2020-01-19T14:41:22Z)
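The last two entries, on memorized and moving-average batch normalisation (referenced from the MBN summary above), both address the noisiness of statistics estimated from a single small mini-batch by pooling information across batches. The sketch below illustrates only that shared idea under assumptions made here (a fixed-length memory of recent batch statistics and plain averaging); it is not the MBN or MABN algorithm from the papers.
```python
# Schematic sketch of normalising with statistics pooled over several recent
# mini-batches, the idea shared by memorized BN (MBN) and moving-average BN
# (MABN). This is NOT either paper's algorithm: the buffer length and the
# plain averaging of remembered statistics are assumptions for illustration.
from collections import deque

import torch
import torch.nn as nn


class PooledBatchNorm1d(nn.Module):
    def __init__(self, num_features, memory=4, eps=1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(num_features))
        self.bias = nn.Parameter(torch.zeros(num_features))
        self.eps = eps
        self.means = deque(maxlen=memory)  # statistics of recent batches
        self.vars = deque(maxlen=memory)

    def forward(self, x):
        if self.training:
            # Remember the current batch's statistics alongside recent ones.
            self.means.append(x.mean(dim=0).detach())
            self.vars.append(x.var(dim=0, unbiased=False).detach())
        # Pool the remembered statistics; with memory=1 this reduces to plain
        # batch statistics, while larger memories give smoother estimates that
        # are reused unchanged at inference time.
        mean = torch.stack(list(self.means)).mean(dim=0)
        var = torch.stack(list(self.vars)).mean(dim=0)
        return self.weight * (x - mean) / torch.sqrt(var + self.eps) + self.bias


bn = PooledBatchNorm1d(8, memory=4)
for _ in range(6):                     # a few "training" batches fill the memory
    bn(torch.randn(16, 8))
bn.eval()
print(bn(torch.randn(2, 8)).shape)     # inference reuses the pooled statistics
```
Because the pooled statistics are the same quantities used during training and at inference, the layer behaves consistently in both modes, which is the property the MBN summary highlights.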
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.