Related papers: On the detrimental effect of invariances in the likelihood for variational inference

URL: http://arxiv.org/abs/2209.07157v1
Date: Thu, 15 Sep 2022 09:13:30 GMT
Title: On the detrimental effect of invariances in the likelihood for variational inference
Authors: Richard Kurle, Ralf Herbrich, Tim Januschowski, Yuyang Wang, Jan Gasthaus
Abstract summary: Variational Bayesian posterior inference often requires simplifying approximations such as mean-field parametrisation to ensure tractability. Prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes.
Score: 21.912271882110986
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Variational Bayesian posterior inference often requires simplifying approximations such as mean-field parametrisation to ensure tractability. However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes. In this work, we show that invariances in the likelihood function of over-parametrised models contribute to this phenomenon because these invariances complicate the structure of the posterior by introducing discrete and/or continuous modes which cannot be well approximated by Gaussian mean-field distributions. In particular, we show that the mean-field approximation has an additional gap in the evidence lower bound compared to a purpose-built posterior that takes into account the known invariances. Importantly, this invariance gap is not constant; it vanishes as the approximation reverts to the prior. We proceed by first considering translation invariances in a linear model with a single data point in detail. We show that, while the true posterior can be constructed from a mean-field parametrisation, this is achieved only if the objective function takes into account the invariance gap. Then, we transfer our analysis of the linear model to neural networks. Our analysis provides a framework for future work to explore solutions to the invariance problem.
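The claim that the gap vanishes as the approximation reverts to the prior can be illustrated with a small numerical sketch. This is not the paper's construction: the toy model below, y = (w1 + w2) * x + noise with a standard normal prior on the weights, and the function name mean_field_kl_gap are assumptions chosen for illustration. The likelihood depends only on w1 + w2, so it is invariant to the translation (w1 + t, w2 - t). The sketch computes KL(q* || p) between the best fully factorised Gaussian q* and the exact Gaussian posterior p, using the standard result that q* matches the posterior mean and has variances 1 / Lambda_ii, where Lambda is the posterior precision.

```python
import numpy as np

def mean_field_kl_gap(x, noise_var=1.0):
    """KL(q* || p) for the best factorised Gaussian q* against the exact
    posterior p of a hypothetical toy model: y = (w1 + w2) * x + noise,
    prior w ~ N(0, I), one observation. The value of y shifts only the
    posterior mean, which q* matches, so it does not enter the gap."""
    a = x ** 2 / noise_var
    # Posterior precision = prior precision + likelihood curvature.
    precision = np.eye(2) + a * np.ones((2, 2))
    # With matched means and q* variances 1 / Lambda_ii, the KL reduces to
    # 0.5 * (sum_i log Lambda_ii - log det Lambda).
    _, logdet = np.linalg.slogdet(precision)
    return 0.5 * (np.sum(np.log(np.diag(precision))) - logdet)

if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 3.0):
        print(f"x = {x:3.1f}  gap = {mean_field_kl_gap(x):.6f}")
```

When x = 0 the data carry no information, the posterior equals the prior, and the gap is exactly zero. The gap grows as the observation becomes more informative along the w1 + w2 direction, because the exact posterior becomes more strongly correlated while q* cannot represent that correlation.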

Related papers

Reparameterization invariance in approximate Bayesian inference [32.88960624085645]
We develop a new geometric view of reparametrizations from which we explain the success of linearization. We demonstrate that these reparameterization invariance properties can be extended to the original neural network predictive.
arXiv Detail & Related papers (2024-06-05T14:49:15Z)
Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$. We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from
arXiv Detail & Related papers (2024-05-31T16:18:46Z)
Reliable amortized variational inference with physics-based latent distribution correction [0.4588028371034407]
A neural network is trained to approximate the posterior distribution over existing pairs of model and data. The accuracy of this approach relies on the availability of high-fidelity training data. We show that our correction step improves the robustness of amortized variational inference with respect to changes in number of source experiments, noise variance, and shifts in the prior distribution.
arXiv Detail & Related papers (2022-07-24T02:38:54Z)
Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data. Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes. Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z)
Variational Causal Networks: Approximate Bayesian Inference over Causal Structures [132.74509389517203]
We introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs. In experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.
arXiv Detail & Related papers (2021-06-14T17:52:49Z)
Reducing the Amortization Gap in Variational Autoencoders: A Bayesian Random Function Approach [38.45568741734893]
Inference in our GP model is done by a single feed-forward pass through the network, significantly faster than semi-amortized methods. We show that our approach attains higher test data likelihood than the state of the art on several benchmark datasets.
arXiv Detail & Related papers (2021-02-05T13:01:12Z)
Understanding Variational Inference in Function-Space [20.940162027560408]
We highlight some advantages and limitations of employing the Kullback-Leibler divergence in this setting. We propose (featurized) Bayesian linear regression as a benchmark for function-space inference methods that directly measures approximation quality.
arXiv Detail & Related papers (2020-11-18T17:42:01Z)
Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters. We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets. Part of the challenge of learning robust models lies in the influence of unobserved confounders. We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks [46.677567663908185]
Variational Bayesian Inference is a popular methodology for approximating posteriors over Bayesian neural network weights. Recent work has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. We find that by decomposing these variational parameters into a low-rank factorization, we can make our variational approximation more compact without decreasing the models' performance.
arXiv Detail & Related papers (2020-02-07T07:33:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.