On the detrimental effect of invariances in the likelihood for
variational inference
- URL: http://arxiv.org/abs/2209.07157v1
- Date: Thu, 15 Sep 2022 09:13:30 GMT
- Title: On the detrimental effect of invariances in the likelihood for
variational inference
- Authors: Richard Kurle, Ralf Herbrich, Tim Januschowski, Yuyang Wang, Jan Gasthaus
- Abstract summary: Variational Bayesian posterior inference often requires simplifying approximations such as mean-field parametrisation to ensure tractability.
However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational Bayesian posterior inference often requires simplifying
approximations such as mean-field parametrisation to ensure tractability.
However, prior work has associated the variational mean-field approximation for
Bayesian neural networks with underfitting in the case of small datasets or
large model sizes. In this work, we show that invariances in the likelihood
function of over-parametrised models contribute to this phenomenon because
these invariances complicate the structure of the posterior by introducing
discrete and/or continuous modes which cannot be well approximated by Gaussian
mean-field distributions. In particular, we show that the mean-field
approximation has an additional gap in the evidence lower bound compared to a
purpose-built posterior that takes into account the known invariances.
Importantly, this invariance gap is not constant; it vanishes as the
approximation reverts to the prior. We proceed by first considering translation
invariances in a linear model with a single data point in detail. We show that,
while the true posterior can be constructed from a mean-field parametrisation,
this is achieved only if the objective function takes into account the
invariance gap. Then, we transfer our analysis of the linear model to neural
networks. Our analysis provides a framework for future work to explore
solutions to the invariance problem.
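
As a rough illustration of the kind of translation invariance the abstract refers to, the sketch below uses a two-weight linear model with a single observation. The concrete setup (Gaussian noise, the helper `log_likelihood`, the parameter `noise_std`, and the vectors `x`, `w`, `v`) is an assumption made for illustration and is not taken from the paper. It only shows that shifting the weights along a direction orthogonal to the input leaves the likelihood unchanged, so the likelihood is constant along a whole line in weight space.

```python
import numpy as np

def log_likelihood(w, x, y, noise_std=1.0):
    """Gaussian log-likelihood of one observation y under y ~ N(w.x, noise_std^2).

    Hypothetical helper for illustration; not the paper's code.
    """
    resid = y - w @ x
    return -0.5 * (resid / noise_std) ** 2 - np.log(noise_std * np.sqrt(2 * np.pi))

# Single data point (assumed values).
x = np.array([1.0, 1.0])
y = 2.0

# A weight vector that fits the data point exactly: w.x = 2.
w = np.array([0.5, 1.5])

# Direction orthogonal to x: moving along v does not change w.x.
v = np.array([1.0, -1.0])

# The log-likelihood is the same for every shift t along v.
shifts = [0.0, 1.0, -3.5]
values = [log_likelihood(w + t * v, x, y) for t in shifts]

# Moving along x itself does change the prediction, and so the likelihood.
shifted_along_x = log_likelihood(w + x, x, y)

print(values)
print(shifted_along_x)
```

With only one data point and two weights, the data constrain a single direction in weight space. The orthogonal direction is governed by the prior alone, which is the over-parametrised regime the abstract describes.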