Amortized Variational Inference: When and Why?
- URL: http://arxiv.org/abs/2307.11018v4
- Date: Thu, 23 May 2024 23:36:50 GMT
- Title: Amortized Variational Inference: When and Why?
- Authors: Charles C. Margossian, David M. Blei
- Abstract summary: Amortized variational inference (A-VI) learns a common inference function, which maps each observation to its corresponding latent variable's approximate posterior.
We derive necessary, sufficient, and verifiable conditions on a latent variable model under which A-VI can attain F-VI's optimal solution.
- Score: 17.1222896154385
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In a probabilistic latent variable model, factorized (or mean-field) variational inference (F-VI) fits a separate parametric distribution for each latent variable. Amortized variational inference (A-VI) instead learns a common inference function, which maps each observation to its corresponding latent variable's approximate posterior. Typically, A-VI is used as a step in the training of variational autoencoders, but it stands to reason that A-VI could also be used as a general alternative to F-VI. In this paper we study when and why A-VI can be used for approximate Bayesian inference. We derive necessary, sufficient, and verifiable conditions on a latent variable model under which A-VI can attain F-VI's optimal solution, thereby closing the amortization gap. We prove these conditions are uniquely verified by simple hierarchical models, a broad class that encompasses many models in machine learning. We then show, on a broader class of models, how to expand the domain of A-VI's inference function to improve its solution, and we provide examples, e.g., hidden Markov models, where the amortization gap cannot be closed.
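To make the F-VI/A-VI contrast concrete, here is a minimal sketch (our illustration, not the paper's code) for a model with one local latent variable z_i per observation x_i: F-VI keeps a separate variational parameter pair per data point, while A-VI fits a single shared inference function. All names, layer sizes, and dimensions are illustrative assumptions.

```python
# Minimal sketch of the F-VI vs. A-VI parameterizations (PyTorch).
# All names and sizes are illustrative, not taken from the paper.
import torch
import torch.nn as nn

n, x_dim, z_dim = 1000, 5, 2
x = torch.randn(n, x_dim)  # toy observations x_1, ..., x_n

# F-VI: a separate Gaussian factor q(z_i) = N(mu_i, sigma_i^2) per latent.
# The number of variational parameters grows linearly with n.
fvi_mu = nn.Parameter(torch.zeros(n, z_dim))
fvi_log_sigma = nn.Parameter(torch.zeros(n, z_dim))

# A-VI: a single inference function f_phi with x_i |-> (mu_i, log_sigma_i).
# The parameter count is fixed, independent of n.
inference_net = nn.Sequential(
    nn.Linear(x_dim, 32), nn.ReLU(), nn.Linear(32, 2 * z_dim)
)
mu, log_sigma = inference_net(x).chunk(2, dim=-1)  # q(z_i | x_i)
```

The paper's question is when this second, constrained family can still reach the optimum of the first, i.e., when the amortization gap closes.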
Related papers
- SoftCVI: Contrastive variational inference with self-generated soft labels [2.5398014196797614]
Variational inference and Markov chain Monte Carlo methods are the predominant tools for approximating posterior distributions.
We introduce Soft Contrastive Variational Inference (SoftCVI), which allows a family of variational objectives to be derived through a contrastive estimation framework.
We find that SoftCVI can be used to form objectives which are stable to train and mass-covering, frequently outperforming inference with other variational approaches.
arXiv Detail & Related papers (2024-07-22T14:54:12Z)
- Diffusion models for probabilistic programming [56.47577824219207]
Diffusion Model Variational Inference (DMVI) is a novel method for automated approximate inference in probabilistic programming languages (PPLs).
DMVI is easy to implement, allows hassle-free inference in PPLs without the drawbacks of, e.g., variational inference using normalizing flows, and does not impose any constraints on the underlying neural network model.
arXiv Detail & Related papers (2023-11-01T12:17:05Z)
- PAVI: Plate-Amortized Variational Inference [55.975832957404556]
Inference is challenging for large population studies where millions of measurements are performed over a cohort of hundreds of subjects.
This large cardinality renders off-the-shelf Variational Inference (VI) computationally impractical.
In this work, we design structured VI families that efficiently tackle large population studies.
arXiv Detail & Related papers (2023-08-30T13:22:20Z)
- Black Box Variational Inference with a Deterministic Objective: Faster, More Accurate, and Even More Black Box [14.362625828893654]
We introduce "deterministic ADVI" (DADVI) to address issues with ADVI.
DADVI replaces the intractable MFVB objective with a fixed Monte Carlo approximation.
We show that DADVI and the SAA can perform well with relatively few samples even in very high dimensions.
arXiv Detail & Related papers (2023-04-11T22:45:18Z)
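The sample-average-approximation idea behind the entry above can be illustrated in a few lines: draw the base randomness once, freeze it, and the Monte Carlo ELBO becomes a fixed deterministic objective that a standard optimizer can handle. The sketch below is our toy illustration with a standard-normal target, not the authors' implementation.

```python
# Sketch of the SAA idea behind DADVI: freeze the Monte Carlo draws once,
# then minimize the resulting deterministic objective. Toy target; illustrative.
import numpy as np
from scipy.optimize import minimize

d, n_draws = 3, 30
rng = np.random.default_rng(0)
eps = rng.standard_normal((n_draws, d))  # drawn ONCE, then held fixed

def log_p(z):
    # Unnormalized log density of a toy standard-normal target.
    return -0.5 * np.sum(z**2, axis=-1)

def negative_elbo(params):
    mu, log_sigma = params[:d], params[d:]
    z = mu + np.exp(log_sigma) * eps       # reparameterized samples
    entropy = np.sum(log_sigma)            # Gaussian entropy, up to a constant
    return -(np.mean(log_p(z)) + entropy)  # deterministic given fixed eps

result = minimize(negative_elbo, np.zeros(2 * d), method="L-BFGS-B")
mu_hat, sigma_hat = result.x[:d], np.exp(result.x[d:])
```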
- Flexible Amortized Variational Inference in qBOLD MRI [56.4324135502282]
Oxygen extraction fraction (OEF) and deoxygenated blood volume (DBV) are more ambiguously determined from the data.
Existing inference methods tend to yield very noisy and underestimated OEF maps, while overestimating DBV.
This work describes a novel probabilistic machine learning approach that can infer plausible distributions of OEF and DBV.
arXiv Detail & Related papers (2022-03-11T10:47:16Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- An Introduction to Variational Inference [0.0]
In this paper, we introduce the concept of Variational Inference (VI).
VI is a popular method in machine learning that uses optimization techniques to estimate complex probability densities.
We discuss the applications of VI to variational auto-encoders (VAE) and VAE-Generative Adversarial Networks (VAE-GAN).
arXiv Detail & Related papers (2021-08-30T09:40:04Z)
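For reference, the optimization view of VI described in the entry above rests on the standard evidence decomposition (background material, not specific to that survey):

```latex
% Standard ELBO decomposition: maximizing the ELBO over q minimizes
% the KL divergence from q to the exact posterior.
\log p(x) =
  \underbrace{\mathbb{E}_{q(z)}\!\left[\log \frac{p(x, z)}{q(z)}\right]}_{\mathrm{ELBO}(q)}
  + \underbrace{\mathrm{KL}\!\left(q(z)\,\big\|\,p(z \mid x)\right)}_{\geq\, 0}
```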
- Loss function based second-order Jensen inequality and its application to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
arXiv Detail & Related papers (2021-06-09T12:13:51Z)
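A canonical example of such a repulsive particle update is Stein variational gradient descent (SVGD); the sketch below is our illustration of that generic update under an RBF-kernel assumption, not this paper's method or bound. The kernel-gradient term pushes particles apart while the smoothed log-density gradient pulls them toward the posterior.

```python
# SVGD-style update: `attract` moves particles toward high density,
# `repulse` (the kernel gradient) keeps the ensemble diverse. Illustrative only.
import numpy as np

def svgd_step(z, grad_log_p, bandwidth=1.0, step=0.1):
    diff = z[:, None, :] - z[None, :, :]               # pairwise x_i - x_j
    k = np.exp(-np.sum(diff**2, axis=-1) / bandwidth)  # RBF kernel matrix
    attract = k @ grad_log_p(z)                        # kernel-smoothed gradients
    repulse = (2.0 / bandwidth) * np.sum(k[:, :, None] * diff, axis=1)
    return z + step * (attract + repulse) / z.shape[0]

rng = np.random.default_rng(0)
z = 3.0 * rng.standard_normal((50, 2))
for _ in range(200):
    z = svgd_step(z, lambda z: -z)  # standard-normal target: grad log p(z) = -z
```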
- f-Divergence Variational Inference [9.172478956440216]
The $f$-VI framework unifies a number of existing VI methods.
A general $f$-variational bound is derived and provides a sandwich estimate of the marginal likelihood (or evidence).
A mean-field approximation scheme that generalizes the well-known coordinate ascent variational inference is also proposed for $f$-VI.
arXiv Detail & Related papers (2020-09-28T06:22:05Z)
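As background for the entry above (the standard definition, not the paper's specific bound): an $f$-divergence is indexed by a convex function $f$ with $f(1) = 0$, and different choices of $f$ recover familiar VI objectives.

```latex
% Standard definition of an f-divergence between the posterior p(z|x)
% and the approximation q(z); background, not the paper's bound.
D_f\!\left(p \,\|\, q\right)
  = \mathbb{E}_{q(z)}\!\left[ f\!\left(\frac{p(z \mid x)}{q(z)}\right) \right],
\qquad
f(t) = t \log t \;\Rightarrow\; \mathrm{KL}(p \,\|\, q),
\qquad
f(t) = -\log t \;\Rightarrow\; \mathrm{KL}(q \,\|\, p)
```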
- Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.