Variational Refinement for Importance Sampling Using the Forward
Kullback-Leibler Divergence
- URL: http://arxiv.org/abs/2106.15980v1
- Date: Wed, 30 Jun 2021 11:00:24 GMT
- Title: Variational Refinement for Importance Sampling Using the Forward
Kullback-Leibler Divergence
- Authors: Ghassen Jerfel, Serena Wang, Clara Fannjiang, Katherine A. Heller,
Yian Ma, Michael I. Jordan
- Abstract summary: Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
- Score: 77.06203118175335
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Variational Inference (VI) is a popular alternative to asymptotically exact
sampling in Bayesian inference. Its main workhorse is optimization over a
reverse Kullback-Leibler divergence (RKL), which typically underestimates the
tails of the posterior, leading to miscalibration and potential degeneracy.
Importance sampling (IS), on the other hand, is often used to fine-tune and
de-bias the estimates of approximate Bayesian inference procedures. The quality
of IS crucially depends on the choice of the proposal distribution. Ideally,
the proposal distribution has heavier tails than the target, which is rarely
achievable by minimizing the RKL. We thus propose a novel combination of
optimization and sampling techniques for approximate Bayesian inference by
constructing an IS proposal distribution through the minimization of a forward
KL (FKL) divergence. This approach guarantees asymptotic consistency and fast
convergence towards both the optimal IS estimator and the optimal variational
approximation. We empirically demonstrate on real data that our method is
competitive with variational boosting and MCMC.
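As a rough, self-contained illustration of the idea (an assumption for exposition, not the authors' released code), the sketch below fits a Gaussian proposal q_theta to a heavy-tailed toy target by stochastically minimizing the forward KL, KL(p || q_theta) = E_p[log p(x) - log q_theta(x)], using self-normalized importance weights under the current q to estimate the gradient, and then reuses the fitted proposal for a self-normalized IS estimate. The Student-t toy target, sample sizes, and learning rate are all illustrative choices.

# Minimal sketch (not the paper's implementation): forward-KL-fitted Gaussian
# proposal, then self-normalized importance sampling with that proposal.
import numpy as np

rng = np.random.default_rng(0)

def log_p(x):
    # Unnormalized heavy-tailed toy target: Student-t with 3 degrees of freedom.
    return -2.0 * np.log1p(x ** 2 / 3.0)

def log_q(x, mu, log_sigma):
    sigma = np.exp(log_sigma)
    return -0.5 * ((x - mu) / sigma) ** 2 - log_sigma - 0.5 * np.log(2 * np.pi)

# Gradient of KL(p || q_theta) w.r.t. theta = (mu, log_sigma), estimated with
# self-normalized importance weights using the current q as the sampler:
#   grad ~= -sum_i w_i * grad_theta log q_theta(x_i),  x_i ~ q_theta.
mu, log_sigma = 0.0, 0.0
for step in range(2000):
    sigma = np.exp(log_sigma)
    x = mu + sigma * rng.standard_normal(256)
    logw = log_p(x) - log_q(x, mu, log_sigma)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    grad_mu = -np.sum(w * (x - mu) / sigma ** 2)
    grad_ls = -np.sum(w * (((x - mu) / sigma) ** 2 - 1.0))
    mu -= 0.05 * grad_mu          # gradient descent on the forward KL
    log_sigma -= 0.05 * grad_ls

# Self-normalized IS estimate of E_p[x^2] with the fitted proposal.
x = mu + np.exp(log_sigma) * rng.standard_normal(100_000)
logw = log_p(x) - log_q(x, mu, log_sigma)
w = np.exp(logw - logw.max())
w /= w.sum()
print("fitted proposal (mu, sigma):", mu, np.exp(log_sigma))
print("IS estimate of E_p[x^2]:", np.sum(w * x ** 2))

Because the forward KL is mass-covering, the fitted sigma tends to be wider than a reverse-KL fit of the same family, which is exactly the property one wants in an IS proposal.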
Related papers
- Contextual Optimization under Covariate Shift: A Robust Approach by Intersecting Wasserstein Balls [18.047245099229325]
We propose a distributionally robust approach that uses an ambiguity set formed by the intersection of two Wasserstein balls.
We demonstrate the strong empirical performance of our proposed models.
arXiv Detail & Related papers (2024-06-04T15:46:41Z) - Sequential Monte Carlo for Inclusive KL Minimization in Amortized Variational Inference [3.126959812401426]
We propose SMC-Wake, a procedure for fitting an amortized variational approximation that uses sequential Monte Carlo samplers to estimate the gradient of the inclusive KL divergence.
In experiments with both simulated and real datasets, SMC-Wake fits variational distributions that approximate the posterior more accurately than existing methods.
arXiv Detail & Related papers (2024-03-15T18:13:48Z) - Bayesian Pseudo-Coresets via Contrastive Divergence [5.479797073162603]
We introduce a novel approach for constructing pseudo-coresets by utilizing contrastive divergence.
It eliminates the need for approximations in the pseudo-coreset construction process.
We conduct extensive experiments on multiple datasets, demonstrating its superiority over existing BPC techniques.
arXiv Detail & Related papers (2023-03-20T17:13:50Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer sampling steps.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - Adaptive Importance Sampling meets Mirror Descent: a Bias-variance
tradeoff [7.538482310185135]
A major drawback of adaptive importance sampling is the large variance of the weights.
This paper investigates a regularization strategy whose basic principle is to raise the importance weights at a certain power.
arXiv Detail & Related papers (2021-10-29T07:45:24Z) - Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Adaptive Sampling for Estimating Distributions: A Bayesian Upper
Confidence Bound Approach [30.76846526324949]
A Bayesian variant of existing upper confidence bound (UCB)-based approaches is proposed.
The effectiveness of this strategy is discussed using data obtained from a seroprevalence survey in Los Angeles county.
arXiv Detail & Related papers (2020-12-08T00:53:34Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z) - Distributionally Robust Bayesian Optimization [121.71766171427433]
We present a novel distributionally robust Bayesian optimization algorithm (DRBO) for zeroth-order, noisy optimization.
Our algorithm provably obtains sub-linear robust regret in various settings.
We demonstrate the robust performance of our method on both synthetic and real-world benchmarks.
arXiv Detail & Related papers (2020-02-20T22:04:30Z) - Distributionally Robust Bayesian Quadrature Optimization [60.383252534861136]
We study BQO under distributional uncertainty in which the underlying probability distribution is unknown except for a limited set of its i.i.d. samples.
A standard BQO approach maximizes the Monte Carlo estimate of the true expected objective given the fixed sample set.
We propose a novel posterior sampling based algorithm, namely distributionally robust BQO (DRBQO) for this purpose.
arXiv Detail & Related papers (2020-01-19T12:00:33Z)