Batch and match: black-box variational inference with a score-based divergence
- URL: http://arxiv.org/abs/2402.14758v2
- Date: Wed, 12 Jun 2024 16:53:22 GMT
- Title: Batch and match: black-box variational inference with a score-based divergence
- Authors: Diana Cai, Chirag Modi, Loucas Pillaud-Vivien, Charles C. Margossian, Robert M. Gower, David M. Blei, Lawrence K. Saul
- Abstract summary: We propose batch and match (BaM) as an alternative approach to black-box variational inference (BBVI) based on a score-based divergence.
We show that BaM typically converges in fewer gradient evaluations than leading implementations of BBVI based on ELBO maximization.
- Score: 26.873037094654826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most leading implementations of black-box variational inference (BBVI) are based on optimizing a stochastic evidence lower bound (ELBO). But such approaches to BBVI often converge slowly due to the high variance of their gradient estimates and their sensitivity to hyperparameters. In this work, we propose batch and match (BaM), an alternative approach to BBVI based on a score-based divergence. Notably, this score-based divergence can be optimized by a closed-form proximal update for Gaussian variational families with full covariance matrices. We analyze the convergence of BaM when the target distribution is Gaussian, and we prove that in the limit of infinite batch size the variational parameter updates converge exponentially quickly to the target mean and covariance. We also evaluate the performance of BaM on Gaussian and non-Gaussian target distributions that arise from posterior inference in hierarchical and deep generative models. In these experiments, we find that BaM typically converges in fewer (and sometimes significantly fewer) gradient evaluations than leading implementations of BBVI based on ELBO maximization.
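To make the abstract's central object concrete, the sketch below gives a Monte Carlo estimate of a score-based divergence between a full-covariance Gaussian variational approximation q = N(mu, Sigma) and a target p, evaluated on a batch of points. This is a minimal illustration, not the paper's implementation: the Sigma-weighted norm and all helper names are assumptions of this sketch, and the closed-form proximal update that BaM derives from its divergence is not reproduced here.
```python
import numpy as np

def gaussian_score(x, mu, Sigma_inv):
    # Score of q = N(mu, Sigma): grad_x log q(x) = -Sigma^{-1} (x - mu).
    return -Sigma_inv @ (x - mu)

def score_divergence(batch, target_score, mu, Sigma):
    # Batch estimate of a score-based divergence between q = N(mu, Sigma)
    # and a target p. The Sigma-weighted squared norm is an assumption of
    # this sketch (it makes the objective insensitive to affine rescaling);
    # see the paper for the divergence BaM actually optimizes.
    Sigma_inv = np.linalg.inv(Sigma)
    total = 0.0
    for x in batch:
        diff = target_score(x) - gaussian_score(x, mu, Sigma_inv)
        total += diff @ Sigma @ diff
    return total / len(batch)

# Sanity check: when the target is Gaussian and q matches it exactly,
# the two scores agree pointwise and the divergence is zero on any batch.
rng = np.random.default_rng(0)
d = 3
mu_p = rng.normal(size=d)
A = rng.normal(size=(d, d))
Sigma_p = A @ A.T + np.eye(d)  # random symmetric positive-definite covariance
Sigma_p_inv = np.linalg.inv(Sigma_p)
target_score = lambda x: -Sigma_p_inv @ (x - mu_p)

batch = rng.multivariate_normal(mu_p, Sigma_p, size=8)
print(score_divergence(batch, target_score, mu_p, Sigma_p))        # ~0.0
print(score_divergence(batch, target_score, mu_p + 1.0, Sigma_p))  # > 0
```
The sanity check at the bottom exercises the property that drives the paper's Gaussian analysis: the divergence vanishes exactly when the variational mean and covariance match the target's.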
Related papers
- EigenVI: score-based variational inference with orthogonal function expansions [23.696028065251497]
EigenVI is an eigenvalue-based approach for black-box variational inference (BBVI).
We use EigenVI to approximate a variety of target distributions, including a benchmark suite of Bayesian models from posteriordb.
arXiv Detail & Related papers (2024-10-31T15:48:34Z)
- Batch, match, and patch: low-rank approximations for score-based variational inference [8.840147522046651]
Black-box variational inference scales poorly to high dimensional problems.
We extend the batch-and-match framework for score-based BBVI.
We evaluate this approach on a variety of synthetic target distributions and real-world problems in high-dimensional inference.
arXiv Detail & Related papers (2024-10-29T17:42:56Z)
- Theoretical Guarantees for Variational Inference with Fixed-Variance Mixture of Gaussians [27.20127082606962]
Variational inference (VI) is a popular approach in Bayesian inference.
This work aims to contribute to the theoretical study of VI in the non-Gaussian case.
arXiv Detail & Related papers (2024-06-06T12:38:59Z)
- Covariance-Adaptive Sequential Black-box Optimization for Diffusion Targeted Generation [60.41803046775034]
We show how to perform user-preferred targeted generation via diffusion models with only black-box target scores of users.
Experiments on both numerical test problems and target-guided 3D-molecule generation tasks show the superior performance of our method in achieving better target scores.
arXiv Detail & Related papers (2024-06-02T17:26:27Z)
- Moreau Envelope ADMM for Decentralized Weakly Convex Optimization [55.2289666758254]
This paper proposes a proximal variant of the alternating direction method of multipliers (ADMM) for distributed optimization.
The results of our numerical experiments indicate that our method is faster and more robust than widely-used approaches.
arXiv Detail & Related papers (2023-08-31T14:16:30Z)
- On the Convergence of Black-Box Variational Inference [16.895490556279647]
We provide the first convergence guarantee for full black-box variational inference (BBVI).
Our results hold for log-smooth posterior densities with and without strong log-concavity and the location-scale variational family.
arXiv Detail & Related papers (2023-05-24T16:59:50Z)
- Scalable Bayesian Transformed Gaussian Processes [10.33253403416662]
The Bayesian transformed Gaussian process (BTG) model is a fully Bayesian counterpart to the warped Gaussian process (WGP).
We propose principled and fast techniques for computing with BTG.
Our framework uses doubly sparse quadrature rules, tight quantile bounds, and rank-one matrix algebra to enable both fast model prediction and model selection.
arXiv Detail & Related papers (2022-10-20T02:45:10Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw stochastic gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z)
- Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z)
- Likelihood-Free Inference with Deep Gaussian Processes [70.74203794847344]
Surrogate models have been successfully used in likelihood-free inference to decrease the number of simulator evaluations.
We propose a Deep Gaussian Process (DGP) surrogate model that can handle more irregularly behaved target distributions.
Our experiments show how DGPs can outperform GPs on objective functions with multimodal distributions and maintain a comparable performance in unimodal cases.
arXiv Detail & Related papers (2020-06-18T14:24:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.