Stability of Mean-Field Variational Inference
- URL: http://arxiv.org/abs/2506.07856v1
- Date: Mon, 09 Jun 2025 15:21:37 GMT
- Title: Stability of Mean-Field Variational Inference
- Authors: Shunan Sheng, Bohan Wu, Alberto González-Sanz, Marcel Nutz
- Abstract summary: Mean-field variational inference (MFVI) is a widely used method for approximating high-dimensional probability distributions by product measures. We show that the MFVI optimizer depends differentiably on the target potential and characterize the derivative by a partial differential equation.
- Score: 3.5729687931166136
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mean-field variational inference (MFVI) is a widely used method for approximating high-dimensional probability distributions by product measures. This paper studies the stability properties of the mean-field approximation when the target distribution varies within the class of strongly log-concave measures. We establish dimension-free Lipschitz continuity of the MFVI optimizer with respect to the target distribution, measured in the 2-Wasserstein distance, with Lipschitz constant inversely proportional to the log-concavity parameter. Under additional regularity conditions, we further show that the MFVI optimizer depends differentiably on the target potential and characterize the derivative by a partial differential equation. Methodologically, we follow a novel approach to MFVI via linearized optimal transport: the non-convex MFVI problem is lifted to a convex optimization over transport maps with a fixed base measure, enabling the use of calculus of variations and functional analysis. We discuss several applications of our results to robust Bayesian inference and empirical Bayes, including a quantitative Bernstein--von Mises theorem for MFVI, as well as to distributed stochastic control.
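To make the stated result concrete, the following is a schematic rendering (the notation is chosen here for illustration and is not taken verbatim from the paper): MFVI seeks the best product-measure approximation of a strongly log-concave target, and the stability result bounds how far the optimizer can move when the target moves.
\[ \pi^\star(\mu) \in \operatorname*{arg\,min}_{\pi = \pi_1 \otimes \cdots \otimes \pi_d} \mathrm{KL}(\pi \,\|\, \mu), \qquad \mu(dx) \propto e^{-V(x)}\,dx, \quad \nabla^2 V \succeq \kappa I_d, \]
\[ W_2\big(\pi^\star(\mu_0),\, \pi^\star(\mu_1)\big) \;\le\; \frac{C}{\kappa}\, W_2(\mu_0, \mu_1), \]
with $C$ a dimension-free constant; the abstract only states that the Lipschitz constant is inversely proportional to the log-concavity parameter, so the exact form of the bound above is a sketch.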
Related papers
- Variational Inference for Latent Variable Models in High Dimensions [4.3012765978447565]
We introduce a general framework for quantifying the statistical accuracy of mean-field variational inference (MFVI) for posterior approximation. We capture the exact regime where MFVI `works' for the celebrated latent Dirichlet allocation (LDA) model. We propose a partially grouped VI algorithm for this model, show that it works, and derive its exact performance.
arXiv Detail & Related papers (2025-06-02T17:19:58Z) - Learning over von Mises-Fisher Distributions via a Wasserstein-like Geometry [0.0]
We introduce a geometry-aware distance metric for the family of von Mises-Fisher (vMF) distributions. Motivated by the theory of optimal transport, we propose a Wasserstein-like distance that decomposes the discrepancy between two vMF distributions into two interpretable components.
arXiv Detail & Related papers (2025-04-19T03:38:15Z) - Stable Derivative Free Gaussian Mixture Variational Inference for Bayesian Inverse Problems [4.842853252452336]
Key challenges include costly repeated evaluations of forward models, multimodality, and inaccessible gradients for the forward model. We develop a variational inference framework that combines Fisher-Rao natural gradient with specialized quadrature rules to enable derivative-free updates of Gaussian mixture variational families. The resulting method, termed Derivative Free Gaussian Mixture Variational Inference (DF-GMVI), guarantees covariance positivity and affine invariance, offering a stable and efficient framework for approximating complex posterior distributions.
arXiv Detail & Related papers (2025-01-08T03:50:15Z) - Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
We investigate the statistical properties of Temporal Difference learning with Polyak-Ruppert averaging. We make three significant contributions that improve the current state-of-the-art results.
arXiv Detail & Related papers (2024-10-21T15:34:44Z) - Flow matching achieves almost minimax optimal convergence [50.38891696297888]
Flow matching (FM) has gained significant attention as a simulation-free generative model.
This paper discusses the convergence properties of FM for large sample size under the $p$-Wasserstein distance.
We establish that FM can achieve an almost minimax optimal convergence rate for $1 \leq p \leq 2$, presenting the first theoretical evidence that FM can reach convergence rates comparable to those of diffusion models.
arXiv Detail & Related papers (2024-05-31T14:54:51Z) - Bayesian Model Selection via Mean-Field Variational Approximation [10.433170683584994]
We study the non-asymptotic properties of mean-field (MF) inference under the Bayesian framework.
We show a Bernstein-von Mises (BvM) theorem for the variational distribution from MF under possible model mis-specification.
arXiv Detail & Related papers (2023-12-17T04:48:25Z) - Moreau Envelope ADMM for Decentralized Weakly Convex Optimization [55.2289666758254]
This paper proposes a proximal variant of the alternating direction method of multipliers (ADMM) for distributed optimization.
The results of our numerical experiments indicate that our method is faster and more robust than widely-used approaches.
arXiv Detail & Related papers (2023-08-31T14:16:30Z) - Mean-field Variational Inference via Wasserstein Gradient Flow [8.05603983337769]
Variational inference, such as the mean-field (MF) approximation, requires certain conjugacy structures for efficient computation.
We introduce a general computational framework to implement MF variational inference for Bayesian models, with or without latent variables, using the Wasserstein gradient flow (WGF); a schematic form of the underlying KL gradient flow is sketched after this list.
We propose a new constraint-free function approximation method using neural networks to numerically realize our algorithm.
arXiv Detail & Related papers (2022-07-17T04:05:32Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Variational Refinement for Importance Sampling Using the Forward Kullback-Leibler Divergence [77.06203118175335]
Variational Inference (VI) is a popular alternative to exact sampling in Bayesian inference.
Importance sampling (IS) is often used to fine-tune and de-bias the estimates of approximate Bayesian inference procedures.
We propose a novel combination of optimization and sampling techniques for approximate Bayesian inference.
arXiv Detail & Related papers (2021-06-30T11:00:24Z) - Stochastic Normalizing Flows [52.92110730286403]
We introduce stochastic normalizing flows for maximum likelihood estimation and variational inference (VI) using stochastic differential equations (SDEs).
Using the theory of rough paths, the underlying Brownian motion is treated as a latent variable and approximated, enabling efficient training of neural SDEs.
These SDEs can be used for constructing efficient Markov chains to sample from the underlying distribution of a given dataset.
arXiv Detail & Related papers (2020-02-21T20:47:55Z)
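As referenced in the Wasserstein-gradient-flow entry above, the standard fact that approach builds on can be sketched as follows (this is textbook material, not a statement taken from that paper): the Wasserstein gradient flow of the KL divergence relative to a target $\pi \propto e^{-V}$ is the Fokker-Planck equation
\[ \partial_t \rho_t = \nabla \cdot (\rho_t \nabla V) + \Delta \rho_t, \]
so simulating this flow, restricted in the mean-field case to product measures, drives an initial density toward the target.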
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.