Natural Gradient VI: Guarantees for Non-Conjugate Models
- URL: http://arxiv.org/abs/2510.19163v1
- Date: Wed, 22 Oct 2025 01:46:31 GMT
- Title: Natural Gradient VI: Guarantees for Non-Conjugate Models
- Authors: Fangyuan Sun, Ilyas Fatkhullin, Niao He,
- Abstract summary: Stochastic Natural Gradient Variational Inference (NGVI) is a widely used method for approximating posterior distributions in probabilistic models. We show that existing NGVI convergence guarantees do not extend to the non-conjugate setting. We propose a modified NGVI algorithm incorporating non-Euclidean projections and prove its global non-asymptotic convergence to a stationary point.
- Score: 29.717497256432186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic Natural Gradient Variational Inference (NGVI) is a widely used method for approximating the posterior distribution in probabilistic models. Despite its empirical success and foundational role in variational inference, its theoretical underpinnings remain limited, particularly in the case of non-conjugate likelihoods. While NGVI has been shown to be a special instance of Stochastic Mirror Descent, and recent work has provided convergence guarantees using relative smoothness and strong convexity for conjugate models, these results do not extend to the non-conjugate setting, where the variational loss becomes non-convex and harder to analyze. In this work, we focus on mean-field parameterization and advance the theoretical understanding of NGVI in three key directions. First, we derive sufficient conditions under which the variational loss satisfies relative smoothness with respect to a suitable mirror map. Second, leveraging this structure, we propose a modified NGVI algorithm incorporating non-Euclidean projections and prove its global non-asymptotic convergence to a stationary point. Finally, under additional structural assumptions about the likelihood, we uncover hidden convexity properties of the variational loss and establish fast global convergence of NGVI to a global optimum. These results provide new insights into the geometry and convergence behavior of NGVI in challenging inference settings.
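As a minimal illustration of the algorithm family the abstract analyzes (not the paper's exact method or its projection step), the sketch below runs mean-field Gaussian NGVI on a hypothetical 1-D Bayesian logistic regression, a non-conjugate likelihood. The natural-gradient updates on the mean and precision follow the standard mirror-descent form of NGVI; the data, sample count, and step size are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem (not from the paper): 1-D Bayesian logistic
# regression with a N(0, 1) prior -- a non-conjugate likelihood.
x = np.array([2.0, -1.0, 1.5, 0.5])
y = np.array([1.0, -1.0, 1.0, 1.0])     # labels in {-1, +1}

def grad_logp(theta):
    # gradient of log prior + log likelihood:
    # -theta + sum_i y_i x_i * sigmoid(-y_i x_i theta)
    return -theta + np.sum(y * x / (1.0 + np.exp(y * x * theta)))

def hess_logp(theta):
    # Hessian of log prior + log likelihood: -1 - sum_i x_i^2 p_i (1 - p_i)
    p = 1.0 / (1.0 + np.exp(-x * theta))
    return -1.0 - np.sum(x**2 * p * (1.0 - p))

mu, prec = 0.0, 1.0      # variational family q = N(mu, 1/prec)
rho = 0.1                # constant step size
for t in range(2000):
    thetas = mu + rng.standard_normal(20) / np.sqrt(prec)  # MC samples from q
    g = np.mean([grad_logp(th) for th in thetas])
    h = np.mean([hess_logp(th) for th in thetas])
    prec = (1.0 - rho) * prec - rho * h    # natural-gradient step on precision
    mu = mu + rho * g / prec               # natural-gradient step on mean
```

Because the prior contributes -1 to the Hessian, the estimated -h stays above 1, so the precision remains positive here without any explicit projection; the paper's non-Euclidean projection handles the general case where this can fail.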
Related papers
- Scalable Mean-Field Variational Inference via Preconditioned Primal-Dual Optimization [7.193011407502236]
We develop a novel primal-dual algorithm based on an augmented Lagrangian formulation, termed primal-dual variational inference (PD-VI). PD-VI jointly updates global and local variational parameters of the evidence lower bound in a scalable manner. We establish convergence guarantees for both PD-VI and P$^2$D-VI under a properly chosen constant step size.
arXiv Detail & Related papers (2026-02-07T17:24:15Z) - From Tail Universality to Bernstein-von Mises: A Unified Statistical Theory of Semi-Implicit Variational Inference [0.12183405753834557]
Semi-implicit variational inference (SIVI) constructs approximate posteriors of the form $q(\theta) = \int k(\theta \mid z)\, r(dz)$. This paper develops a unified "approximation-optimization-statistics" theory for such families.
arXiv Detail & Related papers (2025-12-05T19:26:25Z) - Neural Optimal Transport Meets Multivariate Conformal Prediction [58.43397908730771]
We propose a framework for conditional vector quantile regression (CVQR). CVQR combines neural optimal transport with vector quantile regression and applies it to multivariate conformal prediction.
arXiv Detail & Related papers (2025-09-29T19:50:19Z) - Asymptotics of Non-Convex Generalized Linear Models in High-Dimensions: A proof of the replica formula [17.036996839737828]
We show how an algorithm can be used to prove the optimality of a high-dimensional Gaussian regularization model. We also show how the Tukey loss can be used to prove the optimality of a negatively regularized model.
arXiv Detail & Related papers (2025-02-27T11:29:43Z) - Federated Generalised Variational Inference: A Robust Probabilistic Federated Learning Framework [12.454538785810259]
FedGVI is a probabilistic Federated Learning (FL) framework that is robust to both prior and likelihood misspecification. We offer theoretical analysis in terms of fixed-point convergence, optimality of the cavity distribution, and provable robustness to likelihood misspecification.
arXiv Detail & Related papers (2025-02-02T16:39:37Z) - A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning.
This paper presents a comprehensive analysis of a broad range of variations of the stochastic proximal point method (SPPM).
arXiv Detail & Related papers (2024-05-24T21:09:19Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth
Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO.
We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
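The constant-step-size guarantee summarized above (linear convergence to a neighborhood of the solution) can be illustrated on a toy strongly-convex-strongly-concave game; the objective, noise level, and step size below are hypothetical and not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy saddle-point problem: min over x, max over y of
#   f(x, y) = 0.5 x^2 + x y - 0.5 y^2, whose unique solution is (0, 0).
x, y = 3.0, -2.0
eta = 0.1          # constant step size
sigma = 0.05       # additive gradient noise (stochastic oracle)
for t in range(500):
    gx = x + y + sigma * rng.standard_normal()   # noisy grad_x f
    gy = x - y + sigma * rng.standard_normal()   # noisy grad_y f
    x -= eta * gx    # descent step on x
    y += eta * gy    # ascent step on y
```

The deterministic part of the iteration contracts at a fixed linear rate, so after 500 steps the iterate sits in a small noise-dominated ball around the solution, matching the "neighborhood" behavior the guarantee describes.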
arXiv Detail & Related papers (2021-06-30T18:32:46Z) - Loss function based second-order Jensen inequality and its application
to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
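The repulsion-driven particle update described above can be sketched with Stein variational gradient descent (SVGD), a canonical particle VI method with an explicit kernel repulsion term (not necessarily the paper's exact PVI update); the target distribution and all constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: standard normal, so grad log p(x) = -x.
n, eta = 30, 0.3
particles = rng.standard_normal(n) + 5.0       # initialize far from the target

for t in range(500):
    diff = particles[:, None] - particles[None, :]   # diff[i, j] = x_i - x_j
    # median-heuristic RBF bandwidth, as in standard SVGD
    h = np.median(diff**2) / np.log(n) + 1e-8
    k = np.exp(-diff**2 / h)                         # kernel matrix
    grad_logp = -particles                           # driving (attraction) term
    # phi_i = (1/n) sum_j [ k_ij grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    # the second term is the repulsion force that keeps particles diverse
    phi = (k @ grad_logp + (2.0 / h) * (k * diff).sum(axis=1)) / n
    particles = particles + eta * phi
```

Without the repulsion term all particles would collapse onto the mode; with it, the ensemble spreads out to approximate the target, which is the diversity property the generalization bound above rewards.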
arXiv Detail & Related papers (2021-06-09T12:13:51Z) - Efficient Semi-Implicit Variational Inference [65.07058307271329]
We propose an efficient and scalable semi-implicit variational inference (SIVI) framework.
Our method optimizes SIVI's evidence lower bound with lower-variance gradient estimates.
arXiv Detail & Related papers (2021-01-15T11:39:09Z) - Statistical Guarantees for Transformation Based Models with Applications
to Implicit Variational Inference [8.333191406788423]
We provide theoretical justification for the use of non-linear latent variable models (NL-LVMs) in non-parametric inference.
We use the NL-LVMs to construct an implicit family of variational distributions, deemed GP-IVI.
To the best of our knowledge, this is the first work on providing theoretical guarantees for implicit variational inference.
arXiv Detail & Related papers (2020-10-23T21:06:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.