Variational inference via radial transport
- URL: http://arxiv.org/abs/2602.17525v1
- Date: Thu, 19 Feb 2026 16:36:52 GMT
- Title: Variational inference via radial transport
- Authors: Luca Ghafourpour, Sinho Chewi, Alessio Figalli, Aram-Alexandre Pooladian
- Abstract summary: In variational inference (VI), the practitioner approximates a high-dimensional distribution $\pi$ with a simple surrogate one, often a (product) Gaussian distribution. In this work, we approach the VI problem from the perspective of optimizing over radial profiles. Our algorithm radVI is a cheap, effective add-on to many existing VI schemes, such as Gaussian (mean-field) VI and Laplace approximation.
- Score: 13.100339469711466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In variational inference (VI), the practitioner approximates a high-dimensional distribution $\pi$ with a simple surrogate one, often a (product) Gaussian distribution. However, in many cases of practical interest, Gaussian distributions might not capture the correct radial profile of $\pi$, resulting in poor coverage. In this work, we approach the VI problem from the perspective of optimizing over these radial profiles. Our algorithm radVI is a cheap, effective add-on to many existing VI schemes, such as Gaussian (mean-field) VI and Laplace approximation. We provide theoretical convergence guarantees for our algorithm, owing to recent developments in optimization over the Wasserstein space--the space of probability distributions endowed with the Wasserstein distance--and new regularity properties of radial transport maps in the style of Caffarelli (2000).
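To make the radial-profile idea concrete, here is a minimal numpy sketch (illustrative only, not the paper's radVI algorithm): it reshapes a Gaussian approximation by a monotone 1D transport on the radius while keeping directions fixed; the heavier-tailed target radial law below is an arbitrary stand-in for a profile that would instead be optimized against $\pi$.

```python
import numpy as np
from scipy import stats

# Illustrative sketch (not the paper's radVI): reshape the radial profile of a
# Gaussian approximation via a monotone 1D transport on the radius, keeping
# directions fixed. The target radial law here is a hypothetical choice.
d = 10
rng = np.random.default_rng(0)
z = rng.standard_normal((5000, d))      # samples from the Gaussian surrogate N(0, I_d)

r = np.linalg.norm(z, axis=1)           # radii; r ~ chi(d) under N(0, I_d)
target = stats.chi(df=3, scale=np.sqrt(d / 3))   # hypothetical heavier-tailed radial law

s = target.ppf(stats.chi(df=d).cdf(r))  # monotone radial transport: F_target^{-1} o F_chi
x = z * (s / r)[:, None]                # same directions, transported radii

print(r.std(), s.std())                 # the radial profile has changed
```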
Related papers
- Relative Wasserstein Angle and the Problem of the $W_2$-Nearest Gaussian Distribution [4.042425236692822]
We study the problem of quantifying how far an empirical distribution deviates from Gaussianity under the framework of optimal transport. By exploiting the cone geometry of the relative translation-invariant quadratic Wasserstein space, we introduce two novel geometric quantities. We prove that the filling cone generated by any two rays in this space is flat, ensuring that angles, projections, and inner products are rigorously well-defined.
arXiv Detail & Related papers (2026-01-29T22:03:10Z)
- Optimal Transportation and Alignment Between Gaussian Measures [80.4634530260329]
Optimal transport (OT) and Gromov-Wasserstein (GW) alignment provide interpretable geometric frameworks for comparing datasets. Because these frameworks are computationally expensive, large-scale applications often rely on closed-form solutions for Gaussian distributions under quadratic cost. This work provides a comprehensive treatment of Gaussian quadratic-cost OT and inner-product GW (IGW) alignment, closing several gaps in the literature to broaden applicability.
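For reference, the closed-form quadratic-cost OT distance between Gaussians that such large-scale pipelines rely on is the Bures-Wasserstein formula (a standard fact, not specific to this paper):

```latex
% Quadratic-cost W2 between Gaussians (Bures-Wasserstein distance).
\[
W_2^2\bigl(\mathcal{N}(m_0,\Sigma_0),\,\mathcal{N}(m_1,\Sigma_1)\bigr)
  = \|m_0 - m_1\|^2
  + \operatorname{tr}\!\Bigl(\Sigma_0 + \Sigma_1
      - 2\bigl(\Sigma_0^{1/2}\Sigma_1\Sigma_0^{1/2}\bigr)^{1/2}\Bigr).
\]
```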
arXiv Detail & Related papers (2025-12-03T09:01:48Z)
- On the Wasserstein Convergence and Straightness of Rectified Flow [54.580605276017096]
Rectified Flow (RF) is a generative model that aims to learn straight flow trajectories from noise to data. We provide a theoretical analysis of the Wasserstein distance between the sampling distribution of RF and the target distribution. We present general conditions guaranteeing uniqueness and straightness of 1-RF, which is in line with previous empirical findings.
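The standard rectified-flow setup analyzed here, in one display (notation assumed): couple noise $X_0$ and data $X_1$ along straight lines and regress a velocity field onto the line's constant direction.

```latex
% Straight-line interpolation and the RF least-squares velocity objective.
\[
X_t = (1-t)\,X_0 + t\,X_1,
\qquad
\min_{v} \int_0^1 \mathbb{E}\,\bigl\|(X_1 - X_0) - v(X_t, t)\bigr\|^2 \,dt .
\]
```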
arXiv Detail & Related papers (2024-10-19T02:36:11Z)
- Extending Mean-Field Variational Inference via Entropic Regularization: Theory and Computation [3.1048285287226904]
Variational inference (VI) has emerged as a popular method for approximate inference in high-dimensional Bayesian models. We propose a novel VI method that extends the naive mean field via entropic regularization. We show that $\Xi$-variational posteriors effectively recover the true posterior dependency.
arXiv Detail & Related papers (2024-04-14T01:40:11Z)
- Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein Space [19.19325201882727]
Variational inference (VI) seeks to approximate a target distribution $\pi$ by an element of a tractable family of distributions.
We develop the Forward-Backward Gaussian Variational Inference (FB-GVI) algorithm to solve Gaussian VI.
For our proposed algorithm, we obtain state-of-the-art convergence guarantees when $\pi$ is log-smooth and log-concave.
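A minimal numpy sketch of the underlying Bures-Wasserstein gradient step for Gaussian VI on a toy quadratic potential (illustrative only; FB-GVI proper splits the objective and treats the entropy term with a backward/proximal JKO step, see the paper):

```python
import numpy as np

# Illustrative sketch: plain Bures-Wasserstein gradient descent for Gaussian VI
# on KL(N(m, S) || pi) with pi proportional to exp(-V), where V is the toy
# quadratic V(x) = 0.5 (x - b)^T A (x - b), so expected gradients are exact.
d = 2
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])

m, S = np.zeros(d), np.eye(d)
eta = 0.1
for _ in range(200):
    grad_mean = A @ (m - b)       # E[grad V(X)] for X ~ N(m, S), quadratic V
    M = A - np.linalg.inv(S)      # E[hess V(X)] - S^{-1}: BW gradient in the covariance
    m = m - eta * grad_mean
    S = (np.eye(d) - eta * M) @ S @ (np.eye(d) - eta * M)

print(m, S)  # approaches the true mean b and covariance inv(A)
```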
arXiv Detail & Related papers (2023-04-10T19:49:50Z)
- Robust computation of optimal transport by $\beta$-potential regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in machine learning to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
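For contrast, here is a plain Sinkhorn sketch of the standard entropy-regularized OT baseline; the paper's contribution replaces this entropic term with a $\beta$-potential regularizer for robustness, so the code below is the baseline, not their method.

```python
import numpy as np

# Plain Sinkhorn iterations for entropy-regularized OT -- the standard
# baseline, not the paper's beta-potential variant.
def sinkhorn(a, b, C, eps=0.1, iters=500):
    """a, b: histograms; C: cost matrix; returns the transport matrix."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 1)), rng.normal(size=(7, 1))
C = (x - y.T) ** 2
P = sinkhorn(np.full(5, 1 / 5), np.full(7, 1 / 7), C)
print(P.sum(axis=1), P.sum(axis=0))  # marginals match a and b at convergence
```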
arXiv Detail & Related papers (2022-12-26T18:37:28Z)
- Sliced Wasserstein Variational Inference [3.405431122165563]
We propose a new variational inference method that minimizes the sliced Wasserstein distance, a valid metric arising from optimal transport.
Our method also does not require a tractable density for the variational distribution, so the approximating family can be amortized by generators such as neural networks.
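A minimal Monte Carlo sketch of the sliced Wasserstein distance used as the variational objective: random projections plus the closed-form 1D distance between sorted samples. The estimator below assumes equal sample sizes, and the names are illustrative.

```python
import numpy as np

# Monte Carlo sliced Wasserstein-2: project both samples onto random unit
# directions and average the closed-form 1D W2^2 (sorted-sample differences).
def sliced_w2(x, y, n_proj=100, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((n_proj, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    px = np.sort(x @ theta.T, axis=0)        # shape (n, n_proj)
    py = np.sort(y @ theta.T, axis=0)
    return np.sqrt(np.mean((px - py) ** 2))  # average 1D W2^2 over slices

rng = np.random.default_rng(1)
x = rng.standard_normal((1000, 3))
y = rng.standard_normal((1000, 3)) + 2.0
print(sliced_w2(x, y))  # grows with the mean shift between the two samples
```

Note that the objective needs only samples from both distributions, which is why a tractable variational density is unnecessary.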
arXiv Detail & Related papers (2022-07-26T20:51:51Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures as infinite-dimensional mean elements of a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
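The kernel mean embedding in question, stated for reference (standard definition, with $\mathcal{H}_k$ the RKHS of the kernel $k$):

```latex
% A distribution P is represented by the RKHS element obtained by averaging
% the canonical feature map; expectations become inner products.
\[
\mu_P = \mathbb{E}_{X \sim P}\bigl[k(X, \cdot)\bigr] \in \mathcal{H}_k,
\qquad
\langle \mu_P, f \rangle_{\mathcal{H}_k} = \mathbb{E}_{X \sim P}[f(X)]
\ \text{ for all } f \in \mathcal{H}_k .
\]
```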
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation that provides high fidelity while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Distributional Sliced Embedding Discrepancy for Incomparable Distributions [22.615156512223766]
The Gromov-Wasserstein (GW) distance is a key tool for manifold learning and cross-domain learning.
We propose a novel approach for comparing two incomparable distributions that hinges on distributional slicing, embeddings, and computing the closed-form Wasserstein distance between the sliced distributions.
arXiv Detail & Related papers (2021-06-04T15:11:30Z)
- Linear Optimal Transport Embedding: Provable Wasserstein classification for certain rigid transformations and perturbations [79.23797234241471]
Discriminating between distributions is an important problem in a number of scientific fields.
The Linear Optimal Transportation (LOT) framework embeds the space of distributions into an $L^2$-space.
We demonstrate the benefits of LOT on a number of distribution classification problems.
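The LOT construction can be summarized in one display (standard formulation; the notation $\sigma$, $T_\nu$ is assumed here, not taken from the paper): fix a reference measure $\sigma$ and represent each distribution $\nu$ by the optimal transport map $T_\nu$ pushing $\sigma$ onto $\nu$.

```latex
% LOT embedding: distributions become transport maps in L^2(sigma), so linear
% methods (classification, PCA) can act on them; the pair (T_{nu_1}, T_{nu_2})
% pushed through sigma is a coupling, giving the upper bound on W_2.
\[
\nu \;\longmapsto\; T_\nu \in L^2(\sigma),
\qquad (T_\nu)_{\#}\sigma = \nu,
\qquad
W_2(\nu_1,\nu_2) \;\le\; \|T_{\nu_1} - T_{\nu_2}\|_{L^2(\sigma)} .
\]
```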
arXiv Detail & Related papers (2020-08-20T19:09:33Z)