Variational inference via radial transport
- URL: http://arxiv.org/abs/2602.17525v1
- Date: Thu, 19 Feb 2026 16:36:52 GMT
- Title: Variational inference via radial transport
- Authors: Luca Ghafourpour, Sinho Chewi, Alessio Figalli, Aram-Alexandre Pooladian
- Abstract summary: In variational inference (VI), the practitioner approximates a high-dimensional distribution $\pi$ with a simple surrogate one, often a (product) Gaussian distribution. In this work, we approach the VI problem from the perspective of optimizing over radial profiles. Our algorithm radVI is a cheap, effective add-on to many existing VI schemes, such as Gaussian (mean-field) VI and Laplace approximation.
- Score: 13.100339469711466
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In variational inference (VI), the practitioner approximates a high-dimensional distribution $\pi$ with a simple surrogate one, often a (product) Gaussian distribution. However, in many cases of practical interest, Gaussian distributions might not capture the correct radial profile of $\pi$, resulting in poor coverage. In this work, we approach the VI problem from the perspective of optimizing over these radial profiles. Our algorithm radVI is a cheap, effective add-on to many existing VI schemes, such as Gaussian (mean-field) VI and Laplace approximation. We provide theoretical convergence guarantees for our algorithm, owing to recent developments in optimization over the Wasserstein space--the space of probability distributions endowed with the Wasserstein distance--and new regularity properties of radial transport maps in the style of Caffarelli (2000).
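To make the radial-profile idea concrete, here is a minimal numpy sketch (illustrative only, not the paper's radVI algorithm): it reshapes a Gaussian approximation by a monotone 1D transport on the radius while keeping directions fixed; the heavier-tailed target radial law below is an arbitrary stand-in for a profile that would instead be optimized against $\pi$.

```python
import numpy as np
from scipy import stats

# Illustrative sketch (not the paper's radVI): reshape the radial profile of a
# Gaussian approximation via a monotone 1D transport on the radius, keeping
# directions fixed. The target radial law here is a hypothetical choice.
d = 10
rng = np.random.default_rng(0)
z = rng.standard_normal((5000, d))      # samples from the Gaussian surrogate N(0, I_d)

r = np.linalg.norm(z, axis=1)           # radii; r ~ chi(d) under N(0, I_d)
target = stats.chi(df=3, scale=np.sqrt(d / 3))   # hypothetical heavier-tailed radial law

s = target.ppf(stats.chi(df=d).cdf(r))  # monotone radial transport: F_target^{-1} o F_chi
x = z * (s / r)[:, None]                # same directions, transported radii

print(r.std(), s.std())                 # the radial profile has changed
```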
Related papers
- Relative Wasserstein Angle and the Problem of the $W_2$-Nearest Gaussian Distribution [4.042425236692822]
We study the problem of quantifying how far an empirical distribution deviates from Gaussianity under the framework of optimal transport. By exploiting the cone geometry of the relative translation-invariant quadratic Wasserstein space, we introduce two novel geometric quantities. We prove that the filling cone generated by any two rays in this space is flat, ensuring that angles, projections, and inner products are rigorously well-defined.
arXiv Detail & Related papers (2026-01-29T22:03:10Z)
- Optimal Transportation and Alignment Between Gaussian Measures [80.4634530260329]
Optimal transport (OT) and Gromov-Wasserstein (GW) alignment provide interpretable geometric frameworks for comparing datasets. Because these frameworks are computationally expensive, large-scale applications often rely on closed-form solutions for Gaussian distributions under quadratic cost. This work provides a comprehensive treatment of Gaussian quadratic-cost OT and inner-product GW (IGW) alignment, closing several gaps in the literature to broaden applicability.
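For reference, the closed-form quadratic-cost OT distance between Gaussians that such large-scale pipelines rely on is the Bures-Wasserstein formula (a standard fact, not specific to this paper):

```latex
% Quadratic-cost W2 between Gaussians (Bures-Wasserstein distance).
\[
W_2^2\bigl(\mathcal{N}(m_0,\Sigma_0),\,\mathcal{N}(m_1,\Sigma_1)\bigr)
  = \|m_0 - m_1\|^2
  + \operatorname{tr}\!\Bigl(\Sigma_0 + \Sigma_1
      - 2\bigl(\Sigma_0^{1/2}\Sigma_1\Sigma_0^{1/2}\bigr)^{1/2}\Bigr).
\]
```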
arXiv Detail & Related papers (2025-12-03T09:01:48Z)
- On the Wasserstein Convergence and Straightness of Rectified Flow [54.580605276017096]
Rectified Flow (RF) is a generative model that aims to learn straight flow trajectories from noise to data. We provide a theoretical analysis of the Wasserstein distance between the sampling distribution of RF and the target distribution. We present general conditions guaranteeing uniqueness and straightness of 1-RF, which is in line with previous empirical findings.
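The standard rectified-flow setup analyzed here, in one display (notation assumed): couple noise $X_0$ and data $X_1$ along straight lines and regress a velocity field onto the line's constant direction.

```latex
% Straight-line interpolation and the RF least-squares velocity objective.
\[
X_t = (1-t)\,X_0 + t\,X_1,
\qquad
\min_{v} \int_0^1 \mathbb{E}\,\bigl\|(X_1 - X_0) - v(X_t, t)\bigr\|^2 \,dt .
\]
```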
arXiv Detail & Related papers (2024-10-19T02:36:11Z)
- Extending Mean-Field Variational Inference via Entropic Regularization: Theory and Computation [3.1048285287226904]
Variational inference (VI) has emerged as a popular method for approximate inference in high-dimensional Bayesian models. We propose a novel VI method that extends the naive mean field via entropic regularization. We show that $\Xi$-variational posteriors effectively recover the true posterior dependency.
arXiv Detail & Related papers (2024-04-14T01:40:11Z)
- Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein Space [19.19325201882727]
Variational inference (VI) seeks to approximate a target distribution $\pi$ by an element of a tractable family of distributions.
We develop the Forward-Backward Gaussian Variational Inference (FB-GVI) algorithm to solve Gaussian VI.
For our proposed algorithm, we obtain state-of-the-art convergence guarantees when $\pi$ is log-smooth and log-concave.
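A minimal numpy sketch of the underlying Bures-Wasserstein gradient step for Gaussian VI on a toy quadratic potential (illustrative only; FB-GVI proper splits the objective and treats the entropy term with a backward/proximal JKO step, see the paper):

```python
import numpy as np

# Illustrative sketch: plain Bures-Wasserstein gradient descent for Gaussian VI
# on KL(N(m, S) || pi) with pi proportional to exp(-V), where V is the toy
# quadratic V(x) = 0.5 (x - b)^T A (x - b), so expected gradients are exact.
d = 2
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])

m, S = np.zeros(d), np.eye(d)
eta = 0.1
for _ in range(200):
    grad_mean = A @ (m - b)       # E[grad V(X)] for X ~ N(m, S), quadratic V
    M = A - np.linalg.inv(S)      # E[hess V(X)] - S^{-1}: BW gradient in the covariance
    m = m - eta * grad_mean
    S = (np.eye(d) - eta * M) @ S @ (np.eye(d) - eta * M)

print(m, S)  # approaches the true mean b and covariance inv(A)
```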
arXiv Detail & Related papers (2023-04-10T19:49:50Z)
- Robust computation of optimal transport by $\beta$-potential regularization [79.24513412588745]
Optimal transport (OT) has become a widely used tool in machine learning to measure the discrepancy between probability distributions.
We propose regularizing OT with the $\beta$-potential term associated with the so-called $\beta$-divergence.
We experimentally demonstrate that the transport matrix computed with our algorithm helps estimate a probability distribution robustly even in the presence of outliers.
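For contrast, here is a plain Sinkhorn sketch of the standard entropy-regularized OT baseline; the paper's contribution replaces this entropic term with a $\beta$-potential regularizer for robustness, so the code below is the baseline, not their method.

```python
import numpy as np

# Plain Sinkhorn iterations for entropy-regularized OT -- the standard
# baseline, not the paper's beta-potential variant.
def sinkhorn(a, b, C, eps=0.1, iters=500):
    """a, b: histograms; C: cost matrix; returns the transport matrix."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 1)), rng.normal(size=(7, 1))
C = (x - y.T) ** 2
P = sinkhorn(np.full(5, 1 / 5), np.full(7, 1 / 7), C)
print(P.sum(axis=1), P.sum(axis=0))  # marginals match a and b at convergence
```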
arXiv Detail & Related papers (2022-12-26T18:37:28Z)
- Sliced Wasserstein Variational Inference [3.405431122165563]
We propose a new variational inference method that minimizes the sliced Wasserstein distance, a valid metric arising from optimal transport.
Our method also does not require a tractable density for the variational distribution, so the approximating family can be amortized by generators such as neural networks.
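A minimal Monte Carlo sketch of the sliced Wasserstein distance used as the variational objective: random projections plus the closed-form 1D distance between sorted samples. The estimator below assumes equal sample sizes, and the names are illustrative.

```python
import numpy as np

# Monte Carlo sliced Wasserstein-2: project both samples onto random unit
# directions and average the closed-form 1D W2^2 (sorted-sample differences).
def sliced_w2(x, y, n_proj=100, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.standard_normal((n_proj, x.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    px = np.sort(x @ theta.T, axis=0)        # shape (n, n_proj)
    py = np.sort(y @ theta.T, axis=0)
    return np.sqrt(np.mean((px - py) ** 2))  # average 1D W2^2 over slices

rng = np.random.default_rng(1)
x = rng.standard_normal((1000, 3))
y = rng.standard_normal((1000, 3)) + 2.0
print(sliced_w2(x, y))  # grows with the mean shift between the two samples
```

Note that the objective needs only samples from both distributions, which is why a tractable variational density is unnecessary.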
arXiv Detail & Related papers (2022-07-26T20:51:51Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures as infinite-dimensional mean elements of a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
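The kernel mean embedding in question, stated for reference (standard definition, with $\mathcal{H}_k$ the RKHS of the kernel $k$):

```latex
% A distribution P is represented by the RKHS element obtained by averaging
% the canonical feature map; expectations become inner products.
\[
\mu_P = \mathbb{E}_{X \sim P}\bigl[k(X, \cdot)\bigr] \in \mathcal{H}_k,
\qquad
\langle \mu_P, f \rangle_{\mathcal{H}_k} = \mathbb{E}_{X \sim P}[f(X)]
\ \text{ for all } f \in \mathcal{H}_k .
\]
```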
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation that provides high fidelity while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Distributional Sliced Embedding Discrepancy for Incomparable Distributions [22.615156512223766]
The Gromov-Wasserstein (GW) distance is a key tool for manifold learning and cross-domain learning.
We propose a novel approach for comparing two incomparable distributions that hinges on distributional slicing, embeddings, and computing the closed-form Wasserstein distance between the sliced distributions.
arXiv Detail & Related papers (2021-06-04T15:11:30Z)
- Linear Optimal Transport Embedding: Provable Wasserstein classification for certain rigid transformations and perturbations [79.23797234241471]
Discriminating between distributions is an important problem in a number of scientific fields.
The Linear Optimal Transportation (LOT) framework embeds the space of distributions into an $L^2$-space.
We demonstrate the benefits of LOT on a number of distribution classification problems.
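The LOT construction can be summarized in one display (standard formulation; the notation $\sigma$, $T_\nu$ is assumed here, not taken from the paper): fix a reference measure $\sigma$ and represent each distribution $\nu$ by the optimal transport map $T_\nu$ pushing $\sigma$ onto $\nu$.

```latex
% LOT embedding: distributions become transport maps in L^2(sigma), so linear
% methods (classification, PCA) can act on them; the pair (T_{nu_1}, T_{nu_2})
% pushed through sigma is a coupling, giving the upper bound on W_2.
\[
\nu \;\longmapsto\; T_\nu \in L^2(\sigma),
\qquad (T_\nu)_{\#}\sigma = \nu,
\qquad
W_2(\nu_1,\nu_2) \;\le\; \|T_{\nu_1} - T_{\nu_2}\|_{L^2(\sigma)} .
\]
```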
arXiv Detail & Related papers (2020-08-20T19:09:33Z)