f-Divergence Variational Inference
- URL: http://arxiv.org/abs/2009.13093v4
- Date: Sat, 3 Apr 2021 16:33:25 GMT
- Title: f-Divergence Variational Inference
- Authors: Neng Wan, Dapeng Li, and Naira Hovakimyan
- Abstract summary: The $f$-VI framework unifies a number of existing VI methods.
A general $f$-variational bound is derived and provides a sandwich estimate of marginal likelihood (or evidence).
A mean-field approximation scheme that generalizes the well-known coordinate ascent variational inference is also proposed for $f$-VI.
- Score: 9.172478956440216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper introduces the $f$-divergence variational inference ($f$-VI) that
generalizes variational inference to all $f$-divergences. Initiated from
minimizing a crafty surrogate $f$-divergence that shares the statistical
consistency with the $f$-divergence, the $f$-VI framework not only unifies a
number of existing VI methods, e.g. Kullback-Leibler VI, R\'{e}nyi's
$\alpha$-VI, and $\chi$-VI, but also offers a standardized toolkit for VI subject to
arbitrary divergences from the $f$-divergence family. A general $f$-variational
bound is derived and provides a sandwich estimate of marginal likelihood (or
evidence). The development of the $f$-VI unfolds with a stochastic optimization
scheme that utilizes the reparameterization trick, importance weighting and
Monte Carlo approximation; a mean-field approximation scheme that generalizes
the well-known coordinate ascent variational inference (CAVI) is also proposed
for $f$-VI. Empirical examples, including variational autoencoders and Bayesian
neural networks, are provided to demonstrate the effectiveness and the wide
applicability of $f$-VI.
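To make the ingredients named in the abstract (reparameterization trick, importance weighting, Monte Carlo approximation) concrete, below is a minimal, illustrative NumPy sketch that estimates $\mathbb{E}_q[f(p(x,z)/q(z))]$ for a toy conjugate Gaussian model; with $f(t) = -\log t$ this is exactly the negative ELBO of standard KL-VI. The model, parameter values, and function names are assumptions made for illustration only; the sketch does not reproduce the paper's surrogate $f$-divergence or its $f$-variational bound.
```python
# Illustrative sketch only: a reparameterized Monte Carlo estimate of
# E_q[ f( p(x, z) / q(z) ) ] for a toy conjugate Gaussian model
#   p(z) = N(0, 1),  p(x | z) = N(z, 1),  q(z) = N(mu, sigma^2).
# With f(t) = -log(t) this equals the negative ELBO of standard KL-VI;
# other convex f with f(1) = 0 give other members of the f-divergence family.
# This is NOT the surrogate objective of the paper.
import numpy as np

def log_normal(x, mean, std):
    """Log density of a univariate Gaussian N(mean, std^2)."""
    return -0.5 * np.log(2.0 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2

def f_objective(x_obs, mu, sigma, f, n_samples=100_000, seed=0):
    """Monte Carlo estimate of E_q[ f( p(x_obs, z) / q(z) ) ] via reparameterization."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)
    z = mu + sigma * eps                                           # reparameterized z ~ q
    log_joint = log_normal(z, 0.0, 1.0) + log_normal(x_obs, z, 1.0)  # log p(x, z)
    log_w = log_joint - log_normal(z, mu, sigma)                   # log importance weights
    return np.mean(f(np.exp(log_w)))

x_obs = 1.3
neg_log = lambda t: -np.log(t)                                     # generator for KL-VI
obj = f_objective(x_obs, mu=0.65, sigma=0.85, f=neg_log)
exact_neg_log_evidence = -log_normal(x_obs, 0.0, np.sqrt(2.0))     # -log p(x), closed form
print(f"negative ELBO: {obj:.4f} >= -log p(x): {exact_neg_log_evidence:.4f}")
```
Because the marginal likelihood of this toy model is available in closed form, the printed output verifies the bound numerically; swapping in another convex generator $f$ in place of `neg_log` changes which member of the $f$-divergence family the estimator targets.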
Related papers
- $f$-PO: Generalizing Preference Optimization with $f$-divergence Minimization [91.43730624072226]
$f$-PO is a novel framework that generalizes and extends existing preference optimization approaches via $f$-divergence minimization.
We conduct experiments on state-of-the-art language models using benchmark datasets.
arXiv Detail & Related papers (2024-10-29T02:11:45Z)
- Theoretical Convergence Guarantees for Variational Autoencoders [2.8167997311962942]
Variational Autoencoders (VAE) are popular generative models used to sample from complex data distributions.
This paper aims to bridge that gap by providing non-asymptotic convergence guarantees for VAE trained using both Gradient Descent and Adam algorithms.
Our theoretical analysis applies to both Linear VAE and Deep Gaussian VAE, as well as several VAE variants, including $\beta$-VAE and IWAE.
arXiv Detail & Related papers (2024-10-22T07:12:38Z)
- Extending Mean-Field Variational Inference via Entropic Regularization: Theory and Computation [2.2656885622116394]
Variational inference (VI) has emerged as a popular method for approximate inference for high-dimensional Bayesian models.
We propose a novel VI method that extends the naive mean field via entropic regularization.
We show that $\Xi$-variational posteriors effectively recover the true posterior dependency.
arXiv Detail & Related papers (2024-04-14T01:40:11Z)
- Equivalence of the Empirical Risk Minimization to Regularization on the Family of f-Divergences [45.935798913942904]
The solution to empirical risk minimization with $f$-divergence regularization (ERM-$f$DR) is presented.
Examples of the solution for particular choices of the function $f$ are presented.
arXiv Detail & Related papers (2024-02-01T11:12:00Z)
- Universal Online Learning with Gradient Variations: A Multi-layer Online Ensemble Approach [57.92727189589498]
We propose an online convex optimization approach with two different levels of adaptivity.
We obtain $\mathcal{O}(\log V_T)$, $\mathcal{O}(d \log V_T)$ and $\hat{\mathcal{O}}(\sqrt{V_T})$ regret bounds for strongly convex, exp-concave and convex loss functions, respectively.
arXiv Detail & Related papers (2023-07-17T09:55:35Z)
- Solving Constrained Variational Inequalities via an Interior Point Method [88.39091990656107]
We develop an interior-point approach to solve constrained variational inequality (cVI) problems.
We provide convergence guarantees for ACVI in two general classes of problems.
Unlike previous work in this setting, ACVI provides a means to solve cVIs when the constraints are nontrivial.
arXiv Detail & Related papers (2022-06-21T17:55:13Z)
- Moreau-Yosida $f$-divergences [0.0]
Variational representations of $f$-divergences are central to many machine learning algorithms.
We generalize the so-called tight variational representation of $f$-divergences to probability measures on compact metric spaces (the classical Fenchel-conjugate form of this representation is recalled after this list).
We provide an implementation of the variational formulas for the Kullback-Leibler, reverse Kullback-Leibler, $\chi^2$, reverse $\chi^2$, squared Hellinger, Jensen-Shannon, Jeffreys, triangular discrimination and total variation divergences.
arXiv Detail & Related papers (2021-02-26T11:46:10Z)
- Efficient Semi-Implicit Variational Inference [65.07058307271329]
We propose an efficient and scalable semi-implicit variational inference (SIVI) method.
Our method optimizes a rigorous lower bound on the evidence with lower-variance gradient estimates.
arXiv Detail & Related papers (2021-01-15T11:39:09Z)
- Meta-Learning Divergences of Variational Inference [49.164944557174294]
Variational inference (VI) plays an essential role in approximate Bayesian inference.
We propose a meta-learning algorithm to learn the divergence metric suited for the task of interest.
We demonstrate our approach outperforms standard VI on Gaussian mixture distribution approximation.
arXiv Detail & Related papers (2020-07-06T17:43:01Z)
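As background for the variational representations referenced in the Moreau-Yosida $f$-divergences entry above (and underlying $f$-VI more broadly), the classical Fenchel-conjugate form reads as follows, where $f^{*}$ denotes the convex conjugate of $f$ and the supremum runs over a suitable class of test functions $g$. This is the standard representation, recalled here as context, not the tightened form derived in that paper.

$$
D_f(P \,\|\, Q) \;=\; \int f\!\left(\frac{\mathrm{d}P}{\mathrm{d}Q}\right)\mathrm{d}Q
\;=\; \sup_{g}\;\Big\{\, \mathbb{E}_{P}\big[g(X)\big] \;-\; \mathbb{E}_{Q}\big[f^{*}\!\big(g(X)\big)\big] \,\Big\},
$$

with equality attained, under regularity conditions, at $g = f'(\mathrm{d}P/\mathrm{d}Q)$; restricting $g$ to a parametric family yields a lower bound, which is the basis of neural $f$-divergence estimators.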
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.