Particle Semi-Implicit Variational Inference
- URL: http://arxiv.org/abs/2407.00649v2
- Date: Wed, 30 Oct 2024 13:18:41 GMT
- Title: Particle Semi-Implicit Variational Inference
- Authors: Jen Ning Lim, Adam M. Johansen,
- Abstract summary: Semi-implicit variational inference (SIVI) enriches the expressiveness of variational families by utilizing a kernel and a mixing distribution.
Existing SIVI methods parameterize the mixing distribution using implicit distributions, leading to intractable variational densities.
We propose a novel method for SIVI called Particle Variational Inference (PVI) which employs empirical measures to approximate the optimal mixing distributions.
- Score: 2.555222031881788
- License:
- Abstract: Semi-implicit variational inference (SIVI) enriches the expressiveness of variational families by utilizing a kernel and a mixing distribution to hierarchically define the variational distribution. Existing SIVI methods parameterize the mixing distribution using implicit distributions, leading to intractable variational densities. As a result, directly maximizing the evidence lower bound (ELBO) is not possible, so they resort to one of the following: optimizing bounds on the ELBO, employing costly inner-loop Markov chain Monte Carlo runs, or solving minimax objectives. In this paper, we propose a novel method for SIVI called Particle Variational Inference (PVI) which employs empirical measures to approximate the optimal mixing distributions characterized as the minimizer of a free energy functional. PVI arises naturally as a particle approximation of a Euclidean--Wasserstein gradient flow and, unlike prior works, it directly optimizes the ELBO whilst making no parametric assumption about the mixing distribution. Our empirical results demonstrate that PVI performs favourably compared to other SIVI methods across various tasks. Moreover, we provide a theoretical analysis of the behaviour of the gradient flow of a related free energy functional: establishing the existence and uniqueness of solutions as well as propagation of chaos results.
Related papers
- Semi-Implicit Functional Gradient Flow [30.32233517392456]
We propose a functional gradient ParVI method that uses perturbed particles as the approximation family.
The corresponding functional gradient flow, which can be estimated via denoising score matching, exhibits strong theoretical convergence guarantee.
arXiv Detail & Related papers (2024-10-23T15:00:30Z) - NETS: A Non-Equilibrium Transport Sampler [15.58993313831079]
We propose an algorithm, termed the Non-Equilibrium Transport Sampler (NETS)
NETS can be viewed as a variant of importance sampling (AIS) based on Jarzynski's equality.
We show that this drift is the minimizer of a variety of objective functions, which can all be estimated in an unbiased fashion.
arXiv Detail & Related papers (2024-10-03T17:35:38Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian
Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - Semi-Implicit Variational Inference via Score Matching [9.654640791869431]
Semi-implicit variational inference (SIVI) greatly enriches the expressiveness of variational families.
Current SIVI approaches often use surrogate evidence lower bounds (ELBOs) or employ expensive inner-loop MCMC runs for unbiased ELBOs for training.
We propose SIVI-SM, a new method for SIVI based on an alternative training objective via score matching.
arXiv Detail & Related papers (2023-08-19T13:32:54Z) - Sampling with Mollified Interaction Energy Descent [57.00583139477843]
We present a new optimization-based method for sampling called mollified interaction energy descent (MIED)
MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs)
We show experimentally that for unconstrained sampling problems our algorithm performs on par with existing particle-based algorithms like SVGD.
arXiv Detail & Related papers (2022-10-24T16:54:18Z) - Interpolating between sampling and variational inference with infinite
stochastic mixtures [6.021787236982659]
Sampling and Variational Inference (VI) are two large families of methods for approximate inference with complementary strengths.
Here, we develop a framework for constructing intermediate algorithms that balance the strengths of both sampling and VI.
This work is a first step towards a highly flexible yet simple family of inference methods that combines the complementary strengths of sampling and VI.
arXiv Detail & Related papers (2021-10-18T20:50:06Z) - Loss function based second-order Jensen inequality and its application
to particle variational inference [112.58907653042317]
Particle variational inference (PVI) uses an ensemble of models as an empirical approximation for the posterior distribution.
PVI iteratively updates each model with a repulsion force to ensure the diversity of the optimized models.
We derive a novel generalization error bound and show that it can be reduced by enhancing the diversity of models.
arXiv Detail & Related papers (2021-06-09T12:13:51Z) - Generative Particle Variational Inference via Estimation of Functional
Gradients [15.370890881254066]
This work proposes a new method for learning to approximately sample from the posterior distribution.
Our generative ParVI (GPVI) approach maintains the performance of ParVI methods while offering the flexibility of a generative sampler.
arXiv Detail & Related papers (2021-03-01T20:29:41Z) - Efficient Semi-Implicit Variational Inference [65.07058307271329]
We propose an efficient and scalable semi-implicit extrapolational (SIVI)
Our method maps SIVI's evidence to a rigorous inference of lower gradient values.
arXiv Detail & Related papers (2021-01-15T11:39:09Z) - Variational Transport: A Convergent Particle-BasedAlgorithm for Distributional Optimization [106.70006655990176]
A distributional optimization problem arises widely in machine learning and statistics.
We propose a novel particle-based algorithm, dubbed as variational transport, which approximately performs Wasserstein gradient descent.
We prove that when the objective function satisfies a functional version of the Polyak-Lojasiewicz (PL) (Polyak, 1963) and smoothness conditions, variational transport converges linearly.
arXiv Detail & Related papers (2020-12-21T18:33:13Z) - Stochastic Normalizing Flows [52.92110730286403]
We introduce normalizing flows for maximum likelihood estimation and variational inference (VI) using differential equations (SDEs)
Using the theory of rough paths, the underlying Brownian motion is treated as a latent variable and approximated, enabling efficient training of neural SDEs.
These SDEs can be used for constructing efficient chains to sample from the underlying distribution of a given dataset.
arXiv Detail & Related papers (2020-02-21T20:47:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.