Mean-field Variational Inference via Wasserstein Gradient Flow
- URL: http://arxiv.org/abs/2207.08074v2
- Date: Fri, 8 Sep 2023 04:43:46 GMT
- Title: Mean-field Variational Inference via Wasserstein Gradient Flow
- Authors: Rentian Yao, Yun Yang
- Abstract summary: Variational inference, such as the mean-field (MF) approximation, requires certain conjugacy structures for efficient computation.
We introduce a general computational framework to implement MF variational inference for Bayesian models, with or without latent variables, using the Wasserstein gradient flow (WGF).
We propose a new constraint-free function approximation method using neural networks to numerically realize our algorithm.
- Score: 8.05603983337769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational inference, such as the mean-field (MF) approximation, requires
certain conjugacy structures for efficient computation. These can impose
unnecessary restrictions on the viable prior distribution family and further
constraints on the variational approximation family. In this work, we introduce
a general computational framework to implement MF variational inference for
Bayesian models, with or without latent variables, using the Wasserstein
gradient flow (WGF), a modern mathematical technique for realizing a gradient
flow over the space of probability measures. Theoretically, we analyze the
algorithmic convergence of the proposed approaches, providing an explicit
expression for the contraction factor. We also strengthen existing results on
MF variational posterior concentration from a polynomial to an exponential
contraction, by utilizing the fixed point equation of the time-discretized WGF.
Computationally, we propose a new constraint-free function approximation method
using neural networks to numerically realize our algorithm. This method is
shown to be more precise and efficient than traditional particle approximation
methods based on Langevin dynamics.
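As context for the closing comparison, the sketch below shows the Langevin-dynamics particle baseline that the abstract contrasts with the proposed neural-network method; it is a minimal illustration, not the paper's algorithm. The 2D Gaussian target, correlation rho, particle count, and step size are all assumptions made for the example.

```python
import numpy as np

# Assumed toy target: a correlated 2D Gaussian with
# log p(x1, x2) = -(x1^2 - 2*rho*x1*x2 + x2^2) / (2*(1 - rho^2)) + const.
rho = 0.8

def grad_log_p(x1, x2):
    """Coordinate-block gradients of log p; inputs broadcast."""
    c = 1.0 / (1.0 - rho**2)
    return -c * (x1 - rho * x2), -c * (x2 - rho * x1)

def mean_field_langevin(n=400, n_steps=600, step=5e-3, seed=0):
    """Particle discretization of the mean-field WGF: each block's
    particles follow Langevin dynamics whose drift is grad log p
    averaged over the other block's current particle cloud."""
    rng = np.random.default_rng(seed)
    x1, x2 = rng.standard_normal(n), rng.standard_normal(n)
    for _ in range(n_steps):
        g1, _ = grad_log_p(x1[:, None], x2[None, :])  # (n, n): each x1 vs all x2
        _, g2 = grad_log_p(x1[None, :], x2[:, None])  # (n, n): each x2 vs all x1
        x1 = x1 + step * g1.mean(axis=1) + np.sqrt(2 * step) * rng.standard_normal(n)
        x2 = x2 + step * g2.mean(axis=1) + np.sqrt(2 * step) * rng.standard_normal(n)
    return x1, x2  # approximate draws from the product approximation q1 x q2

x1, x2 = mean_field_langevin()
# For this target the exact MF marginals are N(0, 1 - rho^2), so the
# empirical variances should approach 0.36.
print(x1.var(), x2.var())
```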
Related papers
- Moreau Envelope ADMM for Decentralized Weakly Convex Optimization [55.2289666758254]
This paper proposes a proximal variant of the alternating direction method of multipliers (ADMM) for distributed optimization.
The results of our numerical experiments indicate that our method is faster and more robust than widely used approaches.
arXiv Detail & Related papers (2023-08-31T14:16:30Z)
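The entry above is summarized too briefly to reproduce its decentralized Moreau-envelope algorithm, so here is a hedged sketch of the classical (centralized) proximal machinery it builds on: ADMM for a lasso toy problem, where the l1 prox (soft-thresholding) is exactly the minimizer defining the Moreau envelope of the l1 norm. Problem sizes and data are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Prox of t * ||.||_1: the minimizer in the Moreau envelope of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(A, b, lam=0.1, rho=1.0, n_iter=200):
    """Classical ADMM for min 0.5 * ||A x - b||^2 + lam * ||z||_1  s.t.  x = z."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)  # u is the scaled dual
    AtA = A.T @ A + rho * np.eye(n)                  # reused by every x-update
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(AtA, Atb + rho * (z - u))  # quadratic prox step
        z = soft_threshold(x + u, lam / rho)           # l1 prox step
        u = u + x - z                                  # dual update
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20); x_true[:3] = [1.0, -2.0, 0.5]
b = A @ x_true + 0.01 * rng.standard_normal(50)
print(np.round(admm_lasso(A, b), 2))  # sparse estimate near x_true
```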
- Particle Mean Field Variational Bayes [3.4355075318742165]
Mean Field Variational Bayes (MFVB) is one of the most computationally efficient techniques for Bayesian inference.
This paper proposes a novel particle-based MFVB approach that greatly expands the applicability of the MFVB method.
arXiv Detail & Related papers (2023-03-24T11:38:35Z)
- Interacting Particle Langevin Algorithm for Maximum Marginal Likelihood Estimation [2.53740603524637]
We develop a class of interacting particle systems for implementing a maximum marginal likelihood estimation procedure.
In particular, we prove that the parameter marginal of the stationary measure of this diffusion has the form of a Gibbs measure.
Using a particular rescaling, we then prove geometric ergodicity of this system and bound the discretisation error in a manner that is uniform in time and does not increase with the number of particles.
arXiv Detail & Related papers (2023-03-23T16:50:08Z)
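A minimal sketch of an interacting-particle Langevin scheme for marginal maximum likelihood, in the generic form such systems take (latent particles diffuse; the parameter follows the particle-averaged gradient with O(1/N)-scaled noise). The one-latent Gaussian model, step sizes, and particle count are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def ipla_sketch(y, n_particles=100, n_steps=5000, step=1e-3, seed=1):
    """Assumed model: z ~ N(theta, 1), y_j | z ~ N(z, 1) for j = 1..m.
    The marginal MLE for theta in this model is y.mean()."""
    rng = np.random.default_rng(seed)
    n, m = n_particles, len(y)
    theta, z = 0.0, rng.standard_normal(n)
    for _ in range(n_steps):
        grad_z = (y.sum() - m * z) - (z - theta)  # d/dz log p_theta(y, z), per particle
        grad_theta = np.mean(z - theta)           # particle-averaged d/dtheta
        z = z + step * grad_z + np.sqrt(2 * step) * rng.standard_normal(n)
        theta = (theta + step * grad_theta
                 + np.sqrt(2 * step / n) * rng.standard_normal())
    return theta

y = np.array([1.2, 0.7, 1.5, 0.9])
print(ipla_sketch(y))  # should be close to y.mean() = 1.075
```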
- D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory [79.50644650795012]
We propose a deep learning approach to solve Kohn-Sham Density Functional Theory (KS-DFT).
We prove that such an approach has the same expressivity as the SCF method, yet reduces the computational complexity.
In addition, we show that our approach enables us to explore more complex neural-based wave functions.
arXiv Detail & Related papers (2023-03-01T10:38:10Z)
- Gradient Flows for Sampling: Mean-Field Models, Gaussian Approximations and Affine Invariance [10.153270126742369]
We study gradient flows in both probability density space and Gaussian space.
The flow in the Gaussian space may be understood as a Gaussian approximation of the flow.
arXiv Detail & Related papers (2023-02-21T21:44:08Z)
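For the Gaussian-space flow described in the entry above, a hedged sketch: an Euler discretization of the Bures-Wasserstein gradient flow of KL(q || p) over Gaussians q = N(m, C). A Gaussian target is assumed so that the required expectations are closed-form; this is one standard formulation of such flows, not necessarily the paper's exact one.

```python
import numpy as np

def gaussian_wgf(mu_star, S_star, n_steps=2000, step=1e-2):
    """Bures-Wasserstein gradient flow of KL(q || p) for p = N(mu_star, S_star):
        dm/dt = E_q[grad log p(X)]              = -S^{-1} (m - mu_star)
        dC/dt = 2I + H C + C H,   H = E_q[hess log p(X)] = -S^{-1}."""
    d = len(mu_star)
    S_inv = np.linalg.inv(S_star)
    m, C = np.zeros(d), np.eye(d)
    for _ in range(n_steps):
        m = m - step * (S_inv @ (m - mu_star))
        C = C + step * (2 * np.eye(d) - S_inv @ C - C @ S_inv)
    return m, C

mu_star = np.array([1.0, -1.0])
S_star = np.array([[2.0, 0.5], [0.5, 1.0]])
m, C = gaussian_wgf(mu_star, S_star)
print(m, C, sep="\n")  # should approach mu_star and S_star
```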
- Aspects of scaling and scalability for flow-based sampling of lattice QCD [137.23107300589385]
Recent applications of machine-learned normalizing flows to sampling in lattice field theory suggest that such methods may be able to mitigate critical slowing down and topological freezing.
It remains to be determined whether they can be applied to state-of-the-art lattice quantum chromodynamics calculations.
arXiv Detail & Related papers (2022-11-14T17:07:37Z)
- Sampling with Mollified Interaction Energy Descent [57.00583139477843]
We present a new optimization-based method for sampling called mollified interaction energy descent (MIED).
MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs).
We show experimentally that for unconstrained sampling problems our algorithm performs on par with existing particle-based algorithms like SVGD.
arXiv Detail & Related papers (2022-10-24T16:54:18Z)
- On Representations of Mean-Field Variational Inference [2.4316550366482357]
We present a framework to analyze mean field variational inference (MFVI) algorithms.
Our approach enables the MFVI problem to be represented in three different ways.
Rigorous guarantees are established to show that a time-discretized implementation of the coordinate ascent variational inference algorithm yields a gradient flow in the limit.
arXiv Detail & Related papers (2022-10-20T16:26:22Z)
- Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- An application of the splitting-up method for the computation of a neural network representation for the solution for the filtering equations [68.8204255655161]
Filtering equations play a central role in many real-life applications, including numerical weather prediction, finance and engineering.
One of the classical approaches to approximating the solution of the filtering equations is a PDE-inspired method called the splitting-up method.
We combine this method with a neural network representation to produce an approximation of the unnormalised conditional distribution of the signal process.
arXiv Detail & Related papers (2022-01-10T11:01:36Z)
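To make the splitting-up idea in the last entry concrete, here is a toy grid-based version of the recursion (predict the density through the dynamics, then correct with the observation likelihood and renormalize). The linear-Gaussian model and grid are assumptions; the paper instead couples the splitting with a neural network representation for the continuous-time filtering equations.

```python
import numpy as np

def splitting_up_filter(ys, a=0.9, sig_x=0.5, sig_y=0.3):
    """Assumed model: X_k = a * X_{k-1} + sig_x * N(0, 1), Y_k = X_k + sig_y * N(0, 1).
    The filtering density is tracked on a fixed grid."""
    grid = np.linspace(-5.0, 5.0, 401)
    dx = grid[1] - grid[0]
    dens = np.exp(-grid**2 / 2.0)
    dens /= dens.sum() * dx                      # N(0, 1) prior on X_0
    # Transition kernel K[i, j] ~ p(x_i | x'_j); constants drop after renormalizing.
    K = np.exp(-(grid[:, None] - a * grid[None, :])**2 / (2 * sig_x**2))
    for y in ys:
        pred = (K @ dens) * dx                   # prediction (splitting step 1)
        lik = np.exp(-(y - grid)**2 / (2 * sig_y**2))
        dens = pred * lik                        # Bayes correction (splitting step 2)
        dens /= dens.sum() * dx
    return grid, dens

grid, dens = splitting_up_filter(np.array([0.4, 0.6, 0.5]))
print(grid[np.argmax(dens)])  # posterior mode after three observations
```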
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.