Related papers: Bayesian Model Selection via Mean-Field Variational Approximation

URL: http://arxiv.org/abs/2312.10607v1
Date: Sun, 17 Dec 2023 04:48:25 GMT
Title: Bayesian Model Selection via Mean-Field Variational Approximation
Authors: Yangfan Zhang, Yun Yang
Abstract summary: We study the non-asymptotic properties of mean-field (MF) inference under the Bayesian framework. We show a Bernstein von-Mises (BvM) theorem for the variational distribution from MF under possible model mis-specification.
Score: 10.433170683584994
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This article considers Bayesian model selection via mean-field (MF) variational approximation. Towards this goal, we study the non-asymptotic properties of MF inference under the Bayesian framework that allows latent variables and model mis-specification. Concretely, we show a Bernstein von-Mises (BvM) theorem for the variational distribution from MF under possible model mis-specification, which implies the distributional convergence of MF variational approximation to a normal distribution centered at the maximum likelihood estimator (within the specified model). Motivated by the BvM theorem, we propose a model selection criterion using the evidence lower bound (ELBO), and demonstrate that the model selected by ELBO tends to asymptotically agree with the one selected by the commonly used Bayesian information criterion (BIC) as sample size tends to infinity. Compared to BIC, ELBO tends to incur smaller approximation error to the log-marginal likelihood (a.k.a. model evidence) due to a better dimension dependence and full incorporation of the prior information. Moreover, we show the geometric convergence of the coordinate ascent variational inference (CAVI) algorithm under the parametric model framework, which provides practical guidance on how many iterations one typically needs to run when approximating the ELBO. These findings demonstrate that variational inference is capable of providing a computationally efficient alternative to conventional approaches in tasks beyond obtaining point estimates, which is also empirically demonstrated by our extensive numerical experiments.
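As a rough illustration of the geometric convergence of CAVI mentioned in the abstract, the following is a minimal sketch on a toy conjugate model, not the paper's setting or code. It assumes data x_i ~ N(mu, 1/tau) with priors mu | tau ~ N(mu0, 1/(lam0 * tau)) and tau ~ Gamma(a0, b0), and a mean-field family q(mu) q(tau). The function name `cavi_normal_gamma`, the hyperparameter defaults, and the data are all hypothetical choices made for this example.

```python
def cavi_normal_gamma(x, mu0=0.0, lam0=1.0, a0=1.0, b0=1.0, n_iter=8):
    """Toy CAVI for a Gaussian with unknown mean and precision.

    Mean-field family: q(mu) = N(mu_n, 1/lam_n), q(tau) = Gamma(a_n, b_n).
    Returns mu_n, the last lam_n, a_n, and the history of b_n values.
    """
    n = len(x)
    xbar = sum(x) / n

    # These two quantities do not depend on the other factor,
    # so they are fixed before the loop.
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
    a_n = a0 + (n + 1) / 2.0

    e_tau = 1.0  # arbitrary initial guess for E_q[tau] (an assumption)
    b_history = []
    lam_n = None
    for _ in range(n_iter):
        # Update q(mu) given the current E_q[tau].
        lam_n = (lam0 + n) * e_tau
        # Expected squared deviations under q(mu).
        data_term = sum((xi - mu_n) ** 2 + 1.0 / lam_n for xi in x)
        prior_term = lam0 * ((mu_n - mu0) ** 2 + 1.0 / lam_n)
        # Update q(tau) given the current q(mu).
        b_n = b0 + 0.5 * (data_term + prior_term)
        e_tau = a_n / b_n
        b_history.append(b_n)
    return mu_n, lam_n, a_n, b_history


if __name__ == "__main__":
    mu_n, lam_n, a_n, hist = cavi_normal_gamma([1.2, 0.8, 1.9, 0.4, 1.1])
    gaps = [abs(hist[i + 1] - hist[i]) for i in range(len(hist) - 1)]
    # Successive gaps shrinking by a constant factor indicate geometric convergence.
    print([g2 / g1 for g1, g2 in zip(gaps, gaps[1:4])])
```

In this particular toy model the update for b_n is a linear map of its previous value with slope 1/(2 * a_n), so the gaps between successive iterates shrink by that constant factor at every sweep. The paper's result concerns a more general parametric framework; this sketch only shows what a geometric rate looks like in the simplest case.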

Related papers

Stability of Mean-Field Variational Inference [3.5729687931166136]
Mean-field variational inference (MFVI) is a widely used method for approximating high-dimensional probability distributions by product measures. We show that the MFVI depends differentiably on the target potential and characterize the derivative by a partial differential equation.
arXiv Detail & Related papers (2025-06-09T15:21:37Z)
Variational Inference for Latent Variable Models in High Dimensions [4.3012765978447565]
We introduce a general framework for quantifying the statistical accuracy of mean-field variational inference (MFVI). We capture the exact regime where MFVI 'works' for the celebrated latent Dirichlet allocation model. Our proof techniques, which extend the framework of nonlinear large deviations, open the door for the analysis of MFVI in other latent variable models.
arXiv Detail & Related papers (2025-06-02T17:19:58Z)
Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
Continuous Bayesian Model Selection for Multivariate Causal Discovery [22.945274948173182]
Current causal discovery approaches require restrictive model assumptions or assume access to interventional data to ensure structure identifiability. Recent work has shown that Bayesian model selection can greatly improve accuracy by exchanging restrictive modelling for more flexible assumptions. We demonstrate the competitiveness of our approach on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-11-15T12:55:05Z)
Stochastic Sampling from Deterministic Flow Models [8.849981177332594]
We present a method to turn flow models into a family of stochastic differential equations (SDEs) that have the same marginal distributions. We empirically demonstrate advantages of our method on a toy Gaussian setup and on the large scale ImageNet generation task.
arXiv Detail & Related papers (2024-10-03T05:18:28Z)
Flow matching achieves almost minimax optimal convergence [50.38891696297888]
Flow matching (FM) has gained significant attention as a simulation-free generative model. This paper discusses the convergence properties of FM for large sample size under the $p$-Wasserstein distance. We establish that FM can achieve an almost minimax optimal convergence rate for $1 \leq p \leq 2$, presenting the first theoretical evidence that FM can reach convergence rates comparable to those of diffusion models.
arXiv Detail & Related papers (2024-05-31T14:54:51Z)
Estimating the Number of Components in Finite Mixture Models via Variational Approximation [8.468023518807408]
We introduce a new method for selecting the number of components in finite mixture models (FMMs) using variational Bayes. We establish matching upper and lower bounds for the Evidence Lower Bound (ELBO) derived from mean-field (MF) variational approximation. As a by-product of our proof, we demonstrate that the MF approximation inherits the stable behavior (benefited from model singularity) of the posterior distribution.
arXiv Detail & Related papers (2024-04-25T17:00:24Z)
Diffusion models for probabilistic programming [56.47577824219207]
Diffusion Model Variational Inference (DMVI) is a novel method for automated approximate inference in probabilistic programming languages (PPLs). DMVI is easy to implement, allows hassle-free inference in PPLs without the drawbacks of, e.g., variational inference using normalizing flows, and does not make any constraints on the underlying neural network model.
arXiv Detail & Related papers (2023-11-01T12:17:05Z)
Bivariate Causal Discovery using Bayesian Model Selection [11.726586969589]
We show how to incorporate causal assumptions within the Bayesian framework. This enables us to construct models with realistic assumptions. We then outperform previous methods on a wide range of benchmark datasets.
arXiv Detail & Related papers (2023-06-05T14:51:05Z)
Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region. Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult. Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO). We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPN). We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial, the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z)
Control as Hybrid Inference [62.997667081978825]
We present an implementation of CHI which naturally mediates the balance between iterative and amortised inference. We verify the scalability of our algorithm on a continuous control benchmark, demonstrating that it outperforms strong model-free and model-based baselines.
arXiv Detail & Related papers (2020-07-11T19:44:09Z)
Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions. Motivated by these theoretical results, we propose learning several approximate proposals for the best model. In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)

This list is automatically generated from the titles and abstracts of the papers on this site.