Probabilistic Circuits for Variational Inference in Discrete Graphical
Models
- URL: http://arxiv.org/abs/2010.11446v1
- Date: Thu, 22 Oct 2020 05:04:38 GMT
- Title: Probabilistic Circuits for Variational Inference in Discrete Graphical
Models
- Authors: Andy Shih, Stefano Ermon
- Abstract summary: Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating gradients of the Evidence Lower Bound (ELBO), but they suffer from high bias or variance.
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPNs), to compute these gradients exactly.
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial the corresponding ELBO can be computed analytically.
- Score: 101.28528515775842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Inference in discrete graphical models with variational methods is difficult
because of the inability to re-parameterize gradients of the Evidence Lower
Bound (ELBO). Many sampling-based methods have been proposed for estimating
these gradients, but they suffer from high bias or variance. In this paper, we
propose a new approach that leverages the tractability of probabilistic circuit
models, such as Sum Product Networks (SPN), to compute ELBO gradients exactly
(without sampling) for a certain class of densities. In particular, we show
that selective-SPNs are suitable as an expressive variational distribution, and
prove that when the log-density of the target model is a polynomial the
corresponding ELBO can be computed analytically. To scale to graphical models
with thousands of variables, we develop an efficient and effective construction
of selective-SPNs with size $O(kn)$, where $n$ is the number of variables and
$k$ is an adjustable hyperparameter. We demonstrate our approach on three types
of graphical models -- Ising models, Latent Dirichlet Allocation, and factor
graphs from the UAI Inference Competition. Selective-SPNs give a better lower
bound than mean-field and structured mean-field, and are competitive with
approximations that do not provide a lower bound, such as Loopy Belief
Propagation and Tree-Reweighted Belief Propagation. Our results show that
probabilistic circuits are promising tools for variational inference in
discrete graphical models as they combine tractability and expressivity.
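
To make the analytic-ELBO claim concrete, here is a minimal sketch (not the paper's code) of the simplest case: an Ising model, whose unnormalized log-density is a degree-2 polynomial in x in {0,1}^n, paired with a fully factorized (mean-field) Bernoulli variational distribution rather than the paper's selective-SPN construction. Under independence, E_q[x_i] = mu_i and E_q[x_i x_j] = mu_i mu_j for i != j, so the expected log-density, the entropy, and hence the ELBO all have closed forms and require no sampling. The function and variable names below are illustrative assumptions.

```python
import numpy as np

def analytic_elbo(h, J, mu, eps=1e-12):
    """Sampling-free ELBO for an Ising model with unnormalized log-density
    log p~(x) = h . x + sum_{i<j} J[i, j] * x_i * x_j over x in {0, 1}^n,
    under a fully factorized Bernoulli q with marginals mu."""
    mu = np.clip(mu, eps, 1.0 - eps)
    # Closed-form expectation of the degree-2 polynomial: E_q[x_i] = mu_i and
    # E_q[x_i x_j] = mu_i * mu_j for i != j, by independence under q.
    expected_log_p = h @ mu + (np.triu(J, k=1) * np.outer(mu, mu)).sum()
    # Closed-form entropy of a product of Bernoulli distributions.
    entropy = -(mu * np.log(mu) + (1.0 - mu) * np.log1p(-mu)).sum()
    return expected_log_p + entropy  # lower bounds the log partition function

# Toy usage: a random 5-variable Ising model with uniform initial marginals.
rng = np.random.default_rng(0)
n = 5
h = rng.normal(size=n)
J = 0.1 * rng.normal(size=(n, n))
print(analytic_elbo(h, J, np.full(n, 0.5)))
```

The same closed-form expectation argument extends to richer variational families: what selective-SPNs add over this mean-field sketch is an expressive, structured q for which polynomial expectations remain tractable.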
Related papers
- Proximal Interacting Particle Langevin Algorithms [0.0]
We introduce Proximal Interacting Particle Langevin Algorithms (PIPLA) for inference and learning in latent variable models.
We propose several variants within the novel proximal IPLA family, tailored to the problem of estimating parameters in a non-differentiable statistical model.
Our theory and experiments together show that the PIPLA family can be the de facto choice for parameter estimation in non-differentiable latent variable models.
arXiv Detail & Related papers (2024-06-20T13:16:41Z)
- On the Trajectory Regularity of ODE-based Diffusion Sampling [79.17334230868693]
Diffusion-based generative models use differential equations to establish a smooth connection between a complex data distribution and a tractable prior distribution.
In this paper, we identify several intriguing trajectory properties in the ODE-based sampling process of diffusion models.
arXiv Detail & Related papers (2024-05-18T15:59:41Z)
- Neural Gaussian Similarity Modeling for Differential Graph Structure Learning [24.582257964387402]
We construct a differentiable graph structure learning model by replacing non-differentiable nearest neighbor sampling with differentiable sampling.
Bell-shaped Gaussian Similarity (GauSim) modeling is further proposed to sample non-nearest neighbors.
We develop a scalable method by transferring the large-scale graph to the transition graph to significantly reduce the complexity.
arXiv Detail & Related papers (2023-12-15T02:45:33Z)
- Matching Normalizing Flows and Probability Paths on Manifolds [57.95251557443005]
Continuous Normalizing Flows (CNFs) are generative models that transform a prior distribution into a model distribution by solving an ordinary differential equation (ODE).
We propose to train CNFs by minimizing probability path divergence (PPD), a novel family of divergences between the probability density path generated by the CNF and a target probability density path.
We show that CNFs learned by minimizing PPD achieve state-of-the-art results in likelihoods and sample quality on existing low-dimensional manifold benchmarks.
arXiv Detail & Related papers (2022-07-11T08:50:19Z)
- Optimizing differential equations to fit data and predict outcomes [0.0]
Recent technical advances in automatic differentiation through numerical differential equation solvers potentially change the fitting process into a relatively easy problem.
This article illustrates how to overcome a variety of common challenges, using the classic ecological data for oscillations in hare and lynx populations.
arXiv Detail & Related papers (2022-04-16T16:08:08Z)
- Score-based Generative Modeling of Graphs via the System of Stochastic Differential Equations [57.15855198512551]
We propose a novel score-based generative model for graphs with a continuous-time framework.
We show that our method is able to generate molecules that lie close to the training distribution yet do not violate the chemical valency rule.
arXiv Detail & Related papers (2022-02-05T08:21:04Z)
- Scalable mixed-domain Gaussian process modeling and model reduction for longitudinal data [5.00301731167245]
We derive a basis function approximation scheme for mixed-domain covariance functions.
We show that we can approximate the exact GP model accurately in a fraction of the runtime.
We also demonstrate a scalable model reduction workflow for obtaining smaller and more interpretable models.
arXiv Detail & Related papers (2021-11-03T04:47:37Z)
- Gaussian Process Latent Class Choice Models [7.992550355579791]
We present a non-parametric class of probabilistic machine learning models within discrete choice models (DCMs).
The proposed model would assign individuals probabilistically to behaviorally homogeneous clusters (latent classes) using GPs.
The model is tested on two different mode choice applications and compared against different LCCM benchmarks.
arXiv Detail & Related papers (2021-01-28T19:56:42Z)
- Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, this divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z)
- Stein Variational Inference for Discrete Distributions [70.19352762933259]
We propose a simple yet general framework that transforms discrete distributions to equivalent piecewise continuous distributions.
Our method outperforms traditional algorithms such as Gibbs sampling and discontinuous Hamiltonian Monte Carlo.
We demonstrate that our method provides a promising tool for learning ensembles of binarized neural networks (BNNs).
In addition, such transform can be straightforwardly employed in gradient-free kernelized Stein discrepancy to perform goodness-of-fit (GOF) test on discrete distributions.
arXiv Detail & Related papers (2020-03-01T22:45:41Z)
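
The Stein Variational Inference for Discrete Distributions entry above describes turning a discrete target into an equivalent piecewise continuous one so that gradient-based inference can run on it. The sketch below is a hedged illustration of one generic construction, assumed here rather than taken from that paper: pair a standard Gaussian base with the thresholding map x(y) = 1[y > 0] and reweight each orthant by the probability of the binary configuration that owns it; thresholding samples of y at zero recovers the discrete distribution exactly. All names (discrete_to_piecewise_continuous, p_table) are hypothetical.

```python
import numpy as np

def discrete_to_piecewise_continuous(p_table):
    """Map a distribution over x in {0,1}^d (given as a table of shape (2,)*d)
    to a piecewise continuous density on R^d: a standard Gaussian base,
    reweighted on each orthant by the probability of the configuration
    x(y) = 1[y > 0]. Each orthant has base mass 0.5**d, so dividing by that
    mass keeps the density normalized."""
    d = p_table.ndim

    def log_density(y):
        y = np.asarray(y, dtype=float)
        x = (y > 0).astype(int)
        log_base = -0.5 * (y ** 2).sum() - 0.5 * d * np.log(2 * np.pi)
        return np.log(p_table[tuple(x)]) + log_base - d * np.log(0.5)

    def sample(rng, n):
        flat = p_table.ravel()
        idx = rng.choice(flat.size, size=n, p=flat)      # draw discrete x ~ p
        x = np.stack(np.unravel_index(idx, p_table.shape), axis=-1)
        half_normal = np.abs(rng.normal(size=x.shape))   # half-normal per coordinate
        return np.where(x == 1, half_normal, -half_normal)

    return log_density, sample

# Toy usage: a correlated distribution over two binary variables.
p = np.array([[0.4, 0.1],
              [0.1, 0.4]])
log_density, sample = discrete_to_piecewise_continuous(p)
rng = np.random.default_rng(0)
y = sample(rng, 4)
print(y)
print([round(log_density(yi), 3) for yi in y])
```

Unlike a piecewise-constant relaxation, the Gaussian base keeps the log-density gradient nonzero within each orthant, which is what gradient-based (Stein-type) updates need.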
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.