Automatically Marginalized MCMC in Probabilistic Programming
- URL: http://arxiv.org/abs/2302.00564v2
- Date: Thu, 1 Jun 2023 18:52:46 GMT
- Title: Automatically Marginalized MCMC in Probabilistic Programming
- Authors: Jinlin Lai, Javier Burroni, Hui Guan, Daniel Sheldon
- Abstract summary: Hamiltonian Monte Carlo (HMC) is a powerful algorithm to sample latent variables from Bayesian models.
We propose to use automatic marginalization as part of the sampling process using HMC in a graphical model extracted from a PPL.
- Score: 12.421267523795114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hamiltonian Monte Carlo (HMC) is a powerful algorithm to sample latent
variables from Bayesian models. The advent of probabilistic programming
languages (PPLs) frees users from writing inference algorithms and lets users
focus on modeling. However, many models are difficult for HMC to solve
directly, and often require tricks like model reparameterization. We are
motivated by the fact that many of those models could be simplified by
marginalization. We propose to use automatic marginalization as part of the
sampling process using HMC in a graphical model extracted from a PPL, which
substantially improves sampling from real-world hierarchical models.
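To make the idea concrete, here is a minimal sketch (not the paper's implementation; the model and all names below are illustrative) of how marginalizing a conjugate Gaussian latent simplifies a hierarchical model: integrating out the per-group means theta_i leaves a model over mu alone, which is far easier for a sampler like HMC to explore.

```python
# Minimal sketch: in the hierarchy
#   theta_i ~ N(mu, tau^2),  y_i ~ N(theta_i, sigma^2),
# the conjugate latents theta_i can be integrated out analytically:
#   y_i | mu ~ N(mu, tau^2 + sigma^2),
# so a sampler only needs to explore mu, not the n latent theta_i.
import numpy as np

rng = np.random.default_rng(0)
mu, tau, sigma, n = 1.0, 2.0, 0.5, 100_000

# Joint model: mu -> theta_i -> y_i, with theta_i sampled explicitly.
theta = rng.normal(mu, tau, size=n)
y_joint = rng.normal(theta, sigma)

# Marginalized model: theta_i integrated out in closed form.
y_marg = rng.normal(mu, np.sqrt(tau**2 + sigma**2), size=n)

# The two generative processes match in distribution (compare moments).
print(y_joint.mean(), y_marg.mean())  # both approx. 1.0
print(y_joint.std(), y_marg.std())    # both approx. sqrt(4.25) = 2.06
```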
Related papers
- Hamiltonian Monte Carlo Inference of Marginalized Linear Mixed-Effects Models [26.534682620182707]
We develop an algorithm to easily marginalize random effects in linear mixed-effects models.
A naive approach introduces cubic-time operations within an inference algorithm like Hamiltonian Monte Carlo (HMC).
We show that marginalization is always beneficial when applicable and highlight improvements in various models; a minimal sketch of the marginalized likelihood computation appears after this list.
arXiv Detail & Related papers (2024-10-31T16:16:18Z)
- EMR-Merging: Tuning-Free High-Performance Model Merging [55.03509900949149]
We show that Elect, Mask & Rescale-Merging (EMR-Merging) achieves outstanding performance compared to existing merging methods.
EMR-Merging is tuning-free, requiring no data availability or additional training, while still showing impressive performance.
arXiv Detail & Related papers (2024-05-23T05:25:45Z)
- Generative Marginalization Models [21.971818180264943]
Marginalization models (MAMs) are a new family of generative models for high-dimensional discrete data.
They offer scalable and flexible generative modeling by explicitly modeling all induced marginal distributions.
For energy-based training tasks, MAMs enable any-order generative modeling of high-dimensional problems beyond the scale of previous methods.
arXiv Detail & Related papers (2023-10-19T17:14:29Z)
- Exact and general decoupled solutions of the LMC Multitask Gaussian Process model [28.32223907511862]
The Linear Model of Co-regionalization (LMC) is a very general model of multitask Gaussian processes for regression or classification.
Recent work has shown that under some conditions the latent processes of the model can be decoupled, leading to a complexity that is only linear in the number of said processes.
We here extend these results, showing under the most general assumptions that the only condition necessary for an efficient exact computation of the LMC is a mild hypothesis on the noise model.
arXiv Detail & Related papers (2023-10-18T15:16:24Z)
- Efficient Propagation of Uncertainty via Reordering Monte Carlo Samples [0.7087237546722617]
Uncertainty propagation (UP) is a technique to determine model output uncertainties based on the uncertainty in its input variables.
In this work, we investigate the hypothesis that while all samples are useful on average, some samples must be more useful than others.
We introduce a methodology to adaptively reorder MC samples and show how it reduces the computational expense of UP.
arXiv Detail & Related papers (2023-02-09T21:28:15Z)
- Fully Bayesian Autoencoders with Latent Sparse Gaussian Processes [23.682509357305406]
Autoencoders and their variants are among the most widely used models in representation learning and generative modeling.
We propose a novel Sparse Gaussian Process Bayesian Autoencoder model in which we impose fully sparse Gaussian Process priors on the latent space of a Bayesian Autoencoder.
arXiv Detail & Related papers (2023-02-09T09:57:51Z)
- Predictable MDP Abstraction for Unsupervised Model-Based RL [93.91375268580806]
We propose predictable MDP abstraction (PMA).
Instead of training a predictive model on the original MDP, we train a model on a transformed MDP with a learned action space.
We theoretically analyze PMA and empirically demonstrate that PMA leads to significant improvements over prior unsupervised model-based RL approaches.
arXiv Detail & Related papers (2023-02-08T07:37:51Z)
- Low-variance estimation in the Plackett-Luce model via quasi-Monte Carlo sampling [58.14878401145309]
We develop a novel approach to producing more sample-efficient estimators of expectations in the Plackett-Luce (PL) model; a minimal sketch of PL sampling appears after this list.
We illustrate our findings both theoretically and empirically using real-world recommendation data from Amazon Music and the Yahoo learning-to-rank challenge.
arXiv Detail & Related papers (2022-05-12T11:15:47Z)
- Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z)
- Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this gradient-based approach outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z)
- Learning Gaussian Graphical Models via Multiplicative Weights [54.252053139374205]
We adapt an algorithm of Klivans and Meka based on the method of multiplicative weight updates.
The algorithm enjoys a sample complexity bound that is qualitatively similar to others in the literature.
It has a low runtime $O(mp^2)$ in the case of $m$ samples and $p$ nodes, and can trivially be implemented in an online manner.
arXiv Detail & Related papers (2020-02-20T10:50:58Z)
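Referenced from the linear mixed-effects entry above: a minimal sketch, under the standard assumption y = X beta + Z b + eps with b ~ N(0, G) and eps ~ N(0, s2 I), of the marginalized likelihood such methods need. All names are illustrative, not the paper's API; the point is that the Woodbury identity and the matrix determinant lemma avoid the naive O(n^3) cost in the number of observations n, paying only O(n q^2) for q random effects.

```python
# Marginalizing b gives y ~ N(X @ beta, Z G Z^T + s2 I); evaluate that
# density without ever forming or factoring the n x n covariance.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
n, p, q, s2 = 200, 3, 5, 0.5
X, Z = rng.normal(size=(n, p)), rng.normal(size=(n, q))
beta = rng.normal(size=p)
G = np.eye(q)  # random-effect covariance (identity for simplicity)
y = X @ beta + Z @ rng.normal(size=q) + rng.normal(scale=np.sqrt(s2), size=n)

def marginal_loglik_woodbury(y, X, Z, beta, G, s2):
    r = y - X @ beta
    K = np.linalg.inv(G) + (Z.T @ Z) / s2               # q x q capacitance
    # Woodbury: Sigma^{-1} r = r/s2 - Z K^{-1} Z^T r / s2^2
    Sir = r / s2 - Z @ np.linalg.solve(K, Z.T @ r) / s2**2
    quad = r @ Sir
    # Matrix determinant lemma: log|Sigma| = log|K| + log|G| + n log s2
    _, logdetK = np.linalg.slogdet(K)
    _, logdetG = np.linalg.slogdet(G)
    logdet = logdetK + logdetG + n * np.log(s2)
    return -0.5 * (quad + logdet + n * np.log(2 * np.pi))

# Check against the naive O(n^3) evaluation of the same density.
Sigma = Z @ G @ Z.T + s2 * np.eye(n)
naive = multivariate_normal(mean=X @ beta, cov=Sigma).logpdf(y)
print(np.isclose(marginal_loglik_woodbury(y, X, Z, beta, G, s2), naive))
```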
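Referenced from the Plackett-Luce entry above: a minimal sketch (not the paper's estimator) of Monte Carlo estimation in the PL model. A PL ranking over k items with scores w can be drawn by perturbing log w with Gumbel noise and sorting; swapping i.i.d. uniforms for a scrambled Sobol sequence is one simple quasi-Monte Carlo variant. All names below are illustrative.

```python
import numpy as np
from scipy.stats import qmc

rng = np.random.default_rng(2)
k = 5
log_w = rng.normal(size=k)             # item log-scores

def pl_rankings_from_uniforms(u, log_w):
    """Map uniforms in (0,1) to PL rankings via the Gumbel trick."""
    gumbel = -np.log(-np.log(u))       # inverse CDF of Gumbel(0, 1)
    return np.argsort(-(log_w + gumbel), axis=1)  # best-first rankings

# Example expectation: probability that item 0 is ranked first.
def top1_is_item0(rankings):
    return (rankings[:, 0] == 0).mean()

m = 12                                 # 2**12 = 4096 samples
u_mc = rng.uniform(size=(2**m, k))
u_qmc = qmc.Sobol(d=k, scramble=True, seed=3).random_base2(m=m)
u_qmc = np.clip(u_qmc, 1e-12, 1 - 1e-12)  # keep the logs well-defined

print("MC estimate: ", top1_is_item0(pl_rankings_from_uniforms(u_mc, log_w)))
print("QMC estimate:", top1_is_item0(pl_rankings_from_uniforms(u_qmc, log_w)))
# Exact softmax probability for comparison:
print("exact:       ", np.exp(log_w[0]) / np.exp(log_w).sum())
```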
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.