Differentiating Metropolis-Hastings to Optimize Intractable Densities
- URL: http://arxiv.org/abs/2306.07961v3
- Date: Fri, 30 Jun 2023 20:33:36 GMT
- Title: Differentiating Metropolis-Hastings to Optimize Intractable Densities
- Authors: Gaurav Arya, Ruben Seyer, Frank Schäfer, Kartik Chandra, Alexander K. Lew, Mathieu Huot, Vikash K. Mansinghka, Jonathan Ragan-Kelley, Christopher Rackauckas and Moritz Schauer
- Abstract summary: We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers.
We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
- Score: 51.16801956665228
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop an algorithm for automatic differentiation of Metropolis-Hastings
samplers, allowing us to differentiate through probabilistic inference, even if
the model has discrete components within it. Our approach fuses recent advances
in stochastic automatic differentiation with traditional Markov chain coupling
schemes, providing an unbiased and low-variance gradient estimator. This allows
us to apply gradient-based optimization to objectives expressed as expectations
over intractable target densities. We demonstrate our approach by finding an
ambiguous observation in a Gaussian mixture model and by maximizing the
specific heat in an Ising model.
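To make the objective concrete, the sketch below pairs a plain random-walk Metropolis-Hastings sampler with a common-random-numbers finite-difference gradient. This is a hypothetical illustration only, not the paper's stochastic-AD-plus-coupling estimator: the target density, proposal scale, seed handling, and the function f are assumptions made for the example, and the finite-difference baseline is biased where the paper's estimator is unbiased.

```python
# Hypothetical illustration (not the paper's algorithm): estimate the gradient of
# an expectation E_{p_theta}[f(x)] over a Metropolis-Hastings sampler by running
# two chains at theta + eps and theta - eps with shared random numbers (a crude
# coupling) and taking a finite difference.
import numpy as np

def log_target(x, theta):
    # Unnormalized two-component Gaussian mixture; theta shifts one mode.
    return np.logaddexp(-0.5 * (x - theta) ** 2, -0.5 * (x + 2.0) ** 2)

def mh_chain(theta, n_steps, seed, x0=0.0, step=1.0):
    # Random-walk Metropolis-Hastings driven by a fixed seed so that two
    # chains at nearby theta values share their randomness.
    rng = np.random.default_rng(seed)
    x, xs = x0, []
    for _ in range(n_steps):
        prop = x + step * rng.normal()
        if np.log(rng.uniform()) < log_target(prop, theta) - log_target(x, theta):
            x = prop
        xs.append(x)
    return np.array(xs)

def grad_expectation(theta, f, eps=1e-3, n_steps=20_000, seed=0):
    # Finite-difference estimate of d/dtheta E[f(x)] using common random numbers.
    xs_plus = mh_chain(theta + eps, n_steps, seed)
    xs_minus = mh_chain(theta - eps, n_steps, seed)
    return (f(xs_plus).mean() - f(xs_minus).mean()) / (2 * eps)

print(grad_expectation(1.0, f=lambda x: x))  # sensitivity of the chain's mean to theta
```

This baseline only shows the quantity being differentiated, d/dtheta E_{p_theta}[f(x)]; the paper's contribution is an unbiased, low-variance estimator of this gradient that also handles models with discrete components.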
Related papers
- Improving Probabilistic Diffusion Models With Optimal Covariance Matching [27.2761325416843]
We introduce a novel method for learning the diagonal covariances.
We show how our method can substantially enhance the sampling efficiency, recall rate and likelihood of both diffusion models and latent diffusion models.
arXiv Detail & Related papers (2024-06-16T05:47:12Z) - Covariance-Adaptive Sequential Black-box Optimization for Diffusion Targeted Generation [60.41803046775034]
We show how to perform user-preferred targeted generation via diffusion models using only black-box target scores provided by users.
Experiments on both numerical test problems and target-guided 3D-molecule generation tasks show the superior performance of our method in achieving better target scores.
arXiv Detail & Related papers (2024-06-02T17:26:27Z) - Score-based Continuous-time Discrete Diffusion Models [102.65769839899315]
We extend diffusion models to discrete variables by introducing a Markov jump process where the reverse process denoises via a continuous-time Markov chain.
We show that an unbiased estimator can be obtained by simply matching the conditional marginal distributions.
We demonstrate the effectiveness of the proposed method on a set of synthetic and real-world music and image benchmarks.
arXiv Detail & Related papers (2022-11-30T05:33:29Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer steps for sampling.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - Adaptive Perturbation-Based Gradient Estimation for Discrete Latent Variable Models [28.011868604717726]
We present Adaptive IMLE, the first adaptive gradient estimator for complex discrete distributions.
We show that our estimator can produce faithful estimates while requiring orders of magnitude fewer samples than other gradient estimators.
arXiv Detail & Related papers (2022-09-11T13:32:39Z) - A Stochastic Newton Algorithm for Distributed Convex Optimization [62.20732134991661]
We analyze a Newton algorithm for homogeneous distributed convex optimization, where each machine can calculate gradients of the same population objective.
We show that our method can reduce the number and frequency of required communication rounds compared to existing methods without hurting performance.
arXiv Detail & Related papers (2021-10-07T17:51:10Z) - Model Selection for Bayesian Autoencoders [25.619565817793422]
We propose to optimize the distributional sliced-Wasserstein distance between the output of the autoencoder and the empirical data distribution (a minimal sliced-Wasserstein sketch follows after this list).
We turn our Bayesian autoencoder (BAE) into a generative model by fitting a flexible Dirichlet mixture model in the latent space.
We evaluate our approach qualitatively and quantitatively using a vast experimental campaign on a number of unsupervised learning tasks and show that, in small-data regimes where priors matter, our approach provides state-of-the-art results.
arXiv Detail & Related papers (2021-06-11T08:55:00Z) - Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
The proposed sampler uses gradient information from the target distribution to guide proposals over discrete variables; we show that this approach outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z) - Unbiased Gradient Estimation for Variational Auto-Encoders using Coupled Markov Chains [34.77971292478243]
The variational auto-encoder (VAE) is a deep latent variable model that has two neural networks in an autoencoder-like architecture.
We develop a training scheme for VAEs by introducing unbiased estimators of the log-likelihood gradient.
We show experimentally that VAEs fitted with unbiased estimators exhibit better predictive performance.
arXiv Detail & Related papers (2020-10-05T08:11:55Z)
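The sliced-Wasserstein objective mentioned in the Bayesian autoencoder entry above can be illustrated with a short sketch. The code below computes the plain sliced-Wasserstein distance between two sample sets via random one-dimensional projections; the paper optimizes a distributional variant, which this illustration does not implement, and the sample data and projection count are assumptions made for the example.

```python
# Minimal sketch of the (plain) sliced-Wasserstein distance between two
# equal-size sample sets via random 1-D projections.
import numpy as np

def sliced_wasserstein(x, y, n_projections=200, p=2, seed=None):
    # x, y: arrays of shape (n_samples, dim) with the same n_samples.
    rng = np.random.default_rng(seed)
    dim = x.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.normal(size=dim)
        theta /= np.linalg.norm(theta)   # random direction on the unit sphere
        proj_x = np.sort(x @ theta)      # in 1-D, the p-Wasserstein distance is the
        proj_y = np.sort(y @ theta)      # L^p distance between sorted samples
        total += np.mean(np.abs(proj_x - proj_y) ** p)
    return (total / n_projections) ** (1.0 / p)

rng = np.random.default_rng(0)
a = rng.normal(size=(500, 4))
b = rng.normal(loc=0.5, size=(500, 4))
print(sliced_wasserstein(a, b))
```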
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.