Structured Stochastic Gradient MCMC
- URL: http://arxiv.org/abs/2107.09028v1
- Date: Mon, 19 Jul 2021 17:18:10 GMT
- Title: Structured Stochastic Gradient MCMC
- Authors: Antonios Alexos, Alex Boyd, Stephan Mandt
- Abstract summary: We propose a new non-parametric variational approximation that makes no assumptions about the approximate posterior's functional form.
We obtain better predictive likelihoods and larger effective sample sizes than full SGMCMC.
- Score: 20.68905354115655
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Stochastic gradient Markov chain Monte Carlo (SGMCMC) is considered the gold
standard for Bayesian inference in large-scale models, such as Bayesian neural
networks. Since practitioners face speed versus accuracy tradeoffs in these
models, variational inference (VI) is often the preferable option.
Unfortunately, VI makes strong assumptions on both the factorization and
functional form of the posterior. In this work, we propose a new non-parametric
variational approximation that makes no assumptions about the approximate
posterior's functional form and allows practitioners to specify the exact
dependencies the algorithm should respect or break. The approach relies on a
new Langevin-type algorithm that operates on a modified energy function, where
parts of the latent variables are averaged over samples from earlier iterations
of the Markov chain. This way, statistical dependencies can be broken in a
controlled way, allowing the chain to mix faster. This scheme can be further
modified in a "dropout" manner, leading to even more scalability. By
implementing the scheme on a ResNet-20 architecture, we obtain better
predictive likelihoods and larger effective sample sizes than full SGMCMC.
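To make the core update concrete, below is a minimal sketch of the dependency-breaking Langevin step described in the abstract, written for a toy linear-Gaussian model: each block of parameters is updated against a modified energy in which the other block is averaged over samples cached from earlier iterations of the chain. The model, variable names, cache size, step size, and the use of full-data (rather than minibatch) gradients are illustrative assumptions, not the paper's ResNet-20 setup.
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-Gaussian model (an illustrative assumption, not the paper's
# ResNet-20 setting): y ~ N(a*x + b, 1), standard normal priors on a and b.
x = rng.normal(size=200)
y = 2.0 * x + 0.5 + rng.normal(size=200)

def grad_a(a, b):
    """dU/da of the negative log-posterior U(a, b) (prior term + likelihood term)."""
    return a - np.sum((y - (a * x + b)) * x)

def grad_b(a, b):
    """dU/db of the negative log-posterior U(a, b)."""
    return b - np.sum(y - (a * x + b))

def langevin_step(theta, grad, step):
    """Plain Langevin update: theta <- theta - step*grad + N(0, 2*step) noise."""
    return theta - step * grad + rng.normal(scale=np.sqrt(2.0 * step))

a, b = 0.0, 0.0
cache_a, cache_b = [a], [b]
step = 1e-4
for _ in range(2000):
    # Update block a against a *modified* energy in which block b is averaged
    # over samples cached from earlier iterations of the chain, breaking the
    # a-b dependency in a controlled way.
    a = langevin_step(a, np.mean([grad_a(a, bj) for bj in cache_b]), step)
    cache_a = (cache_a + [a])[-20:]
    # Symmetric update for block b, using cached samples of block a.
    b = langevin_step(b, np.mean([grad_b(ai, b) for ai in cache_a]), step)
    cache_b = (cache_b + [b])[-20:]

print("posterior mean estimates: a=%.2f, b=%.2f" % (a, b))
```
In the paper's setting, the blocks would be groups of network weights and the gradients would be stochastic minibatch estimates; averaging over cached samples of the other block is what allows dependencies to be broken in a controlled way so the chain mixes faster.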
Related papers
- Learning variational autoencoders via MCMC speed measures [7.688686113950604]
Variational autoencoders (VAEs) are popular likelihood-based generative models.
This work suggests an entropy-based adaptation for a short-run Metropolis-adjusted Langevin (MALA) or Hamiltonian Monte Carlo (HMC) chain.
Experiments show that this approach yields higher held-out log-likelihoods as well as improved generative metrics.
arXiv Detail & Related papers (2023-08-26T02:15:51Z) - Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of the fully factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z) - Langevin Autoencoders for Learning Deep Latent Variable Models [27.60436426879683]
Based on amortized Langevin dynamics (ALD), we present a new deep latent variable model named the Langevin autoencoder (LAE).
arXiv Detail & Related papers (2022-09-15T04:26:22Z) - Scalable Stochastic Parametric Verification with Stochastic Variational
Smoothed Model Checking [1.5293427903448025]
Smoothed model checking (smMC) aims at inferring the satisfaction function over the entire parameter space from a limited set of observations.
In this paper, we exploit recent advances in probabilistic machine learning to push this limitation forward.
We compare the performance of smMC against that of SV-smMC in terms of scalability, computational efficiency, and accuracy of the reconstructed satisfaction function.
arXiv Detail & Related papers (2022-05-11T10:43:23Z) - A new perspective on probabilistic image modeling [92.89846887298852]
We present a new probabilistic approach for image modeling capable of density estimation, sampling and tractable inference.
DCGMMs (deep convolutional Gaussian mixture models) can be trained end-to-end by SGD from random initial conditions, much like CNNs.
We show that DCGMMs compare favorably to several recent PC and SPN models in terms of inference, classification and sampling.
arXiv Detail & Related papers (2022-03-21T14:53:57Z) - What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo (HMC) can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z) - Anomaly Detection of Time Series with Smoothness-Inducing Sequential
Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series.
Our model parameterizes mean and variance for each time-stamp with flexible neural networks.
We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z) - Gaussian MRF Covariance Modeling for Efficient Black-Box Adversarial
Attacks [86.88061841975482]
We study the problem of generating adversarial examples in a black-box setting, where we only have access to a zeroth order oracle.
We use this setting to find fast one-step adversarial attacks, akin to a black-box version of the Fast Gradient Sign Method (FGSM).
We show that the method uses fewer queries and achieves higher attack success rates than the current state of the art.
arXiv Detail & Related papers (2020-10-08T18:36:51Z) - An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian approximated gradient MCMC method to incorporate local geometric information while sampling from the posterior.
We adopt a magnitude-based weight pruning method to enforce the sparsity of the network.
arXiv Detail & Related papers (2020-10-03T16:22:15Z) - Particle-Gibbs Sampling For Bayesian Feature Allocation Models [77.57285768500225]
Most widely used MCMC strategies rely on an element-wise Gibbs update of the feature allocation matrix.
We have developed a Gibbs sampler that can update an entire row of the feature allocation matrix in a single move.
However, this sampler is impractical for models with a large number of features, as its computational complexity scales exponentially in the number of features.
We develop a Particle Gibbs sampler that targets the same distribution as the row wise Gibbs updates, but has computational complexity that only grows linearly in the number of features.
arXiv Detail & Related papers (2020-01-25T22:11:51Z)