Massively Parallel Reweighted Wake-Sleep
- URL: http://arxiv.org/abs/2305.11022v1
- Date: Thu, 18 May 2023 15:03:56 GMT
- Title: Massively Parallel Reweighted Wake-Sleep
- Authors: Thomas Heap, Gavin Leech, Laurence Aitchison
- Abstract summary: Reweighted wake-sleep (RWS) is a machine learning method for performing Bayesian inference in a very general class of models.
Recent work indicates that the number of samples required for effective importance weighting is exponential in the number of latent variables.
We show considerable improvements over standard "global" RWS, which draws $K$ samples from the full joint.
- Score: 29.436464740855598
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reweighted wake-sleep (RWS) is a machine learning method for performing
Bayesian inference in a very general class of models. RWS draws $K$ samples
from an underlying approximate posterior, then uses importance weighting to
provide a better estimate of the true posterior. RWS then updates its
approximate posterior towards the importance-weighted estimate of the true
posterior. However, recent work [Chatterjee and Diaconis, 2018] indicates that
the number of samples required for effective importance weighting is
exponential in the number of latent variables. Attaining such a large number of
importance samples is intractable in all but the smallest models. Here, we
develop massively parallel RWS, which circumvents this issue by drawing $K$
samples of all $n$ latent variables, and individually reasoning about all $K^n$
possible combinations of samples. While reasoning about $K^n$ combinations
might seem intractable, the required computations can be performed in
polynomial time by exploiting conditional independencies in the generative
model. We show considerable improvements over standard "global" RWS, which
draws $K$ samples from the full joint.
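To make the factorization concrete, here is a minimal NumPy sketch (illustrative only, not the paper's implementation) for an assumed chain-structured model $z_1 \rightarrow z_2 \rightarrow x$: with $K$ samples per latent, the average importance weight over all $K^2$ sample combinations is computed by a single tensor contraction rather than explicit enumeration. The generative model, proposal, and all names are assumptions made for this example.

```python
# Minimal sketch (not the paper's code): massively parallel importance
# weights for an assumed chain model z1 -> z2 -> x with K samples per
# latent. Averaging weights over all K^2 combinations costs O(K^2) via
# one contraction, because the chain structure factorizes.
import numpy as np

def npdf(u):
    """Standard normal density."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

K = 10
rng = np.random.default_rng(0)
x = 1.5  # observed datum (illustrative)

# Proposal q: K independent N(0,1) samples of each latent variable.
z1 = rng.normal(size=K)
z2 = rng.normal(size=K)

# Per-factor weight tensors for an assumed model p(z1) = N(0,1),
# p(z2|z1) = N(z1,1), p(x|z2) = N(z2,1).
f1 = np.ones(K)                        # p(z1)/q(z1) cancels here
f12 = npdf(z2[None, :] - z1[:, None])  # (K, K): p(z2_j | z1_i)
f2 = npdf(x - z2) / npdf(z2)           # p(x | z2_j) / q(z2_j)

# Average of all K^2 joint importance weights in a single einsum.
Z_hat = np.einsum('i,ij,j->', f1, f12, f2) / K**2
print("marginal likelihood estimate:", Z_hat)
```

For a chain of $n$ latents the same idea chains $K \times K$ contractions, so the cost stays polynomial in $K$ and linear in $n$ even though all $K^n$ combinations contribute to the estimate.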
Related papers
- Consistency Model is an Effective Posterior Sample Approximation for Diffusion Inverse Solvers [28.678613691787096]
Previous approximations rely on the posterior means, which may not lie in the support of the image distribution.
We introduce a novel approach for posterior approximation that guarantees to generate valid samples within the support of the image distribution.
arXiv Detail & Related papers (2024-02-09T02:23:47Z)
- Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions.
To reduce the number of samples in each round from $m$ to 1, we cast GDRO as a two-player game, where one player performs stochastic gradient descent (SGD) and the other executes an online algorithm for non-oblivious multi-armed bandits (a minimal sketch follows this entry).
In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of outlier distributions.
arXiv Detail & Related papers (2023-02-18T09:24:15Z)
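As a rough illustration of the two-player reduction above, here is a hedged Python sketch. Everything concrete (squared-error losses, Gaussian toy groups, an EXP3-style bandit, all names and step sizes) is an assumption for the example, not the paper's algorithm: each round the bandit player picks one of the $m$ distributions, a single sample is drawn, the SGD player updates the model on it, and an importance-weighted loss estimate tilts future picks toward harder distributions.

```python
# Hedged sketch (not the paper's algorithm): GDRO as a two-player game,
# one sample per round. One player runs SGD on the model; the other runs
# an EXP3-style bandit over the m distributions.
import numpy as np

rng = np.random.default_rng(1)
m, d, T = 3, 5, 1000
eta_w, eta_q = 0.05, 0.01
w = np.zeros(d)          # model parameters (SGD player)
logits = np.zeros(m)     # bandit weights over distributions

def sample_loss_grad(i, w):
    """Illustrative group i: squared error to a draw from N(i*1, I)."""
    x = np.full(d, float(i)) + rng.normal(size=d)
    return 0.5 * np.sum((w - x) ** 2), (w - x)

for t in range(T):
    q = np.exp(logits - logits.max())
    q /= q.sum()
    i = rng.choice(m, p=q)            # bandit picks a distribution
    loss, g = sample_loss_grad(i, w)
    w -= eta_w * g                    # SGD player descends on one sample
    logits[i] += eta_q * loss / q[i]  # bandit ascends on IW loss (EXP3)

print("bandit's final distribution over groups:", np.round(q, 3))
```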
- Bias Mimicking: A Simple Sampling Approach for Bias Mitigation [57.17709477668213]
We introduce a new class-conditioned sampling method: Bias Mimicking.
Bias Mimicking improves the accuracy of sampling methods on underrepresented groups by 3% across four benchmarks.
arXiv Detail & Related papers (2022-09-30T17:33:00Z)
- Adaptive Sketches for Robust Regression with Importance Sampling [64.75899469557272]
We introduce data structures for solving robust regression through stochastic gradient descent (SGD).
Our algorithm effectively runs $T$ steps of SGD with importance sampling while using sublinear space and making just a single pass over the data (the reweighting step is sketched after this entry).
arXiv Detail & Related papers (2022-07-16T03:09:30Z)
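The reweighting step behind "SGD with importance sampling" can be illustrated as follows; this is a minimal sketch of the standard unbiasedness trick, not the paper's sublinear-space sketching data structure, and the row-norm sampling probabilities are an assumed (common) choice.

```python
# Minimal sketch (not the paper's data structure): one pass of SGD with
# importance sampling for least squares. Sampling row i with probability
# p_i and scaling its gradient by 1/(n * p_i) keeps the stochastic
# gradient unbiased for the full objective.
import numpy as np

rng = np.random.default_rng(2)
n, d, eta = 1000, 10, 0.01
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)
w = np.zeros(d)

# Assumed importance distribution: proportional to squared row norms.
p = np.linalg.norm(A, axis=1) ** 2
p /= p.sum()

for _ in range(2000):
    i = rng.choice(n, p=p)
    g = (A[i] @ w - b[i]) * A[i]   # grad of 0.5 * (a_i . w - b_i)^2
    w -= eta * g / (n * p[i])      # reweight for unbiasedness
print("residual norm:", np.linalg.norm(A @ w - b))
```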
- A fast asynchronous MCMC sampler for sparse Bayesian inference [10.535140830570256]
We propose a very fast approximate Markov Chain Monte Carlo (MCMC) sampling framework that is applicable to a large class of sparse Bayesian inference problems.
We show that in high-dimensional linear regression problems, the Markov chain generated by the proposed algorithm admits an invariant distribution that correctly recovers the main signal.
arXiv Detail & Related papers (2021-08-14T02:20:49Z)
- Improving Robustness and Generality of NLP Models Using Disentangled Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z)
- WOR and $p$'s: Sketches for $\ell_p$-Sampling Without Replacement [75.12782480740822]
We design novel composable sketches for WOR $\ell_p$ sampling (a toy bottom-$k$ sketch follows this entry).
Our sketches have size that grows only linearly with the sample size.
Our method is the first to provide WOR sampling in the important regime of $p>1$ and the first to handle signed updates for $p>0$.
arXiv Detail & Related papers (2020-07-14T00:19:27Z)
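For intuition, here is a toy bottom-$k$ sketch built on the classic exponential-clock construction, which I am assuming as the general mechanism behind weighted WOR sampling; the paper's composable sketches additionally handle signed turnstile updates, which this toy does not.

```python
# Toy sketch (assumed exponential-clock / bottom-k mechanism, not the
# paper's construction): item i gets key -ln(U_i) / |v_i|**p, and the k
# smallest keys form a WOR l_p sample. Sketches compose: merging two
# sketches means keeping the k smallest keys of their union.
import heapq
import numpy as np

def lp_wor_sketch(values, k, p, seed=0):
    rng = np.random.default_rng(seed)
    keys = -np.log(rng.uniform(size=len(values))) / np.abs(values) ** p
    return heapq.nsmallest(k, zip(keys, range(len(values))))

v = np.array([5.0, 1.0, 1.0, 3.0, 0.5, 2.0])
print("WOR l_2 sample (key, index):", lp_wor_sketch(v, k=3, p=2))
```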
- FANOK: Knockoffs in Linear Time [73.5154025911318]
We describe a series of algorithms that efficiently implement Gaussian model-X knockoffs to control the false discovery rate on large-scale feature selection problems.
We test our methods on problems with $p$ as large as $500,000$.
arXiv Detail & Related papers (2020-06-15T21:55:34Z)
- Confidence sequences for sampling without replacement [39.98103047898974]
We present a suite of tools for designing confidence sequences (CSs) for $\theta^\star$.
A CS is a sequence of confidence sets $(C_n)_{n=1}^N$ that shrink in size and all contain $\theta^\star$ simultaneously with high probability.
We then present Hoeffding- and empirical-Bernstein-type time-uniform CSs and fixed-time confidence intervals for sampling WoR (a toy time-uniform boundary follows this entry).
arXiv Detail & Related papers (2020-06-08T04:30:25Z)
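For intuition about what "time-uniform" buys, here is a deliberately crude sketch: a Hoeffding interval at each time $n$ with a union bound over all $N$ times, so every interval covers the true mean simultaneously with probability at least $1-\alpha$. The paper's boundaries are tighter and tailored to sampling without replacement; this toy ignores the WoR correction entirely.

```python
# Crude time-uniform confidence sequence for the mean of [0,1]-bounded
# draws: Hoeffding width at each n, with an alpha/N error budget per
# time (union bound), so all N intervals hold simultaneously w.p. >= 1-alpha.
import numpy as np

rng = np.random.default_rng(3)
N, alpha = 500, 0.05
xs = rng.uniform(size=N)  # stand-in draws; true mean is 0.5

ns = np.arange(1, N + 1)
means = np.cumsum(xs) / ns
width = np.sqrt(np.log(2 * N / alpha) / (2 * ns))  # Hoeffding + union bound
lower = np.clip(means - width, 0.0, 1.0)
upper = np.clip(means + width, 0.0, 1.0)
print(f"C_{N}: [{lower[-1]:.3f}, {upper[-1]:.3f}] (true mean 0.5)")
```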
This list is automatically generated from the titles and abstracts of the papers on this site.