A Neural Network MCMC sampler that maximizes Proposal Entropy
- URL: http://arxiv.org/abs/2010.03587v1
- Date: Wed, 7 Oct 2020 18:01:38 GMT
- Title: A Neural Network MCMC sampler that maximizes Proposal Entropy
- Authors: Zengyi Li, Yubei Chen, Friedrich T. Sommer
- Abstract summary: Augmenting samplers with neural networks can potentially improve their efficiency.
Our network architecture utilizes the gradient of the target distribution for generating proposals.
The adaptive sampler achieves unbiased sampling with significantly higher proposal entropy than a Langevin dynamics sampler.
- Score: 3.4698840925433765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Markov Chain Monte Carlo (MCMC) methods sample from unnormalized probability
distributions and offer guarantees of exact sampling. However, in the
continuous case, unfavorable geometry of the target distribution can greatly
limit the efficiency of MCMC methods. Augmenting samplers with neural networks
can potentially improve their efficiency. Previous neural network based
samplers were trained with objectives that either did not explicitly encourage
exploration, or used an L2 jump objective, which could only be applied to well-structured
distributions. Thus it seems promising to instead maximize the
proposal entropy for adapting the proposal to distributions of any shape. To
allow direct optimization of the proposal entropy, we propose a neural network
MCMC sampler that has a flexible and tractable proposal distribution.
Specifically, our network architecture utilizes the gradient of the target
distribution for generating proposals. Our model achieves significantly higher
efficiency than previous neural network MCMC techniques in a variety of
sampling tasks. Further, the sampler is applied to the training of a convergent
energy-based model of natural images. The adaptive sampler achieves unbiased
sampling with significantly higher proposal entropy than a Langevin dynamics
sampler.
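As an illustration of the gradient-informed proposal described above, the following sketch is a simplified stand-in, not the authors' implementation: it assumes a toy 2-D Gaussian target and replaces the paper's neural-network proposal with a MALA-style Gaussian proposal whose single step-size parameter is crudely adapted. Widening the proposal raises its entropy, while the Metropolis-Hastings correction keeps the samples unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 2

def log_prob(x):
    # Unnormalized log-density of the assumed toy target (standard 2-D Gaussian).
    return -0.5 * np.sum(x ** 2)

def grad_log_prob(x):
    # Gradient of the target's log-density, used to build the proposal mean.
    return -x

def proposal_mean(x, eps):
    # Langevin-style drift toward regions of higher target density.
    return x + 0.5 * eps ** 2 * grad_log_prob(x)

def log_q(x_to, x_from, eps):
    # Log-density of the Gaussian proposal N(proposal_mean(x_from), eps^2 I),
    # up to an additive constant that cancels in the Metropolis-Hastings ratio.
    d = x_to - proposal_mean(x_from, eps)
    return -0.5 * np.sum(d ** 2) / eps ** 2 - dim * np.log(eps)

def proposal_entropy(eps):
    # Entropy of the isotropic Gaussian proposal; it grows with the step size.
    return 0.5 * dim * (1.0 + np.log(2.0 * np.pi)) + dim * np.log(eps)

def mh_step(x, eps):
    # Propose from the gradient-informed Gaussian and accept or reject.
    x_new = proposal_mean(x, eps) + eps * rng.standard_normal(dim)
    log_alpha = (log_prob(x_new) - log_prob(x)
                 + log_q(x, x_new, eps) - log_q(x_new, x, eps))
    accept = np.log(rng.random()) < log_alpha
    return (x_new if accept else x), accept

# Crude adaptation heuristic: widen the proposal (higher entropy) while the
# acceptance rate holds up, shrink it otherwise.
x, eps = np.zeros(dim), 0.1
for _ in range(2000):
    x, accepted = mh_step(x, eps)
    eps *= 1.01 if accepted else 0.99

print(f"adapted step size: {eps:.3f}, proposal entropy: {proposal_entropy(eps):.3f}")
```

The paper's sampler instead parameterizes the proposal with a neural network and optimizes the proposal entropy directly; the step-size heuristic above only mimics that entropy/acceptance trade-off.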
Related papers
- Iterated Denoising Energy Matching for Sampling from Boltzmann Densities [109.23137009609519]
Iterated Denoising Energy Matching (iDEM) alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our matching objective.
We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2-5\times$ faster.
arXiv Detail & Related papers (2024-02-09T01:11:23Z)
- A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation [78.81021361497311]
We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
arXiv Detail & Related papers (2023-12-07T18:30:15Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- Entropy-based Training Methods for Scalable Neural Implicit Sampler [15.978655106034113]
Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning.
In this paper, we propose an efficient and scalable neural implicit sampler that overcomes the limitations of existing approaches.
Our sampler can generate large batches of samples with low computational costs by leveraging a neural transformation that directly maps easily sampled latent vectors to target samples.
arXiv Detail & Related papers (2023-06-08T05:56:05Z)
- Estimating Regression Predictive Distributions with Sample Networks [17.935136717050543]
A common approach to model uncertainty is to choose a parametric distribution and fit the data to it using maximum likelihood estimation.
The chosen parametric form can be a poor fit to the data-generating distribution, resulting in unreliable uncertainty estimates.
We propose SampleNet, a flexible and scalable architecture for modeling uncertainty that avoids specifying a parametric form on the output distribution.
arXiv Detail & Related papers (2022-11-24T17:23:29Z)
- Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs [3.491202838583993]
Energy-Based Models (EBMs) allow for extremely flexible specifications of probability distributions.
However, they do not provide a mechanism for obtaining exact samples from these distributions.
We propose a new approximate sampling technique, Quasi Rejection Sampling (QRS), that allows for a trade-off between sampling efficiency and sampling quality.
arXiv Detail & Related papers (2021-12-10T17:51:37Z)
- LSB: Local Self-Balancing MCMC in Discrete Spaces [2.385916960125935]
This work considers using machine learning to adapt the proposal distribution to the target, in order to improve the sampling efficiency in the purely discrete domain.
We call the resulting sampler the Locally Self-Balancing Sampler (LSB).
arXiv Detail & Related papers (2021-09-08T18:31:26Z)
- Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by reparameterizing the Markov chains into the latent space of the generator.
Empirically, extensive experiments on synthetic and real datasets demonstrate that our REP-GAN largely improves the sample efficiency and obtains better sample quality simultaneously.
arXiv Detail & Related papers (2021-07-01T10:34:55Z)
- Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis--Hastings [57.133639209759615]
We interpret masked language models as energy-based sequence models and propose two energy parametrizations derivable from the trained models.
We develop a tractable sampling scheme based on the Metropolis-Hastings Monte Carlo algorithm.
We validate the effectiveness of the proposed parametrizations by exploring the quality of samples drawn from these energy-based models.
arXiv Detail & Related papers (2021-06-04T22:04:30Z)
- Achieving Efficiency in Black Box Simulation of Distribution Tails with Self-structuring Importance Samplers [1.6114012813668934]
The paper presents a novel Importance Sampling (IS) scheme for estimating distribution tails of performance measures modeled with a rich set of tools such as linear programs, integer linear programs, piecewise linear/quadratic objectives, feature maps specified with deep neural networks, etc.
arXiv Detail & Related papers (2021-02-14T03:37:22Z)
- Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this gradient-based approach outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.