A Neural Network MCMC sampler that maximizes Proposal Entropy
- URL: http://arxiv.org/abs/2010.03587v1
- Date: Wed, 7 Oct 2020 18:01:38 GMT
- Title: A Neural Network MCMC sampler that maximizes Proposal Entropy
- Authors: Zengyi Li, Yubei Chen, Friedrich T. Sommer
- Abstract summary: Augmenting samplers with neural networks can potentially improve their efficiency.
Our network architecture utilizes the gradient of the target distribution for generating proposals.
The adaptive sampler achieves unbiased sampling with significantly higher proposal entropy than a Langevin dynamics sampler.
- Score: 3.4698840925433765
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Markov Chain Monte Carlo (MCMC) methods sample from unnormalized probability
distributions and offer guarantees of exact sampling. However, in the
continuous case, unfavorable geometry of the target distribution can greatly
limit the efficiency of MCMC methods. Augmenting samplers with neural networks
can potentially improve their efficiency. Previous neural network based
samplers were trained with objectives that either did not explicitly encourage
exploration, or used an L2 jump objective, which could only be applied to well-structured
distributions. Thus it seems promising to instead maximize the
proposal entropy for adapting the proposal to distributions of any shape. To
allow direct optimization of the proposal entropy, we propose a neural network
MCMC sampler that has a flexible and tractable proposal distribution.
Specifically, our network architecture utilizes the gradient of the target
distribution for generating proposals. Our model achieves significantly higher
efficiency than previous neural network MCMC techniques in a variety of
sampling tasks. Further, the sampler is applied to the training of a convergent
energy-based model of natural images. The adaptive sampler achieves unbiased
sampling with significantly higher proposal entropy than a Langevin dynamics
sampler.
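As an illustration of the gradient-informed proposal described above, the following sketch is a simplified stand-in, not the authors' implementation: it assumes a toy 2-D Gaussian target and replaces the paper's neural-network proposal with a MALA-style Gaussian proposal whose single step-size parameter is crudely adapted. Widening the proposal raises its entropy, while the Metropolis-Hastings correction keeps the samples unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 2

def log_prob(x):
    # Unnormalized log-density of the assumed toy target (standard 2-D Gaussian).
    return -0.5 * np.sum(x ** 2)

def grad_log_prob(x):
    # Gradient of the target's log-density, used to build the proposal mean.
    return -x

def proposal_mean(x, eps):
    # Langevin-style drift toward regions of higher target density.
    return x + 0.5 * eps ** 2 * grad_log_prob(x)

def log_q(x_to, x_from, eps):
    # Log-density of the Gaussian proposal N(proposal_mean(x_from), eps^2 I),
    # up to an additive constant that cancels in the Metropolis-Hastings ratio.
    d = x_to - proposal_mean(x_from, eps)
    return -0.5 * np.sum(d ** 2) / eps ** 2 - dim * np.log(eps)

def proposal_entropy(eps):
    # Entropy of the isotropic Gaussian proposal; it grows with the step size.
    return 0.5 * dim * (1.0 + np.log(2.0 * np.pi)) + dim * np.log(eps)

def mh_step(x, eps):
    # Propose from the gradient-informed Gaussian and accept or reject.
    x_new = proposal_mean(x, eps) + eps * rng.standard_normal(dim)
    log_alpha = (log_prob(x_new) - log_prob(x)
                 + log_q(x, x_new, eps) - log_q(x_new, x, eps))
    accept = np.log(rng.random()) < log_alpha
    return (x_new if accept else x), accept

# Crude adaptation heuristic: widen the proposal (higher entropy) while the
# acceptance rate holds up, shrink it otherwise.
x, eps = np.zeros(dim), 0.1
for _ in range(2000):
    x, accepted = mh_step(x, eps)
    eps *= 1.01 if accepted else 0.99

print(f"adapted step size: {eps:.3f}, proposal entropy: {proposal_entropy(eps):.3f}")
```

The paper's sampler instead parameterizes the proposal with a neural network and optimizes the proposal entropy directly; the step-size heuristic above only mimics that entropy/acceptance trade-off.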
Related papers
- Iterated Denoising Energy Matching for Sampling from Boltzmann Densities [109.23137009609519]
Iterated Denoising Energy Matching (iDEM) alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our matching objective.
We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2-5\times$ faster.
arXiv Detail & Related papers (2024-02-09T01:11:23Z)
- A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation [78.81021361497311]
We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
arXiv Detail & Related papers (2023-12-07T18:30:15Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL).
We first prove that a gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- Entropy-based Training Methods for Scalable Neural Implicit Sampler [15.978655106034113]
Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning.
In this paper, we propose an efficient and scalable neural implicit sampler that overcomes the limitations of existing approaches.
Our sampler can generate large batches of samples with low computational costs by leveraging a neural transformation that directly maps easily sampled latent vectors to target samples.
arXiv Detail & Related papers (2023-06-08T05:56:05Z)
- Estimating Regression Predictive Distributions with Sample Networks [17.935136717050543]
A common approach to model uncertainty is to choose a parametric distribution and fit the data to it using maximum likelihood estimation.
The chosen parametric form can be a poor fit to the data-generating distribution, resulting in unreliable uncertainty estimates.
We propose SampleNet, a flexible and scalable architecture for modeling uncertainty that avoids specifying a parametric form on the output distribution.
arXiv Detail & Related papers (2022-11-24T17:23:29Z)
- Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs [3.491202838583993]
Energy-Based Models (EBMs) allow for extremely flexible specifications of probability distributions.
However, they do not provide a mechanism for obtaining exact samples from these distributions.
We propose a new approximate sampling technique, Quasi Rejection Sampling (QRS), that allows for a trade-off between sampling efficiency and sampling quality.
arXiv Detail & Related papers (2021-12-10T17:51:37Z)
- LSB: Local Self-Balancing MCMC in Discrete Spaces [2.385916960125935]
This work considers using machine learning to adapt the proposal distribution to the target, in order to improve the sampling efficiency in the purely discrete domain.
We call the resulting sampler the Locally Self-Balancing Sampler (LSB).
arXiv Detail & Related papers (2021-09-08T18:31:26Z)
- Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by reparameterizing the Markov chains into the latent space of the generator.
Empirically, extensive experiments on synthetic and real datasets demonstrate that our REP-GAN largely improves the sample efficiency and obtains better sample quality simultaneously.
arXiv Detail & Related papers (2021-07-01T10:34:55Z)
- Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis--Hastings [57.133639209759615]
We interpret masked language models as energy-based sequence models and propose two energy parametrizations derivable from the trained models.
We develop a tractable sampling scheme based on the Metropolis-Hastings Monte Carlo algorithm.
We validate the effectiveness of the proposed parametrizations by exploring the quality of samples drawn from these energy-based models.
arXiv Detail & Related papers (2021-06-04T22:04:30Z)
- Achieving Efficiency in Black Box Simulation of Distribution Tails with Self-structuring Importance Samplers [1.6114012813668934]
The paper presents a novel Importance Sampling (IS) scheme for estimating distribution tails of performance measures modeled with a rich set of tools such as linear programs, integer linear programs, piecewise linear/quadratic objectives, feature maps specified with deep neural networks, etc.
arXiv Detail & Related papers (2021-02-14T03:37:22Z)
- Oops I Took A Gradient: Scalable Sampling for Discrete Distributions [53.3142984019796]
We show that this gradient-based approach outperforms generic samplers in a number of difficult settings.
We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data.
arXiv Detail & Related papers (2021-02-08T20:08:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.