Entropy-based Training Methods for Scalable Neural Implicit Sampler
- URL: http://arxiv.org/abs/2306.04952v1
- Date: Thu, 8 Jun 2023 05:56:05 GMT
- Title: Entropy-based Training Methods for Scalable Neural Implicit Sampler
- Authors: Weijian Luo and Boya Zhang and Zhihua Zhang
- Abstract summary: Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning.
In this paper, we propose an efficient and scalable neural implicit sampler that overcomes these limitations.
Our sampler can generate large batches of samples with low computational costs by leveraging a neural transformation that directly maps easily sampled latent vectors to target samples.
- Score: 15.978655106034113
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficiently sampling from un-normalized target distributions is a fundamental
problem in scientific computing and machine learning. Traditional approaches
like Markov Chain Monte Carlo (MCMC) guarantee asymptotically unbiased samples
from such distributions but suffer from computational inefficiency,
particularly when dealing with high-dimensional targets, as they require
numerous iterations to generate a batch of samples. In this paper, we propose
an efficient and scalable neural implicit sampler that overcomes these
limitations. Our sampler can generate large batches of samples with low
computational costs by leveraging a neural transformation that directly maps
easily sampled latent vectors to target samples without the need for iterative
procedures. To train the neural implicit sampler, we introduce two novel
methods: the KL training method and the Fisher training method. The former
minimizes the Kullback-Leibler divergence, while the latter minimizes the
Fisher divergence. By employing these training methods, we effectively optimize
the neural implicit sampler to capture the desired target distribution. To
demonstrate the effectiveness, efficiency, and scalability of our proposed
samplers, we evaluate them on three sampling benchmarks with different scales.
These benchmarks include sampling from 2D targets, Bayesian inference, and
sampling from high-dimensional energy-based models (EBMs). Notably, in the
experiment involving high-dimensional EBMs, our sampler produces samples that
are comparable to those generated by MCMC-based methods while being more than
100 times more efficient, showcasing the efficiency of our neural sampler. We
believe that the theoretical and empirical contributions presented in this work
will stimulate further research on developing efficient samplers for various
applications beyond the ones explored in this study.
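For reference, the abstract names the two training objectives but does not state them; the standard definitions below are a reasonable reading (the notation is ours, not the paper's: $g_\theta$ is the neural transformation, $q_\theta$ the distribution of its outputs $x = g_\theta(z)$ for latent $z \sim p(z)$, and $p$ the unnormalized target).

```latex
% KL training: minimize the Kullback-Leibler divergence from the sampler to the target.
% Fisher training: minimize the Fisher divergence (expected squared score difference).
\begin{align}
  \mathcal{L}_{\mathrm{KL}}(\theta)
    &= D_{\mathrm{KL}}\!\left(q_\theta \,\|\, p\right)
     = \mathbb{E}_{x \sim q_\theta}\!\left[\log q_\theta(x) - \log p(x)\right],\\
  \mathcal{L}_{\mathrm{Fisher}}(\theta)
    &= D_{\mathrm{F}}\!\left(q_\theta \,\|\, p\right)
     = \mathbb{E}_{x \sim q_\theta}\!\left[\left\|\nabla_x \log q_\theta(x) - \nabla_x \log p(x)\right\|_2^2\right].
\end{align}
```

In both objectives the unknown normalizing constant of the target contributes nothing to the gradient with respect to $\theta$, so only the unnormalized density (or its score) is required; the remaining practical difficulty is that $\log q_\theta$ and its score are not available in closed form for an implicit sampler, which is presumably what the dedicated KL and Fisher training methods address.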
Related papers
- Denoising Fisher Training For Neural Implicit Samplers [3.744818211876898]
We introduce Denoising Fisher Training (DFT), a novel training approach for neural implicit samplers with theoretical guarantees.
DFT is empirically validated across diverse sampling benchmarks, including two-dimensional synthetic distributions, Bayesian logistic regression, and high-dimensional energy-based models (EBMs).
In experiments with high-dimensional EBMs, our best one-step DFT neural sampler achieves results on par with MCMC methods that use up to 200 sampling steps, corresponding to an efficiency gain of more than 100 times.
arXiv Detail & Related papers (2024-11-03T06:21:59Z)
- Adaptive teachers for amortized samplers [76.88721198565861]
Amortized inference is the task of training a parametric model, such as a neural network, to approximate a distribution with a given unnormalized density where exact sampling is intractable.
Off-policy RL training facilitates the discovery of diverse, high-reward candidates, but existing methods still face challenges in efficient exploration.
We propose an adaptive training distribution (the Teacher) to guide the training of the primary amortized sampler (the Student) by prioritizing high-loss regions.
arXiv Detail & Related papers (2024-10-02T11:33:13Z)
- Iterated Denoising Energy Matching for Sampling from Boltzmann Densities [109.23137009609519]
Iterated Denoising Energy Matching (iDEM) alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our matching objective.
We show that the proposed approach achieves state-of-the-art performance on all metrics and trains 2-5 times faster.
arXiv Detail & Related papers (2024-02-09T01:11:23Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL)
We first prove that the gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples [19.311470287767385]
We propose to use out-of-distribution samples, i.e., unlabeled samples coming from outside the target classes, to improve few-shot learning.
Our approach is simple to implement, agnostic to feature extractors, lightweight without any additional cost for pre-training, and applicable to both inductive and transductive settings.
arXiv Detail & Related papers (2022-06-08T18:59:21Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
- Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by reparameterizing the Markov chains into the latent space of the generator.
Empirically, extensive experiments on synthetic and real datasets demonstrate that our REP-GAN largely improves the sample efficiency and obtains better sample quality simultaneously.
arXiv Detail & Related papers (2021-07-01T10:34:55Z)
- Nested Variational Inference [8.610608901689577]
We develop a family of methods that learn proposals for nested importance samplers by minimizing a KL divergence at each level of nesting.
We observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size.
arXiv Detail & Related papers (2021-06-21T17:56:59Z)
- A Neural Network MCMC sampler that maximizes Proposal Entropy [3.4698840925433765]
Augmenting samplers with neural networks can potentially improve their efficiency.
Our network architecture utilizes the gradient of the target distribution for generating proposals.
The adaptive sampler achieves unbiased sampling with significantly higher proposal entropy than a Langevin dynamics sampler.
arXiv Detail & Related papers (2020-10-07T18:01:38Z)
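Several entries above, and the main paper, use gradient-based MCMC such as Langevin dynamics as the reference point for sample quality and cost. Purely for orientation, and not as an implementation of any method listed here, the following is a minimal NumPy sketch of the Metropolis-adjusted Langevin algorithm (MALA), a classical sampler that, like the gradient-based proposal networks above, uses the gradient of the unnormalized log-density to generate proposals; `log_p`, `grad_log_p`, and the toy Gaussian target are illustrative placeholders.

```python
import numpy as np

def mala_sample(log_p, grad_log_p, x0, n_steps=1000, step=0.05, rng=None):
    """Metropolis-adjusted Langevin algorithm (MALA) for an unnormalized log-density."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)

    def log_q(x_to, x_from):
        # Log-density of the Langevin proposal kernel q(x_to | x_from), up to a constant.
        diff = x_to - x_from - step * grad_log_p(x_from)
        return -np.sum(diff ** 2) / (4.0 * step)

    samples = []
    for _ in range(n_steps):
        # Langevin proposal: one gradient step on log p plus Gaussian noise.
        x_prop = x + step * grad_log_p(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        # Metropolis-Hastings correction keeps the chain exactly invariant for the target.
        log_alpha = (log_p(x_prop) + log_q(x, x_prop)) - (log_p(x) + log_q(x_prop, x))
        if np.log(rng.uniform()) < log_alpha:
            x = x_prop
        samples.append(x.copy())
    return np.asarray(samples)

if __name__ == "__main__":
    # Toy target: standard 2D Gaussian, specified only through an unnormalized log-density.
    log_p = lambda x: -0.5 * np.sum(x ** 2)
    grad_log_p = lambda x: -x
    chain = mala_sample(log_p, grad_log_p, x0=np.zeros(2), n_steps=5000, step=0.1)
    print("mean:", chain.mean(axis=0), "std:", chain.std(axis=0))
```

Every sample drawn this way costs a gradient evaluation and a couple of density evaluations per step; this per-sample iteration cost is exactly what one-shot neural implicit samplers such as the one in this paper aim to eliminate.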
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.