Entropy-based Training Methods for Scalable Neural Implicit Sampler
- URL: http://arxiv.org/abs/2306.04952v1
- Date: Thu, 8 Jun 2023 05:56:05 GMT
- Title: Entropy-based Training Methods for Scalable Neural Implicit Sampler
- Authors: Weijian Luo and Boya Zhang and Zhihua Zhang
- Abstract summary: Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning.
In this paper, we propose an efficient and scalable neural implicit sampler that overcomes these limitations.
Our sampler can generate large batches of samples with low computational costs by leveraging a neural transformation that directly maps easily sampled latent vectors to target samples.
- Score: 15.978655106034113
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficiently sampling from un-normalized target distributions is a fundamental
problem in scientific computing and machine learning. Traditional approaches
like Markov Chain Monte Carlo (MCMC) guarantee asymptotically unbiased samples
from such distributions but suffer from computational inefficiency,
particularly when dealing with high-dimensional targets, as they require
numerous iterations to generate a batch of samples. In this paper, we propose
an efficient and scalable neural implicit sampler that overcomes these
limitations. Our sampler can generate large batches of samples with low
computational costs by leveraging a neural transformation that directly maps
easily sampled latent vectors to target samples without the need for iterative
procedures. To train the neural implicit sampler, we introduce two novel
methods: the KL training method and the Fisher training method. The former
minimizes the Kullback-Leibler divergence, while the latter minimizes the
Fisher divergence. By employing these training methods, we effectively optimize
the neural implicit sampler to capture the desired target distribution. To
demonstrate the effectiveness, efficiency, and scalability of our proposed
samplers, we evaluate them on three sampling benchmarks with different scales.
These benchmarks include sampling from 2D targets, Bayesian inference, and
sampling from high-dimensional energy-based models (EBMs). Notably, in the
experiment involving high-dimensional EBMs, our sampler produces samples that
are comparable to those generated by MCMC-based methods while being more than
100 times more efficient, showcasing the efficiency of our neural sampler. We
believe that the theoretical and empirical contributions presented in this work
will stimulate further research on developing efficient samplers for various
applications beyond the ones explored in this study.
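For reference, the abstract names the two training objectives but does not state them; the standard definitions below are a reasonable reading (the notation is ours, not the paper's: $g_\theta$ is the neural transformation, $q_\theta$ the distribution of its outputs $x = g_\theta(z)$ for latent $z \sim p(z)$, and $p$ the unnormalized target).

```latex
% KL training: minimize the Kullback-Leibler divergence from the sampler to the target.
% Fisher training: minimize the Fisher divergence (expected squared score difference).
\begin{align}
  \mathcal{L}_{\mathrm{KL}}(\theta)
    &= D_{\mathrm{KL}}\!\left(q_\theta \,\|\, p\right)
     = \mathbb{E}_{x \sim q_\theta}\!\left[\log q_\theta(x) - \log p(x)\right],\\
  \mathcal{L}_{\mathrm{Fisher}}(\theta)
    &= D_{\mathrm{F}}\!\left(q_\theta \,\|\, p\right)
     = \mathbb{E}_{x \sim q_\theta}\!\left[\left\|\nabla_x \log q_\theta(x) - \nabla_x \log p(x)\right\|_2^2\right].
\end{align}
```

In both objectives the unknown normalizing constant of the target contributes nothing to the gradient with respect to $\theta$, so only the unnormalized density (or its score) is required; the remaining practical difficulty is that $\log q_\theta$ and its score are not available in closed form for an implicit sampler, which is presumably what the dedicated KL and Fisher training methods address.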
Related papers
- Denoising Fisher Training For Neural Implicit Samplers [3.744818211876898]
We introduce Denoising Fisher Training (DFT), a novel training approach for neural implicit samplers with theoretical guarantees.
DFT is empirically validated across diverse sampling benchmarks, including two-dimensional synthetic distributions, Bayesian logistic regression, and high-dimensional energy-based models (EBMs).
In experiments with high-dimensional EBMs, our best one-step DFT neural sampler achieves results on par with MCMC methods that use up to 200 sampling steps, corresponding to an efficiency gain of more than 100 times.
arXiv Detail & Related papers (2024-11-03T06:21:59Z)
- Adaptive teachers for amortized samplers [76.88721198565861]
Amortized inference is the task of training a parametric model, such as a neural network, to approximate a distribution with a given unnormalized density where exact sampling is intractable.
Off-policy RL training facilitates the discovery of diverse, high-reward candidates, but existing methods still face challenges in efficient exploration.
We propose an adaptive training distribution (the Teacher) to guide the training of the primary amortized sampler (the Student) by prioritizing high-loss regions.
arXiv Detail & Related papers (2024-10-02T11:33:13Z)
- Iterated Denoising Energy Matching for Sampling from Boltzmann Densities [109.23137009609519]
Iterated Denoising Energy Matching (iDEM) alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our matching objective.
We show that the proposed approach achieves state-of-the-art performance on all metrics and trains 2-5 times faster.
arXiv Detail & Related papers (2024-02-09T01:11:23Z)
- Self-Supervised Dataset Distillation for Transfer Learning [77.4714995131992]
We propose a novel problem of distilling an unlabeled dataset into a set of small synthetic samples for efficient self-supervised learning (SSL)
We first prove that the gradient of synthetic samples with respect to an SSL objective in naive bilevel optimization is biased due to randomness originating from data augmentations or masking.
We empirically validate the effectiveness of our method on various applications involving transfer learning.
arXiv Detail & Related papers (2023-10-10T10:48:52Z)
- POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples [19.311470287767385]
We propose to use out-of-distribution samples, i.e., unlabeled samples coming from outside the target classes, to improve few-shot learning.
Our approach is simple to implement, agnostic to feature extractors, lightweight without any additional cost for pre-training, and applicable to both inductive and transductive settings.
arXiv Detail & Related papers (2022-06-08T18:59:21Z)
- Sampling from Arbitrary Functions via PSD Models [55.41644538483948]
We take a two-step approach by first modeling the probability distribution and then sampling from that model.
We show that these models can approximate a large class of densities concisely using few evaluations, and present a simple algorithm to effectively sample from these models.
arXiv Detail & Related papers (2021-10-20T12:25:22Z)
- Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
- Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by reparameterizing the Markov chains into the latent space of the generator.
Empirically, extensive experiments on synthetic and real datasets demonstrate that our REP-GAN largely improves the sample efficiency and obtains better sample quality simultaneously.
arXiv Detail & Related papers (2021-07-01T10:34:55Z)
- Nested Variational Inference [8.610608901689577]
We develop a family of methods that learn proposals for nested importance samplers by minimizing a KL divergence at each level of nesting.
We observe that optimizing nested objectives leads to improved sample quality in terms of log average weight and effective sample size.
arXiv Detail & Related papers (2021-06-21T17:56:59Z)
- A Neural Network MCMC sampler that maximizes Proposal Entropy [3.4698840925433765]
Augmenting samplers with neural networks can potentially improve their efficiency.
Our network architecture utilizes the gradient of the target distribution for generating proposals.
The adaptive sampler achieves unbiased sampling with significantly higher proposal entropy than a Langevin dynamics sampler.
arXiv Detail & Related papers (2020-10-07T18:01:38Z)
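Several entries above, and the main paper, use gradient-based MCMC such as Langevin dynamics as the reference point for sample quality and cost. Purely for orientation, and not as an implementation of any method listed here, the following is a minimal NumPy sketch of the Metropolis-adjusted Langevin algorithm (MALA), a classical sampler that, like the gradient-based proposal networks above, uses the gradient of the unnormalized log-density to generate proposals; `log_p`, `grad_log_p`, and the toy Gaussian target are illustrative placeholders.

```python
import numpy as np

def mala_sample(log_p, grad_log_p, x0, n_steps=1000, step=0.05, rng=None):
    """Metropolis-adjusted Langevin algorithm (MALA) for an unnormalized log-density."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)

    def log_q(x_to, x_from):
        # Log-density of the Langevin proposal kernel q(x_to | x_from), up to a constant.
        diff = x_to - x_from - step * grad_log_p(x_from)
        return -np.sum(diff ** 2) / (4.0 * step)

    samples = []
    for _ in range(n_steps):
        # Langevin proposal: one gradient step on log p plus Gaussian noise.
        x_prop = x + step * grad_log_p(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
        # Metropolis-Hastings correction keeps the chain exactly invariant for the target.
        log_alpha = (log_p(x_prop) + log_q(x, x_prop)) - (log_p(x) + log_q(x_prop, x))
        if np.log(rng.uniform()) < log_alpha:
            x = x_prop
        samples.append(x.copy())
    return np.asarray(samples)

if __name__ == "__main__":
    # Toy target: standard 2D Gaussian, specified only through an unnormalized log-density.
    log_p = lambda x: -0.5 * np.sum(x ** 2)
    grad_log_p = lambda x: -x
    chain = mala_sample(log_p, grad_log_p, x0=np.zeros(2), n_steps=5000, step=0.1)
    print("mean:", chain.mean(axis=0), "std:", chain.std(axis=0))
```

Every sample drawn this way costs a gradient evaluation and a couple of density evaluations per step; this per-sample iteration cost is exactly what one-shot neural implicit samplers such as the one in this paper aim to eliminate.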
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.