Incremental Sampling Without Replacement for Sequence Models
- URL: http://arxiv.org/abs/2002.09067v2
- Date: Tue, 20 Jul 2021 00:09:38 GMT
- Title: Incremental Sampling Without Replacement for Sequence Models
- Authors: Kensen Shi, David Bieber, Charles Sutton
- Abstract summary: We present an elegant procedure for sampling without replacement from a broad class of randomized programs.
Our approach is incremental, i.e., samples can be drawn one at a time, allowing for increased flexibility.
- Score: 39.3035292844624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sampling is a fundamental technique, and sampling without replacement is
often desirable when duplicate samples are not beneficial. Within machine
learning, sampling is useful for generating diverse outputs from a trained
model. We present an elegant procedure for sampling without replacement from a
broad class of randomized programs, including generative neural models that
construct outputs sequentially. Our procedure is efficient even for
exponentially-large output spaces. Unlike prior work, our approach is
incremental, i.e., samples can be drawn one at a time, allowing for increased
flexibility. We also present a new estimator for computing expectations from
samples drawn without replacement. We show that incremental sampling without
replacement is applicable to many domains, e.g., program synthesis and
combinatorial optimization.
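To make the procedure concrete, below is a minimal sketch of one way to realize incremental sampling without replacement over a sequential model. It assumes a hypothetical `next_token_probs(prefix)` interface returning each token's conditional probability; the `IncrementalSampler` class and all names are illustrative, not the paper's API. A trie records how much probability mass under each prefix has already been returned, and each draw descends the trie choosing tokens in proportion to the mass that remains, so successive calls yield distinct sequences.

```python
import random

class TrieNode:
    """One output prefix; tracks the probability mass already drawn below it."""
    __slots__ = ("children", "sampled_mass")
    def __init__(self):
        self.children = {}       # token -> TrieNode
        self.sampled_mass = 0.0  # absolute probability of leaves already returned

class IncrementalSampler:
    """Sketch of incremental sampling without replacement (illustrative, not the paper's code)."""
    def __init__(self, next_token_probs, eos):
        self.next_token_probs = next_token_probs  # prefix -> {token: cond. prob.}
        self.eos = eos
        self.root = TrieNode()

    def sample(self, rng=random):
        node, prefix, prob = self.root, (), 1.0
        path = [self.root]
        while True:
            probs = self.next_token_probs(prefix)
            # Remaining (not-yet-sampled) absolute mass under each child.
            weights = {}
            for tok, p in probs.items():
                child = node.children.get(tok)
                used = child.sampled_mass if child is not None else 0.0
                weights[tok] = max(prob * p - used, 0.0)
            total = sum(weights.values())
            if total <= 0.0:
                return None  # every sequence under this prefix already drawn
            r = rng.random() * total
            for tok, w in weights.items():
                r -= w
                if r <= 0.0:
                    break
            prob *= probs[tok]
            prefix += (tok,)
            node = node.children.setdefault(tok, TrieNode())
            path.append(node)
            if tok == self.eos:
                for n in path:  # mark this sequence's mass as consumed
                    n.sampled_mass += prob
                return prefix, prob

# Example: a toy model. Repeated calls yield distinct sequences, then None.
# sampler = IncrementalSampler(
#     lambda prefix: {"a": 0.7, "<eos>": 0.3} if len(prefix) < 2 else {"<eos>": 1.0},
#     eos="<eos>")
# print(sampler.sample(), sampler.sample(), sampler.sample())
```

Because only the visited prefixes are materialized, the trie stays small relative to the exponentially large output space, which is what makes one-at-a-time draws cheap.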
Related papers
- Adaptive Selection of Sampling-Reconstruction in Fourier Compressed Sensing [13.775902519100075]
Compressed sensing (CS) emerged to overcome the inefficiency of Nyquist sampling.
Deep learning-based reconstruction has become a promising alternative to optimization-based reconstruction.
arXiv Detail & Related papers (2024-09-18T06:51:29Z)
- Priority Sampling of Large Language Models for Compilers [4.2266182821287135]
Priority Sampling is a simple and deterministic sampling technique that produces unique samples ordered by the model's confidence.
It supports generation based on regular expressions, providing a controllable and structured exploration process.
It outperforms the autotuner used to generate the labels for training the original model within just 30 samples.
arXiv Detail & Related papers (2024-02-28T22:27:49Z)
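As a rough illustration of deterministic, confidence-ordered unique sampling (a sketch only; the `next_token_logprobs` interface is an assumption, and the actual method additionally supports regular-expression-constrained generation), complete sequences can be enumerated best-first so that the k-th sample is the k-th most probable sequence:

```python
import heapq
import math

def confidence_ordered_samples(next_token_logprobs, eos, k, max_len=64):
    """Enumerate complete sequences in decreasing order of model probability.

    `next_token_logprobs(prefix)` -> {token: log probability}; hypothetical.
    Log-probabilities are <= 0, so cumulative cost never decreases along a
    path, and best-first search pops sequences in exact probability order.
    """
    heap = [(0.0, ())]  # (negative cumulative log-prob, prefix)
    results = []
    while heap and len(results) < k:
        neg_lp, prefix = heapq.heappop(heap)
        # Sequences hitting max_len are treated as complete for this sketch.
        if prefix and (prefix[-1] == eos or len(prefix) >= max_len):
            results.append((math.exp(-neg_lp), prefix))
            continue
        for tok, lp in next_token_logprobs(prefix).items():
            heapq.heappush(heap, (neg_lp - lp, prefix + (tok,)))
    return results
```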
- Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning [57.163525407022966]
Imbalanced learning is a fundamental challenge in data mining, arising when training samples are distributed disproportionately across classes.
Over-sampling is an effective technique to tackle imbalanced learning through generating synthetic samples for the minority class.
We propose AutoSMOTE, an automated over-sampling algorithm that can jointly optimize different levels of decisions.
arXiv Detail & Related papers (2022-08-26T04:28:01Z)
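For intuition about the over-sampling step, a SMOTE-style interpolation between a minority sample and one of its nearest minority neighbors looks roughly like the sketch below; this is illustrative only and is not AutoSMOTE, which learns such decisions with hierarchical reinforcement learning:

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=5, rng=None):
    """Generate synthetic minority samples by interpolating toward neighbors.

    X_min: (n, d) array of minority-class samples (hypothetical helper).
    """
    rng = np.random.default_rng() if rng is None else rng
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]            # k nearest minority neighbors
        j = rng.choice(nbrs)
        lam = rng.random()                       # random point on the segment
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.stack(out)
```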
- Adaptive Sketches for Robust Regression with Importance Sampling [64.75899469557272]
We introduce data structures for solving robust regression through stochastic gradient descent (SGD).
Our algorithm effectively runs $T$ steps of SGD with importance sampling while using sublinear space and making just a single pass over the data.
arXiv Detail & Related papers (2022-07-16T03:09:30Z)
- Diverse Human Motion Prediction via Gumbel-Softmax Sampling from an Auxiliary Space [34.83587750498361]
Diverse human motion prediction aims at predicting multiple possible future pose sequences from a sequence of observed poses.
Previous approaches usually employ deep generative networks to model the conditional distribution of data, and then randomly sample outcomes from the distribution.
We propose a novel sampling strategy for sampling very diverse results from an imbalanced multimodal distribution.
arXiv Detail & Related papers (2022-07-15T09:03:57Z)
- POODLE: Improving Few-shot Learning via Penalizing Out-of-Distribution Samples [19.311470287767385]
We propose to use out-of-distribution samples, i.e., unlabeled samples coming from outside the target classes, to improve few-shot learning.
Our approach is simple to implement, agnostic to feature extractors, lightweight without any additional cost for pre-training, and applicable to both inductive and transductive settings.
arXiv Detail & Related papers (2022-06-08T18:59:21Z)
- Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by reparameterizing the Markov chains into the latent space of the generator.
Empirically, extensive experiments on synthetic and real datasets demonstrate that REP-GAN substantially improves sample efficiency while simultaneously obtaining better sample quality.
arXiv Detail & Related papers (2021-07-01T10:34:55Z)
- A Constant-time Adaptive Negative Sampling [33.585006286223994]
We show a class of distributions where the sampling scheme is truly adaptive and provably generates negative samples in constant time.
Our C++ implementation on a commodity CPU is significantly faster in terms of wall-clock time.
arXiv Detail & Related papers (2020-12-31T18:56:41Z)
- Non-Adaptive Adaptive Sampling on Turnstile Streams [57.619901304728366]
We give the first relative-error algorithms for column subset selection, subspace approximation, projective clustering, and volume maximization on turnstile streams that use space sublinear in $n$.
Our adaptive sampling procedure has a number of applications to data summarization problems, where it either improves the state of the art or addresses problems previously studied only in the more relaxed row-arrival model.
arXiv Detail & Related papers (2020-04-23T05:00:21Z)
- Efficiently Sampling Functions from Gaussian Process Posteriors [76.94808614373609]
We propose an easy-to-use and general-purpose approach for fast posterior sampling.
We demonstrate how decoupled sample paths accurately represent Gaussian process posteriors at a fraction of the usual cost.
arXiv Detail & Related papers (2020-02-21T14:03:16Z)
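For the last entry, the decoupled sample-path idea is often written as Matheron's rule: a posterior sample path equals a prior path plus a data-driven correction. The sketch below assumes an RBF kernel and a random-Fourier-feature approximation of the prior; all names are illustrative and not the paper's code:

```python
import numpy as np

def rbf(a, b, ls=1.0):
    """RBF kernel matrix k(a, b) for inputs shaped (n, 1) and (m, 1)."""
    return np.exp(-0.5 * (a - b.T) ** 2 / ls**2)

def sample_posterior_path(X, y, noise=0.1, ls=1.0, n_feat=500, rng=None):
    """One function sample from a GP posterior via a decoupled (pathwise) update.

    The prior path f(.) is approximated with random Fourier features; the
    posterior path is f(.) + k(., X) K^{-1} (y - f(X) - eps)  (Matheron's rule).
    X: (n, 1) training inputs, y: (n,) training targets.
    """
    rng = np.random.default_rng() if rng is None else rng
    w = rng.normal(0.0, 1.0 / ls, size=(n_feat, 1))   # spectral frequencies
    b = rng.uniform(0.0, 2 * np.pi, size=n_feat)
    theta = rng.normal(size=n_feat)                   # prior feature weights

    def prior(x):  # x: (m, 1) -> (m,) evaluation of the sampled prior path
        phi = np.sqrt(2.0 / n_feat) * np.cos(x @ w.T + b)
        return phi @ theta

    eps = rng.normal(0.0, noise, size=len(X))         # simulated observation noise
    K = rbf(X, X, ls) + noise**2 * np.eye(len(X))
    v = np.linalg.solve(K, y - prior(X) - eps)        # data-driven correction
    return lambda x_new: prior(x_new) + rbf(x_new, X, ls) @ v
```

The returned closure can be evaluated at arbitrarily many test points at O(n) cost per point, which is the "fraction of the usual cost" the summary refers to.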
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.