Priority Sampling of Large Language Models for Compilers
- URL: http://arxiv.org/abs/2402.18734v1
- Date: Wed, 28 Feb 2024 22:27:49 GMT
- Title: Priority Sampling of Large Language Models for Compilers
- Authors: Dejan Grubisic, Chris Cummins, Volker Seeker, Hugh Leather
- Abstract summary: Priority Sampling is a simple and deterministic sampling technique that produces unique samples ordered by the model's confidence.
It supports generation based on regular expression that provides a controllable and structured exploration process.
It outperforms the autotuner used for the generation of labels for the training of the original model in just 30 samples.
- Score: 4.2266182821287135
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models show great potential in generating and optimizing code.
Widely used sampling methods such as Nucleus Sampling increase the diversity of
generation but often produce repeated samples for low temperatures and
incoherent samples for high temperatures. Furthermore, the temperature
coefficient has to be tuned for each task, limiting its usability. We present
Priority Sampling, a simple and deterministic sampling technique that produces
unique samples ordered by the model's confidence. Each new sample expands the
unexpanded token with the highest probability in the augmented search tree.
Additionally, Priority Sampling supports generation based on regular expression
that provides a controllable and structured exploration process. Priority
Sampling outperforms Nucleus Sampling for any number of samples, boosting the
performance of the original model from 2.87% to 5% improvement over -Oz.
Moreover, it outperforms the autotuner used for the generation of labels for
the training of the original model in just 30 samples.
Related papers
- Foundation Model Makes Clustering A Better Initialization For Cold-Start Active Learning [5.609241010973952]
We propose to integrate foundation models with clustering methods to select samples for cold-start active learning.
Foundation models refer to those trained on massive datasets by the self-supervised paradigm.
For a comprehensive comparison, we included a classic ImageNet-supervised model to acquire embeddings.
arXiv Detail & Related papers (2024-02-04T16:27:37Z) - A Block Metropolis-Hastings Sampler for Controllable Energy-based Text
Generation [78.81021361497311]
We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
arXiv Detail & Related papers (2023-12-07T18:30:15Z) - Entropy-based Training Methods for Scalable Neural Implicit Sampler [15.978655106034113]
Efficiently sampling from un-normalized target distributions is a fundamental problem in scientific computing and machine learning.
In this paper, we propose an efficient and scalable neural implicit sampler that overcomes these limitations.
Our sampler can generate large batches of samples with low computational costs by leveraging a neural transformation that directly maps easily sampled latent vectors to target samples.
arXiv Detail & Related papers (2023-06-08T05:56:05Z) - Generating High Fidelity Synthetic Data via Coreset selection and
Entropic Regularization [15.866662428675054]
We propose using a combination of coresets selection methods and entropic regularization'' to select the highest fidelity samples.
In a semi-supervised learning scenario, we show that augmenting the labeled data-set, by adding our selected subset of samples, leads to better accuracy improvement.
arXiv Detail & Related papers (2023-01-31T22:59:41Z) - A Well-Composed Text is Half Done! Composition Sampling for Diverse
Conditional Generation [79.98319703471596]
We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality.
It builds on recently proposed plan-based neural generation models that are trained to first create a composition of the output and then generate by conditioning on it and the input.
arXiv Detail & Related papers (2022-03-28T21:24:03Z) - Boost Test-Time Performance with Closed-Loop Inference [85.43516360332646]
We propose to predict hard-classified test samples in a looped manner to boost the model performance.
We first devise a filtering criterion to identify those hard-classified test samples that need additional inference loops.
For each hard sample, we construct an additional auxiliary learning task based on its original top-$K$ predictions to calibrate the model.
arXiv Detail & Related papers (2022-03-21T10:20:21Z) - Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by REizing the Markov chains into the latent space of the generator.
Empirically, extensive experiments on synthetic and real datasets demonstrate that our REP-GAN largely improves the sample efficiency and obtains better sample quality simultaneously.
arXiv Detail & Related papers (2021-07-01T10:34:55Z) - One for More: Selecting Generalizable Samples for Generalizable ReID
Model [92.40951770273972]
This paper proposes a one-for-more training objective that takes the generalization ability of selected samples as a loss function.
Our proposed one-for-more based sampler can be seamlessly integrated into the ReID training framework.
arXiv Detail & Related papers (2020-12-10T06:37:09Z) - Incremental Sampling Without Replacement for Sequence Models [39.3035292844624]
We present an elegant procedure for sampling without replacement from a broad class of randomized programs.
Our approach is incremental, i.e., samples can be drawn one at a time, allowing for increased flexibility.
arXiv Detail & Related papers (2020-02-21T00:12:01Z) - Generating diverse and natural text-to-speech samples using a quantized
fine-grained VAE and auto-regressive prosody prior [53.69310441063162]
This paper proposes a sequential prior in a discrete latent space which can generate more naturally sounding samples.
We evaluate the approach using listening tests, objective metrics of automatic speech recognition (ASR) performance, and measurements of prosody attributes.
arXiv Detail & Related papers (2020-02-06T12:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.