Min P Sampling: Balancing Creativity and Coherence at High Temperature
- URL: http://arxiv.org/abs/2407.01082v1
- Date: Mon, 1 Jul 2024 08:37:25 GMT
- Title: Min P Sampling: Balancing Creativity and Coherence at High Temperature
- Authors: Minh Nguyen, Andrew Baker, Andreas Kirsch, Clement Neo,
- Abstract summary: min-$p$ is a dynamic truncation sampling method that scales according to the probability of the top candidate token.
We demonstrate that min-$p$ improves the coherence and quality of generated text even at high temperatures.
- Score: 2.6639520483183867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) generate long-form text by successively sampling the next token based on the probability distribution of the token vocabulary at each decoding step. Current popular truncation sampling methods such as top-$p$ sampling, also known as nucleus sampling, often struggle to balance coherence and creativity in generating text, particularly when using higher temperatures. To address this issue, we propose min-$p$, a dynamic truncation sampling method that establishes a minimum base percentage threshold for tokens, which then scales according to the probability of the top candidate token. Through experiments on several benchmarks, such as GPQA, GSM8K and AlpacaEval Creative Writing, we demonstrate that min-$p$ improves the coherence and quality of generated text even at high temperatures, while also facilitating more creative and diverse outputs compared to top-$p$ and other sampling methods. As of writing, min-$p$ has been adopted by multiple open-source LLM implementations, and has been independently assessed by members of the open-source LLM community, further validating its practical utility and potential.
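The truncation rule described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names, the parameter name `p_base`, the default value 0.1, and the choice to apply temperature before truncation are all assumptions made for the example. A token is kept only if its probability is at least `p_base` times the probability of the top token, and the surviving probabilities are renormalized before sampling.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, p_base=0.1):
    """Zero out tokens below p_base * max(probs), then renormalize.

    Assumes 0 < p_base <= 1, so the top token always survives.
    """
    threshold = p_base * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]


def sample_min_p(logits, p_base=0.1, temperature=1.0, rng=random):
    """Sample one token index (temperature first, then min-p; an assumed order)."""
    probs = min_p_filter(softmax(logits, temperature), p_base)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Peaked distribution: the top token at 0.90 sets the threshold near 0.09,
# so every other token is removed.
peaked = min_p_filter([0.90, 0.05, 0.03, 0.02], p_base=0.1)
print(peaked)  # [1.0, 0.0, 0.0, 0.0]

# Flatter distribution: the top token at 0.30 sets the threshold near 0.03,
# so four candidates survive and only the 0.02 tail token is dropped.
flat = min_p_filter([0.30, 0.28, 0.22, 0.18, 0.02], p_base=0.1)
print(sum(1 for p in flat if p > 0))  # 4
```

The two calls show the dynamic behaviour: with the same `p_base`, a confident distribution is cut down to a single token, while a flatter one keeps several candidates. A fixed top-$p$ cutoff does not adapt to the top token's probability in this way.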
Related papers
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation [60.493180081319785]
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step.
Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z) - REAL Sampling: Boosting Factuality and Diversity of Open-Ended Generation via Asymptotic Entropy [93.8400683020273]
Decoding methods for large language models (LLMs) usually struggle with the tradeoff between ensuring factuality and maintaining diversity.
We propose REAL sampling, a decoding method that improves factuality and diversity over nucleus sampling.
arXiv Detail & Related papers (2024-06-11T21:44:49Z) - Priority Sampling of Large Language Models for Compilers [4.2266182821287135]
Priority Sampling is a simple and deterministic sampling technique that produces unique samples ordered by the model's confidence.
It supports generation based on a regular expression, which provides a controllable and structured exploration process.
It outperforms the autotuner used for the generation of labels for the training of the original model in just 30 samples.
arXiv Detail & Related papers (2024-02-28T22:27:49Z) - A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation [78.81021361497311]
We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
arXiv Detail & Related papers (2023-12-07T18:30:15Z) - An Empirical Study of Translation Hypothesis Ensembling with Large Language Models [9.068791020917217]
Large language models (LLMs) are becoming a one-fits-many solution, but they sometimes hallucinate or produce unreliable output.
We investigate how hypothesis ensembling can improve the quality of the generated text.
arXiv Detail & Related papers (2023-10-17T17:40:21Z) - Epsilon Sampling Rocks: Investigating Sampling Strategies for Minimum Bayes Risk Decoding for Machine Translation [20.749494856466526]
We show how different sampling approaches for generating candidate lists for Minimum Bayes Risk decoding affect performance.
Based on our insights into their limitations, we experiment with the recently proposed epsilon-sampling approach, which prunes away all tokens with a probability smaller than epsilon.
arXiv Detail & Related papers (2023-05-17T00:11:38Z) - Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models [65.52639709094963]
Methods such as beam search and Gumbel top-k sampling can guarantee a different output for each element of the beam, but are not easy to parallelize.
We present a framework for sampling according to an arithmetic code book implicitly defined by a large language model.
arXiv Detail & Related papers (2022-10-18T22:19:41Z) - A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation [79.98319703471596]
We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality.
It builds on recently proposed plan-based neural generation models that are trained to first create a composition of the output and then generate by conditioning on it and the input.
arXiv Detail & Related papers (2022-03-28T21:24:03Z) - Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior [53.69310441063162]
This paper proposes a sequential prior in a discrete latent space which can generate more naturally sounding samples.
We evaluate the approach using listening tests, objective metrics of automatic speech recognition (ASR) performance, and measurements of prosody attributes.
arXiv Detail & Related papers (2020-02-06T12:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.