Sampling with Attribute-Related Information for Controlling Language
Models
- URL: http://arxiv.org/abs/2205.06036v1
- Date: Thu, 12 May 2022 11:48:11 GMT
- Title: Sampling with Attribute-Related Information for Controlling Language
Models
- Authors: Shangda Wu, Maosong Sun
- Abstract summary: We propose a new simple guided decoding method, Gamma Sampling, which does not require complex engineering and any extra data.
Gamma Sampling introduces attribute-related information into the sampling process to guide language models to generate texts with desired attributes.
Experiments on controlling topics and sentiments of generated text show Gamma Sampling to be superior in diversity, attribute relevance and overall quality of generated samples.
- Score: 86.72661027591394
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The dominant approaches for controlling language models are based on
fine-tuning large language models or prompt engineering. However, these methods
often require condition-specific data or considerable hand-crafting. We propose
a new simple guided decoding method, Gamma Sampling, which does not require
complex engineering and any extra data. Gamma Sampling introduces
attribute-related information (provided by humans or language models
themselves) into the sampling process to guide language models to generate
texts with desired attributes. Experiments on controlling topics and sentiments
of generated text show Gamma Sampling to be superior in diversity, attribute
relevance and overall quality of generated samples while maintaining a fast
generation speed. In addition, we successfully applied Gamma Sampling to
control other attributes of language such as relatedness and repetition, which
further demonstrates the versatility and effectiveness of this method. Gamma
Sampling is now available in the python package samplings via import gamma
sampling from samplings.
Related papers
- Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation [60.493180081319785]
We propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step.
Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
arXiv Detail & Related papers (2024-08-24T14:14:32Z) - Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs [4.122612309805664]
Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step.
We propose min-p sampling, a dynamic truncation method that adjusts the sampling threshold based on the model's confidence by scaling according to the top token's probability.
We conduct extensive experiments on benchmarks including GPQA, GSM8K, and AlpacaEval Creative Writing, demonstrating that min-p sampling improves both the quality and diversity of generated text, particularly at high temperatures.
arXiv Detail & Related papers (2024-07-01T08:37:25Z) - Unsupervised Calibration through Prior Adaptation for Text
Classification using Large Language Models [37.39843935632105]
We propose an approach to adapt the prior class distribution to perform text classification tasks without the need for labelled samples.
Results show that these methods outperform the un-adapted model for different number of training shots in the prompt.
arXiv Detail & Related papers (2023-07-13T12:11:36Z) - Extrapolating Multilingual Understanding Models as Multilingual
Generators [82.1355802012414]
This paper explores methods to empower multilingual understanding models the generation abilities to get a unified model.
We propose a textbfSemantic-textbfGuided textbfAlignment-then-Denoising (SGA) approach to adapt an encoder to a multilingual generator with a small number of new parameters.
arXiv Detail & Related papers (2023-05-22T15:33:21Z) - MacLaSa: Multi-Aspect Controllable Text Generation via Efficient
Sampling from Compact Latent Space [110.85888003111653]
Multi-aspect controllable text generation aims to generate fluent sentences that possess multiple desired attributes simultaneously.
We introduce a novel approach for multi-aspect control, namely MacLaSa, that estimates compact latent space for multiple aspects.
We show that MacLaSa outperforms several strong baselines on attribute relevance and textual quality while maintaining a high inference speed.
arXiv Detail & Related papers (2023-05-22T07:30:35Z) - Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models [65.52639709094963]
Methods such as beam search and Gumbel top-k sampling can guarantee a different output for each element of the beam, but are not easy to parallelize.
We present a framework for sampling according to an arithmetic code book implicitly defined by a large language model.
arXiv Detail & Related papers (2022-10-18T22:19:41Z) - Step-unrolled Denoising Autoencoders for Text Generation [17.015573262373742]
We propose a new generative model of text, Step-unrolled Denoising Autoencoder (SUNDAE)
SUNDAE is repeatedly applied on a sequence of tokens, starting from random inputs and improving them each time until convergence.
We present a simple new improvement operator that converges in fewer iterations than diffusion methods.
arXiv Detail & Related papers (2021-12-13T16:00:33Z) - Attribute Alignment: Controlling Text Generation from Pre-trained
Language Models [46.19190007510232]
We propose a simple and flexible method for controlling text generation by aligning disentangled attribute representations.
In contrast to recent efforts on training a discriminator to perturb the token level distribution for an attribute, we use the same data to learn an alignment function to guide the pre-trained, non-controlled language model to generate texts with the target attribute without changing the original language model parameters.
arXiv Detail & Related papers (2021-03-20T01:51:32Z) - Generating diverse and natural text-to-speech samples using a quantized
fine-grained VAE and auto-regressive prosody prior [53.69310441063162]
This paper proposes a sequential prior in a discrete latent space which can generate more naturally sounding samples.
We evaluate the approach using listening tests, objective metrics of automatic speech recognition (ASR) performance, and measurements of prosody attributes.
arXiv Detail & Related papers (2020-02-06T12:35:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.