Tractable Control for Autoregressive Language Generation
- URL: http://arxiv.org/abs/2304.07438v4
- Date: Wed, 15 Nov 2023 23:16:46 GMT
- Title: Tractable Control for Autoregressive Language Generation
- Authors: Honghua Zhang, Meihua Dang, Nanyun Peng, Guy Van den Broeck
- Abstract summary: We propose to use tractable probabilistic models (TPMs) to impose lexical constraints in autoregressive text generation models.
We show that GeLaTo achieves state-of-the-art performance on challenging benchmarks for constrained text generation.
Our work opens up new avenues for controlling large language models and also motivates the development of more expressive TPMs.
- Score: 82.79160918147852
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the success of autoregressive large language models in text
generation, it remains a major challenge to generate text that satisfies
complex constraints: sampling from the conditional distribution
${\Pr}(\text{text} | \alpha)$ is intractable for even the simplest lexical
constraints $\alpha$. To overcome this challenge, we propose to use tractable
probabilistic models (TPMs) to impose lexical constraints in autoregressive
text generation models, which we refer to as GeLaTo (Generating Language with
Tractable Constraints). To demonstrate the effectiveness of this framework, we
use distilled hidden Markov models, where we can efficiently compute
${\Pr}(\text{text} | \alpha)$, to guide autoregressive generation from GPT2.
GeLaTo achieves state-of-the-art performance on challenging benchmarks for
constrained text generation (e.g., CommonGen), beating various strong baselines
by a large margin. Our work not only opens up new avenues for controlling large
language models but also motivates the development of more expressive TPMs.
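The decoding rule the abstract implies, reweighting the base LM's next-token distribution by a tractable model's probability that the lexical constraint can still be satisfied, can be sketched as follows. The function names, normalization, and toy numbers are illustrative assumptions, not the authors' released code.
```python
# A minimal sketch (not the authors' implementation) of constraint-guided
# next-token selection: the base LM distribution is reweighted by the
# probability, computed exactly by a tractable model such as an HMM, that the
# lexical constraint alpha remains satisfiable after each candidate token.
import numpy as np

def guided_next_token_distribution(lm_probs: np.ndarray,
                                   constraint_probs: np.ndarray) -> np.ndarray:
    """lm_probs[v]         ~ Pr_LM(x_{t+1} = v | x_{1:t})
    constraint_probs[v]    ~ Pr_TPM(alpha | x_{1:t}, x_{t+1} = v)
    Returns Pr(x_{t+1} | x_{1:t}, alpha) after renormalization."""
    unnormalized = lm_probs * constraint_probs
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("no continuation can still satisfy the constraint")
    return unnormalized / total

# Toy usage with a 5-token vocabulary where only token 3 keeps alpha satisfiable.
lm = np.array([0.4, 0.3, 0.2, 0.05, 0.05])
sat = np.array([0.0, 0.0, 0.0, 1.0, 0.0])
print(guided_next_token_distribution(lm, sat))  # token 3 gets probability 1.0
```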
Related papers
- Intertwining CP and NLP: The Generation of Unreasonably Constrained Sentences [49.86129209397701]
This paper presents the Constraints First Framework to address the generation of heavily constrained sentences.
The problem is solved by a constraint programming method that combines linguistic properties with more classical constraints.
The effectiveness of this approach is demonstrated by tackling a new, more tediously constrained text generation problem.
arXiv Detail & Related papers (2024-06-15T17:40:49Z)
- Unlocking Anticipatory Text Generation: A Constrained Approach for Large Language Models Decoding [75.06872859716049]
Large Language Models (LLMs) have demonstrated a powerful ability for text generation.
However, undesired behaviors such as toxicity or hallucinations can manifest.
We propose formalizing text generation as a future-constrained generation problem.
arXiv Detail & Related papers (2023-12-11T06:35:33Z)
- Controlled Text Generation via Language Model Arithmetic [7.687678490751105]
We introduce model arithmetic, a novel inference framework for composing and biasing Large Language Models.
We show that model arithmetic allows fine-grained control of generated text while outperforming the state of the art on toxicity reduction.
arXiv Detail & Related papers (2023-11-24T13:41:12Z)
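As a rough illustration of the entry above, several models' outputs can be composed by simple arithmetic over their logits. The weights and the final softmax shown here are illustrative assumptions, not the paper's exact formulation.
```python
# A minimal sketch of composing model outputs by weighted logit arithmetic;
# the weighting scheme is an illustrative assumption.
import numpy as np

def compose_logits(logits: list[np.ndarray], weights: list[float]) -> np.ndarray:
    """Return softmax(sum_i w_i * logits_i), e.g. base - 0.5 * toxicity_expert."""
    combined = sum(w * l for w, l in zip(weights, logits))
    combined = combined - combined.max()        # numerical stability
    probs = np.exp(combined)
    return probs / probs.sum()

# Toy usage: down-weight whatever a hypothetical "toxicity" model prefers.
base, toxic = np.array([1.0, 2.0, 0.5]), np.array([0.0, 3.0, 0.0])
print(compose_logits([base, toxic], [1.0, -0.5]))
```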
- Speculative Decoding with Big Little Decoder [108.95187338417541]
Big Little Decoder (BiLD) is a framework that can improve inference efficiency and latency for a wide range of text generation applications.
On an NVIDIA T4 GPU, our framework achieves a speedup of up to 2.12x with minimal generation quality degradation.
Our framework is fully plug-and-play and can be applied without any modifications in the training process or model architecture.
arXiv Detail & Related papers (2023-02-15T18:55:29Z)
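The big/little idea above can be sketched as a decoding loop in which a small model proposes tokens and the large model is consulted only when the small model is unsure. The confidence-threshold fallback and the names below are simplifying assumptions, not BiLD's exact fallback and rollback policies.
```python
# A minimal sketch of a big/little decoding loop; the confidence-based fallback
# rule is an illustrative assumption.
from typing import Callable, List, Tuple

def big_little_decode(little_next: Callable[[List[int]], Tuple[int, float]],
                      big_next: Callable[[List[int]], int],
                      prompt: List[int],
                      max_new_tokens: int,
                      confidence_threshold: float = 0.9) -> List[int]:
    """The small model proposes each token with a confidence score; below the
    threshold, the large model is called for that position instead."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token, confidence = little_next(tokens)
        if confidence < confidence_threshold:
            token = big_next(tokens)            # expensive call, used sparingly
        tokens.append(token)
    return tokens
```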
- Constrained Sampling from Language Models via Langevin Dynamics in Embedding Spaces [34.375537557235724]
We propose a sampling procedure that combines the log-likelihood of the language model with arbitrary differentiable constraints into a single energy function.
We evaluate our approach on text generation tasks with soft constraints, hard constraints, and their combinations, achieving competitive results for toxicity avoidance, sentiment control, and keyword-guided generation.
arXiv Detail & Related papers (2022-05-25T08:09:03Z)
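The entry above folds the LM log-likelihood and differentiable constraints into a single energy function; a generic Langevin update over a continuous (embedding-space) representation looks roughly like the following, with the toy energy and step size as illustrative assumptions.
```python
# A minimal sketch of one Langevin update on a continuous representation x,
# where E(x) = -log p_LM(x) + sum_i lambda_i * constraint_i(x); the concrete
# energy in the toy usage is a stand-in.
import numpy as np

def langevin_step(x: np.ndarray, grad_energy, step: float = 0.01,
                  rng: np.random.Generator | None = None) -> np.ndarray:
    """x_{k+1} = x_k - step * dE/dx + sqrt(2 * step) * gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(x.shape)
    return x - step * grad_energy(x) + np.sqrt(2.0 * step) * noise

# Toy usage: a quadratic energy whose minimum is the all-ones vector.
grad = lambda x: 2.0 * (x - 1.0)
x = np.zeros(4)
for _ in range(500):
    x = langevin_step(x, grad)
print(x)  # samples concentrate near 1.0
```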
- COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics [69.8062252611486]
COLD decoding is a flexible framework that can be applied directly to off-the-shelf left-to-right language models.
Our experiments on constrained generation tasks point to the effectiveness of our approach in terms of both automatic and human evaluation.
arXiv Detail & Related papers (2022-02-23T18:59:27Z)
- Directed Beam Search: Plug-and-Play Lexically Constrained Language Generation [6.2211479935811775]
State-of-the-art language models are too large to be trained from scratch in a manageable time.
We propose Directed Beam Search (DBS), a plug-and-play method for lexically constrained language generation.
arXiv Detail & Related papers (2020-12-31T03:05:44Z)
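As a plug-and-play method, the approach above steers an unmodified language model toward guide words during beam search. The bonus-based rescoring below is a simplified assumption of how such steering can look, not the paper's exact scoring rule.
```python
# A minimal sketch of boosting beams that contain guide words; the scoring rule
# is an illustrative assumption.
from typing import List, Tuple

def rescore_beams(beams: List[Tuple[List[str], float]],
                  guide_words: List[str],
                  bonus: float = 2.0) -> List[Tuple[List[str], float]]:
    """Add `bonus` to a beam's log-probability for each guide word it contains,
    then rank beams by the adjusted score."""
    rescored = [(toks, logp + bonus * sum(w in toks for w in guide_words))
                for toks, logp in beams]
    return sorted(rescored, key=lambda b: b[1], reverse=True)

# Toy usage: the beam mentioning "river" overtakes the higher-likelihood beam.
beams = [(["the", "cat", "sat"], -3.0), (["by", "the", "river"], -4.0)]
print(rescore_beams(beams, ["river"]))
```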
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
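One round of the insertion process described in the entry above can be sketched as follows; the proposal function is a stub and the names are assumptions, not the POINTER model itself.
```python
# A minimal sketch of one coarse-to-fine insertion round: for each adjacent pair
# of tokens, a (stubbed) model decides whether to insert a new token in between.
from typing import Callable, List, Optional

def insertion_round(tokens: List[str],
                    propose: Callable[[str, str], Optional[str]]) -> List[str]:
    """Return the sequence after one parallel round of insertions; repeating
    this grows the text outward from the hard lexical constraints."""
    out: List[str] = [tokens[0]]
    for left, right in zip(tokens, tokens[1:]):
        filler = propose(left, right)
        if filler is not None:
            out.append(filler)
        out.append(right)
    return out

# Toy usage starting from the keyword constraints ["dog", "park"].
proposals = {("dog", "park"): "visits"}
print(insertion_round(["dog", "park"], lambda l, r: proposals.get((l, r))))
# -> ['dog', 'visits', 'park']
```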