Constrained Sampling from Language Models via Langevin Dynamics in
Embedding Spaces
- URL: http://arxiv.org/abs/2205.12558v1
- Date: Wed, 25 May 2022 08:09:03 GMT
- Title: Constrained Sampling from Language Models via Langevin Dynamics in
Embedding Spaces
- Authors: Sachin Kumar, Biswajit Paria, Yulia Tsvetkov
- Abstract summary: We propose a sampling procedure that combines the log-likelihood of the language model with arbitrary differentiable constraints into a single energy function.
We evaluate our approach on different text generation tasks with soft and hard constraints as well as their combinations with competitive results for toxicity avoidance, sentiment control, and keyword-guided generation.
- Score: 34.375537557235724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large pre-trained language models are well-established for their
ability to generate text that is seemingly indistinguishable from text written
by humans. In this work, we study the problem of constrained sampling from such
language models: generating text that satisfies user-defined constraints.
Typical decoding strategies, which generate samples left-to-right, are not
always conducive to imposing such constraints globally.
imposing such constraints globally. Instead, we propose MuCoLa -- a sampling
procedure that combines the log-likelihood of the language model with arbitrary
differentiable constraints into a single energy function; and generates samples
by initializing the entire output sequence with noise and following a Markov
chain defined by Langevin Dynamics using the gradients of this energy. We
evaluate our approach on different text generation tasks with soft and hard
constraints as well as their combinations with competitive results for toxicity
avoidance, sentiment control, and keyword-guided generation.
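Below is a minimal sketch of the sampling loop the abstract describes: a single energy that combines the language model's log-likelihood with a differentiable constraint, minimized by noisy gradient (Langevin) updates over the output embeddings. GPT-2 via the HuggingFace transformers library, the toy keyword constraint, the noise scale, and the final nearest-embedding readout are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of energy-based sampling with Langevin dynamics over token
# embeddings. GPT-2, the keyword constraint, and all hyperparameters are
# illustrative assumptions, not the authors' exact setup.
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()
emb = lm.get_input_embeddings().weight                    # (V, d) token embedding table
prompt_ids = tok("The movie was", return_tensors="pt").input_ids[0]
prompt_emb = emb[prompt_ids].detach()                     # (P, d)

T = 12                                                    # tokens to generate
keyword_id = tok(" wonderful")["input_ids"][0]            # toy keyword constraint
eta, steps, lam, noise = 0.1, 200, 1.0, 0.01

# Initialize the entire output sequence with noise, as in the abstract.
e = (torch.randn(T, emb.size(1)) * emb.std().item()).requires_grad_(True)

def energy(e):
    seq = torch.cat([prompt_emb, e], dim=0).unsqueeze(0)  # (1, P+T, d)
    logits = lm(inputs_embeds=seq).logits[0]              # (P+T, V)
    soft = F.softmax(e @ emb.detach().T, dim=-1)          # relax output tokens over the vocab
    # LM term: cross-entropy between the model's next-token predictions
    # and the relaxed output tokens.
    lm_term = -(soft * F.log_softmax(logits[len(prompt_emb) - 1:-1], dim=-1)).sum()
    # Constraint term: push at least one position toward the keyword.
    con_term = -torch.log(soft[:, keyword_id].max() + 1e-9)
    return lm_term + lam * con_term

for _ in range(steps):                                    # Langevin chain on the energy
    grad, = torch.autograd.grad(energy(e), e)
    with torch.no_grad():
        e += -eta * grad + (2 * eta) ** 0.5 * noise * torch.randn_like(e)

with torch.no_grad():                                     # read out nearest discrete tokens
    out_ids = (e @ emb.T).argmax(-1)
print(tok.decode(torch.cat([prompt_ids, out_ids]).tolist()))
```

Because the energy is defined over continuous embeddings, any differentiable constraint (a toxicity or sentiment classifier score, or a keyword term as above) can be added to `energy` as an extra weighted term.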
Related papers
- Controlled LLM Decoding via Discrete Auto-regressive Biasing [9.843359827321194]
Controlled text generation allows for enforcing user-defined constraints on large language model outputs.
We propose Discrete Auto-regressive Biasing, a controlled decoding algorithm that leverages gradients while operating entirely in the discrete text domain.
Our method significantly improves constraint satisfaction while maintaining comparable or better fluency, all with even lower computational costs.
arXiv Detail & Related papers (2025-02-06T00:14:43Z)
- Conditional [MASK] Discrete Diffusion Language Model [14.208510167132983]
Diffusion-EAGS is a framework that integrates conditional masked language models into diffusion language models.
We show that Diffusion-EAGS achieves the best quality-diversity tradeoff, demonstrating its effectiveness in non-autoregressive text generation.
arXiv Detail & Related papers (2024-11-10T11:49:36Z)
- Controlled Text Generation with Natural Language Instructions [74.88938055638636]
InstructCTG is a controlled text generation framework that incorporates different constraints.
We first extract the underlying constraints of natural texts through a combination of off-the-shelf NLP tools and simple verbalizers.
By prepending natural language descriptions of the constraints and a few demonstrations, we fine-tune a pre-trained language model to incorporate various types of constraints.
arXiv Detail & Related papers (2023-04-27T15:56:34Z)
- Tractable Control for Autoregressive Language Generation [82.79160918147852]
We propose to use tractable probabilistic models (TPMs) to impose lexical constraints in autoregressive text generation models.
We show that the resulting approach, GeLaTo, achieves state-of-the-art performance on challenging benchmarks for constrained text generation.
Our work opens up new avenues for controlling large language models and also motivates the development of more expressive TPMs.
arXiv Detail & Related papers (2023-04-15T00:19:44Z)
- COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics [69.8062252611486]
COLD decoding is a flexible framework that can be applied directly to off-the-shelf left-to-right language models.
Our experiments on constrained generation tasks point to the effectiveness of our approach, both in terms of automatic and human evaluation.
arXiv Detail & Related papers (2022-02-23T18:59:27Z)
- A Contrastive Framework for Neural Text Generation [46.845997620234265]
We show that an underlying reason for model degeneration is the anisotropic distribution of token representations.
We present a contrastive solution: (i) SimCTG, a contrastive training objective to calibrate the model's representation space, and (ii) a decoding method -- contrastive search -- to encourage diversity while maintaining coherence in the generated text (a sketch of contrastive search follows this entry).
arXiv Detail & Related papers (2022-02-13T21:46:14Z)
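To make the decoding method in the entry above concrete, here is a minimal sketch of contrastive search: each step picks, among the model's top-k candidates, the token that balances model confidence against a degeneration penalty (its maximum cosine similarity to the hidden states already in the context). The use of GPT-2, last-layer hidden states, and the values of k and alpha are assumptions for illustration.

```python
# Hedged sketch of contrastive search; model choice and hyperparameters are
# illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def contrastive_search(prompt, steps=30, k=5, alpha=0.6):
    ids = tok(prompt, return_tensors="pt").input_ids                  # (1, L)
    for _ in range(steps):
        out = lm(ids, output_hidden_states=True)
        ctx_h = F.normalize(out.hidden_states[-1][0], dim=-1)         # (L, d) context states
        probs = F.softmax(out.logits[0, -1], dim=-1)
        top_p, top_ids = probs.topk(k)                                 # k candidate tokens
        # One batched forward pass to get each candidate's representation.
        cand = torch.cat([ids.repeat(k, 1), top_ids.unsqueeze(1)], dim=1)
        cand_h = lm(cand, output_hidden_states=True).hidden_states[-1][:, -1]
        cand_h = F.normalize(cand_h, dim=-1)
        degen = (cand_h @ ctx_h.T).max(dim=-1).values                  # max cosine sim to context
        score = (1 - alpha) * top_p - alpha * degen                    # confidence vs. degeneration
        next_id = top_ids[score.argmax()].view(1, 1)
        ids = torch.cat([ids, next_id], dim=1)
    return tok.decode(ids[0])

print(contrastive_search("Constrained sampling from language models"))
```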
- Typical Decoding for Natural Language Generation [76.69397802617064]
We study why high-probability texts can be dull or repetitive.
We show that typical sampling offers competitive performance in terms of quality (a sketch of the filtering rule follows this entry).
arXiv Detail & Related papers (2022-02-01T18:58:45Z)
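For the Typical Decoding entry above, a minimal sketch of locally typical sampling: keep the tokens whose surprisal is closest to the conditional entropy of the predicted distribution, up to a cumulative mass threshold, and sample from that set. The threshold value and this filtering interface are assumptions for illustration, not details taken from the abstract.

```python
# Hedged sketch of locally typical sampling; the threshold tau is an
# illustrative assumption.
import torch
import torch.nn.functional as F

def typical_filter(logits, tau=0.95):
    log_p = F.log_softmax(logits, dim=-1)
    p = log_p.exp()
    entropy = -(p * log_p).sum()                 # conditional entropy of this step
    deviation = (-log_p - entropy).abs()         # |surprisal - entropy| per token
    order = deviation.argsort()                  # most "typical" tokens first
    cutoff = int((p[order].cumsum(0) < tau).sum()) + 1
    keep = order[:cutoff]                        # smallest typical set with mass >= tau
    filtered = torch.full_like(logits, float("-inf"))
    filtered[keep] = logits[keep]
    return filtered

# Usage: next_id = torch.multinomial(F.softmax(typical_filter(step_logits), -1), 1)
```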
- NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints [75.66980495245926]
Conditional text generation often requires lexical constraints, i.e., which words should or shouldn't be included in the output text.
We propose NeuroLogic Decoding, a simple yet effective algorithm that enables neural language models -- supervised or not -- to generate fluent text while satisfying complex lexical constraints.
Our results suggest the limit of large-scale neural networks for fine-grained controllable generation and the promise of inference-time algorithms.
arXiv Detail & Related papers (2020-10-24T11:55:22Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs with a strong auto-regressive decoder tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.