Constrained Sampling from Language Models via Langevin Dynamics in
Embedding Spaces
- URL: http://arxiv.org/abs/2205.12558v1
- Date: Wed, 25 May 2022 08:09:03 GMT
- Title: Constrained Sampling from Language Models via Langevin Dynamics in
Embedding Spaces
- Authors: Sachin Kumar, Biswajit Paria, Yulia Tsvetkov
- Abstract summary: We propose a sampling procedure that combines the log-likelihood of the language model with arbitrary differentiable constraints into a single energy function.
We evaluate our approach on different text generation tasks with soft and hard constraints as well as their combinations with competitive results for toxicity avoidance, sentiment control, and keyword-guided generation.
- Score: 34.375537557235724
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large pre-trained language models are well-established for their
ability to generate text that is seemingly indistinguishable from text written
by humans. In this work, we study the problem of constrained sampling from such
language models: generating text that satisfies user-defined constraints.
Typical decoding strategies, which generate samples left-to-right, are not
always conducive to imposing such constraints globally.
imposing such constraints globally. Instead, we propose MuCoLa -- a sampling
procedure that combines the log-likelihood of the language model with arbitrary
differentiable constraints into a single energy function; and generates samples
by initializing the entire output sequence with noise and following a Markov
chain defined by Langevin Dynamics using the gradients of this energy. We
evaluate our approach on different text generation tasks with soft and hard
constraints as well as their combinations with competitive results for toxicity
avoidance, sentiment control, and keyword-guided generation.
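Below is a minimal sketch of the sampling loop the abstract describes: a single energy that combines the language model's log-likelihood with a differentiable constraint, minimized by noisy gradient (Langevin) updates over the output embeddings. GPT-2 via the HuggingFace transformers library, the toy keyword constraint, the noise scale, and the final nearest-embedding readout are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of energy-based sampling with Langevin dynamics over token
# embeddings. GPT-2, the keyword constraint, and all hyperparameters are
# illustrative assumptions, not the authors' exact setup.
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()
emb = lm.get_input_embeddings().weight                    # (V, d) token embedding table
prompt_ids = tok("The movie was", return_tensors="pt").input_ids[0]
prompt_emb = emb[prompt_ids].detach()                     # (P, d)

T = 12                                                    # tokens to generate
keyword_id = tok(" wonderful")["input_ids"][0]            # toy keyword constraint
eta, steps, lam, noise = 0.1, 200, 1.0, 0.01

# Initialize the entire output sequence with noise, as in the abstract.
e = (torch.randn(T, emb.size(1)) * emb.std().item()).requires_grad_(True)

def energy(e):
    seq = torch.cat([prompt_emb, e], dim=0).unsqueeze(0)  # (1, P+T, d)
    logits = lm(inputs_embeds=seq).logits[0]              # (P+T, V)
    soft = F.softmax(e @ emb.detach().T, dim=-1)          # relax output tokens over the vocab
    # LM term: cross-entropy between the model's next-token predictions
    # and the relaxed output tokens.
    lm_term = -(soft * F.log_softmax(logits[len(prompt_emb) - 1:-1], dim=-1)).sum()
    # Constraint term: push at least one position toward the keyword.
    con_term = -torch.log(soft[:, keyword_id].max() + 1e-9)
    return lm_term + lam * con_term

for _ in range(steps):                                    # Langevin chain on the energy
    grad, = torch.autograd.grad(energy(e), e)
    with torch.no_grad():
        e += -eta * grad + (2 * eta) ** 0.5 * noise * torch.randn_like(e)

with torch.no_grad():                                     # read out nearest discrete tokens
    out_ids = (e @ emb.T).argmax(-1)
print(tok.decode(torch.cat([prompt_ids, out_ids]).tolist()))
```

Because the energy is defined over continuous embeddings, any differentiable constraint (a toxicity or sentiment classifier score, or a keyword term as above) can be added to `energy` as an extra weighted term.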
Related papers
- Controlled LLM Decoding via Discrete Auto-regressive Biasing [9.843359827321194]
Controlled text generation allows for enforcing user-defined constraints on large language model outputs.
We propose Discrete Auto-regressive Biasing, a controlled decoding algorithm that leverages gradients while operating entirely in the discrete text domain.
Our method significantly improves constraint satisfaction while maintaining comparable or better fluency, all with even lower computational costs.
arXiv Detail & Related papers (2025-02-06T00:14:43Z)
- Conditional [MASK] Discrete Diffusion Language Model [14.208510167132983]
Diffusion-EAGS is a framework that integrates conditional masked language models into diffusion language models.
We show that Diffusion-EAGS achieves the best quality-diversity tradeoff, demonstrating its effectiveness in non-autoregressive text generation.
arXiv Detail & Related papers (2024-11-10T11:49:36Z)
- Controlled Text Generation with Natural Language Instructions [74.88938055638636]
InstructCTG is a controlled text generation framework that incorporates different constraints.
We first extract the underlying constraints of natural texts through a combination of off-the-shelf NLP tools and simple verbalizers.
By prepending natural language descriptions of the constraints and a few demonstrations, we fine-tune a pre-trained language model to incorporate various types of constraints.
arXiv Detail & Related papers (2023-04-27T15:56:34Z)
- Tractable Control for Autoregressive Language Generation [82.79160918147852]
We propose to use tractable probabilistic models (TPMs) to impose lexical constraints in autoregressive text generation models.
We show that the resulting approach, GeLaTo, achieves state-of-the-art performance on challenging benchmarks for constrained text generation.
Our work opens up new avenues for controlling large language models and also motivates the development of more expressive TPMs.
arXiv Detail & Related papers (2023-04-15T00:19:44Z)
- COLD Decoding: Energy-based Constrained Text Generation with Langevin Dynamics [69.8062252611486]
COLD decoding is a flexible framework that can be applied directly to off-the-shelf left-to-right language models.
Our experiments on constrained generation tasks point to the effectiveness of our approach, both in terms of automatic and human evaluation.
arXiv Detail & Related papers (2022-02-23T18:59:27Z)
- A Contrastive Framework for Neural Text Generation [46.845997620234265]
We show that an underlying reason for model degeneration is the anisotropic distribution of token representations.
We present a contrastive solution: (i) SimCTG, a contrastive training objective to calibrate the model's representation space, and (ii) a decoding method -- contrastive search -- to encourage diversity while maintaining coherence in the generated text (a sketch of contrastive search follows this entry).
arXiv Detail & Related papers (2022-02-13T21:46:14Z)
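To make the decoding method in the entry above concrete, here is a minimal sketch of contrastive search: each step picks, among the model's top-k candidates, the token that balances model confidence against a degeneration penalty (its maximum cosine similarity to the hidden states already in the context). The use of GPT-2, last-layer hidden states, and the values of k and alpha are assumptions for illustration.

```python
# Hedged sketch of contrastive search; model choice and hyperparameters are
# illustrative assumptions.
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def contrastive_search(prompt, steps=30, k=5, alpha=0.6):
    ids = tok(prompt, return_tensors="pt").input_ids                  # (1, L)
    for _ in range(steps):
        out = lm(ids, output_hidden_states=True)
        ctx_h = F.normalize(out.hidden_states[-1][0], dim=-1)         # (L, d) context states
        probs = F.softmax(out.logits[0, -1], dim=-1)
        top_p, top_ids = probs.topk(k)                                 # k candidate tokens
        # One batched forward pass to get each candidate's representation.
        cand = torch.cat([ids.repeat(k, 1), top_ids.unsqueeze(1)], dim=1)
        cand_h = lm(cand, output_hidden_states=True).hidden_states[-1][:, -1]
        cand_h = F.normalize(cand_h, dim=-1)
        degen = (cand_h @ ctx_h.T).max(dim=-1).values                  # max cosine sim to context
        score = (1 - alpha) * top_p - alpha * degen                    # confidence vs. degeneration
        next_id = top_ids[score.argmax()].view(1, 1)
        ids = torch.cat([ids, next_id], dim=1)
    return tok.decode(ids[0])

print(contrastive_search("Constrained sampling from language models"))
```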
- Typical Decoding for Natural Language Generation [76.69397802617064]
We study why high-probability texts can be dull or repetitive.
We show that typical sampling offers competitive performance in terms of quality (a sketch of the filtering rule follows this entry).
arXiv Detail & Related papers (2022-02-01T18:58:45Z)
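For the Typical Decoding entry above, a minimal sketch of locally typical sampling: keep the tokens whose surprisal is closest to the conditional entropy of the predicted distribution, up to a cumulative mass threshold, and sample from that set. The threshold value and this filtering interface are assumptions for illustration, not details taken from the abstract.

```python
# Hedged sketch of locally typical sampling; the threshold tau is an
# illustrative assumption.
import torch
import torch.nn.functional as F

def typical_filter(logits, tau=0.95):
    log_p = F.log_softmax(logits, dim=-1)
    p = log_p.exp()
    entropy = -(p * log_p).sum()                 # conditional entropy of this step
    deviation = (-log_p - entropy).abs()         # |surprisal - entropy| per token
    order = deviation.argsort()                  # most "typical" tokens first
    cutoff = int((p[order].cumsum(0) < tau).sum()) + 1
    keep = order[:cutoff]                        # smallest typical set with mass >= tau
    filtered = torch.full_like(logits, float("-inf"))
    filtered[keep] = logits[keep]
    return filtered

# Usage: next_id = torch.multinomial(F.softmax(typical_filter(step_logits), -1), 1)
```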
- NeuroLogic Decoding: (Un)supervised Neural Text Generation with Predicate Logic Constraints [75.66980495245926]
Conditional text generation often requires lexical constraints, i.e., which words should or shouldn't be included in the output text.
We propose NeuroLogic Decoding, a simple yet effective algorithm that enables neural language models -- supervised or not -- to generate fluent text while satisfying complex lexical constraints.
Our results suggest the limit of large-scale neural networks for fine-grained controllable generation and the promise of inference-time algorithms.
arXiv Detail & Related papers (2020-10-24T11:55:22Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
VAEs with a strong auto-regressive decoder tend to ignore the latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.