Principled Gradient-based Markov Chain Monte Carlo for Text Generation
- URL: http://arxiv.org/abs/2312.17710v1
- Date: Fri, 29 Dec 2023 18:00:56 GMT
- Title: Principled Gradient-based Markov Chain Monte Carlo for Text Generation
- Authors: Li Du, Afra Amini, Lucas Torroba Hennigen, Xinyan Velocity Yu, Jason
Eisner, Holden Lee, Ryan Cotterell
- Abstract summary: We propose several faithful gradient-based sampling algorithms to sample from the target energy-based text distribution correctly.
We demonstrate that faithful samplers are able to generate more fluent text while adhering to the control objectives better.
- Score: 77.46654898866291
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent papers have demonstrated the possibility of energy-based text
generation by adapting gradient-based sampling algorithms, a paradigm of MCMC
algorithms that promises fast convergence. However, as we show in this paper,
previous attempts at this approach to text generation all fail to sample
correctly from the target language model distributions. To address this
limitation, we consider the problem of designing text samplers that are
faithful, meaning that they have the target text distribution as their limiting
distribution. We propose several faithful gradient-based sampling algorithms to
sample from the target energy-based text distribution correctly, and study
their theoretical properties. Through experiments on various forms of text
generation, we demonstrate that faithful samplers are able to generate more
fluent text while adhering to the control objectives better.
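To make the notion of a "faithful" gradient-based sampler concrete, here is a minimal sketch (not one of the paper's proposed algorithms) of a Metropolis-adjusted Langevin step on a toy continuous energy. The quadratic energy, step size, and chain length are illustrative placeholders; the key point is that the Metropolis-Hastings correction makes the target distribution the chain's limiting distribution.

```python
# Minimal sketch of a Metropolis-adjusted Langevin (MALA) step.
# The Metropolis-Hastings correction is what makes the chain "faithful":
# its stationary distribution is exactly exp(-energy(x)) / Z.
# The quadratic toy energy below stands in for an energy-based text
# distribution over continuous token representations (an assumption for
# illustration only; it is not the paper's actual energy function).

import numpy as np

rng = np.random.default_rng(0)

def energy(x):
    """Toy quadratic energy; a real sampler would use -log p(text)."""
    return 0.5 * np.sum(x ** 2)

def grad_energy(x):
    return x

def mala_step(x, step=0.1):
    # Langevin proposal: a gradient step on the energy plus Gaussian noise.
    noise = rng.normal(size=x.shape)
    x_prop = x - step * grad_energy(x) + np.sqrt(2 * step) * noise

    # log q(dst | src) for the asymmetric Langevin proposal.
    def log_q(dst, src):
        mean = src - step * grad_energy(src)
        return -np.sum((dst - mean) ** 2) / (4 * step)

    # MH acceptance ratio; without this correction the chain is unfaithful.
    log_alpha = (-energy(x_prop) + energy(x)
                 + log_q(x, x_prop) - log_q(x_prop, x))
    if np.log(rng.uniform()) < min(0.0, log_alpha):
        return x_prop, True
    return x, False

# Run a short chain from a random start.
x = rng.normal(size=2)
accepted = 0
for _ in range(1000):
    x, ok = mala_step(x)
    accepted += ok
print("acceptance rate:", accepted / 1000, "final sample:", x)
```

Dropping the accept/reject step would recover an unadjusted Langevin sampler, which in general does not leave the target distribution invariant; the related papers below explore other routes to faithful or more controllable sampling.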
Related papers
- A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation [78.81021361497311]
We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows the generation length to be determined by the sampling procedure rather than fixed in advance. (A generic sketch of this whole-sequence accept/reject step is given after the list of related papers below.)
arXiv Detail & Related papers (2023-12-07T18:30:15Z)
- Language Model Decoding as Direct Metrics Optimization [87.68281625776282]
Current decoding methods struggle to generate texts that align with human texts across different aspects.
In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts.
We prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts.
arXiv Detail & Related papers (2023-10-02T09:35:27Z)
- Structured Voronoi Sampling [61.629198273926676]
In this paper, we take an important step toward building a principled approach for sampling from language models with gradient-based methods.
We name our gradient-based technique Structured Voronoi Sampling (SVS).
In a controlled generation task, SVS is able to generate fluent and diverse samples while following the control targets significantly better than other methods.
arXiv Detail & Related papers (2023-06-05T17:32:35Z)
- Text-Conditioned Sampling Framework for Text-to-Image Generation with Masked Generative Models [52.29800567587504]
We propose a learnable sampling model, Text-Conditioned Token Selection (TCTS), to select optimal tokens via localized supervision with text information.
TCTS improves not only the image quality but also the semantic alignment of the generated images with the given texts.
We validate the efficacy of TCTS combined with Frequency Adaptive Sampling (FAS) with various generative tasks, demonstrating that it significantly outperforms the baselines in image-text alignment and image quality.
arXiv Detail & Related papers (2023-04-04T03:52:49Z)
- A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation [79.98319703471596]
We propose Composition Sampling, a simple but effective method to generate diverse outputs for conditional generation of higher quality.
It builds on recently proposed plan-based neural generation models that are trained to first create a composition of the output and then generate by conditioning on it and the input.
arXiv Detail & Related papers (2022-03-28T21:24:03Z)
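As referenced in the Block Metropolis-Hastings entry above, here is a minimal sketch of Metropolis-Hastings with whole-sequence proposals. The toy target score, vocabulary, and independence proposal are illustrative placeholders; the cited sampler instead obtains candidate rewrites by iteratively prompting a large language model, but the accept/reject logic has the same shape and likewise lets the sequence length vary across steps.

```python
# Minimal sketch of Metropolis-Hastings with whole-sequence proposals.
# Placeholders: `target_logp` is a toy unnormalized score and `propose`
# draws a complete sequence from a simple independence proposal; a
# prompting-based sampler would condition the proposal on the current
# sequence, but the acceptance rule below has the same structure.

import math
import random

random.seed(0)
VOCAB = ["the", "cat", "sat", "mat", "dog"]

def target_logp(seq):
    """Toy target: favors short sequences that mention 'cat'."""
    return 2.0 * seq.count("cat") - 0.5 * len(seq)  # unnormalized

def propose():
    """Independence proposal: sample a fresh sequence of random length."""
    length = random.randint(1, 6)
    return [random.choice(VOCAB) for _ in range(length)]

def proposal_logq(seq):
    # Uniform over lengths 1..6 and over the vocabulary at each position.
    return -math.log(6) - len(seq) * math.log(len(VOCAB))

def mh_step(seq):
    cand = propose()
    log_alpha = (target_logp(cand) - target_logp(seq)
                 + proposal_logq(seq) - proposal_logq(cand))
    if math.log(random.random()) < min(0.0, log_alpha):
        return cand
    return seq

seq = propose()
for _ in range(2000):
    seq = mh_step(seq)
print("final sample:", " ".join(seq))
```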