A Block Metropolis-Hastings Sampler for Controllable Energy-based Text
Generation
- URL: http://arxiv.org/abs/2312.04510v1
- Date: Thu, 7 Dec 2023 18:30:15 GMT
- Title: A Block Metropolis-Hastings Sampler for Controllable Energy-based Text
Generation
- Authors: Jarad Forristal, Niloofar Mireshghallah, Greg Durrett, Taylor
Berg-Kirkpatrick
- Abstract summary: We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
- Score: 78.81021361497311
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has shown that energy-based language modeling is an effective
framework for controllable text generation because it enables flexible
integration of arbitrary discriminators. However, because energy-based LMs are
globally normalized, approximate techniques like Metropolis-Hastings (MH) are
required for inference. Past work has largely explored simple proposal
distributions that modify a single token at a time, like in Gibbs sampling. In
this paper, we develop a novel MH sampler that, in contrast, proposes re-writes
of the entire sequence in each step via iterative prompting of a large language
model. Our new sampler (a) allows for more efficient and accurate sampling from
a target distribution and (b) allows generation length to be determined through
the sampling procedure rather than fixed in advance, as past work has required.
We perform experiments on two controlled generation tasks, showing both
downstream performance gains and more accurate target distribution sampling in
comparison with single-token proposal techniques.
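To make the accept/reject step concrete, the following is a minimal sketch of one block MH iteration with a whole-sequence proposal, assuming hypothetical energy, propose, and proposal_logprob callables standing in for the energy-based LM and the prompted LLM; it illustrates the general technique, not the authors' released implementation.

```python
import math
import random

def block_mh_step(x, energy, propose, proposal_logprob):
    """One Metropolis-Hastings step with a whole-sequence (block) proposal.

    energy(seq)            -> float; the target distribution is p(seq) proportional to exp(-energy(seq))
    propose(seq)           -> candidate sequence, e.g. an LLM rewrite of seq
    proposal_logprob(a, b) -> log q(a | b), log-probability of proposing a given b
    All three callables are hypothetical stand-ins, not the paper's API.
    """
    x_new = propose(x)
    # Log acceptance ratio for an asymmetric proposal:
    #   log alpha = [E(x) - E(x')] + log q(x | x') - log q(x' | x)
    log_alpha = (energy(x) - energy(x_new)
                 + proposal_logprob(x, x_new)
                 - proposal_logprob(x_new, x))
    if math.log(random.random()) < min(0.0, log_alpha):
        return x_new, True   # accept: the chain moves to the rewritten sequence
    return x, False          # reject: keep the current sequence
```

Because the proposal rewrites the whole sequence, an accepted candidate may differ in length from the current state, which is how generation length can be determined by the sampling procedure itself.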
Related papers
- FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling [59.8051705468084]
Speculative sampling has emerged as an important technique for accelerating the auto-regressive generation process of large language models.
We present FR-Spec, a frequency-ranked speculative sampling framework that optimizes draft candidate selection through vocabulary space compression.
arXiv Detail & Related papers (2025-02-20T18:58:10Z)
- On the Query Complexity of Verifier-Assisted Language Generation [35.43462431990329]
We develop a framework for reasoning about constrained generation using a pre-trained language model generator oracle.
Access to a verifier can render an intractable problem (information-theoretically or computationally) tractable.
We show even simple algorithms, like tokenwise rejection sampling, can enjoy significant benefits from access to a verifier.
arXiv Detail & Related papers (2025-02-17T18:46:32Z)
- Principled Gradient-based Markov Chain Monte Carlo for Text Generation [77.46654898866291]
We propose several faithful gradient-based sampling algorithms to sample from the target energy-based text distribution correctly.
We demonstrate that faithful samplers are able to generate more fluent text while adhering to the control objectives better.
arXiv Detail & Related papers (2023-12-29T18:00:56Z)
- Learning Sampling Distributions for Model Predictive Control [36.82905770866734]
Sampling-based methods have become a cornerstone of contemporary approaches to Model Predictive Control (MPC).
We propose to carry out all operations in the latent space, allowing us to take full advantage of the learned distribution.
Specifically, we frame the learning problem as bi-level optimization and show how to train the controller with backpropagation-through-time.
arXiv Detail & Related papers (2022-12-05T20:35:36Z)
- Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models [65.52639709094963]
Methods such as beam search and Gumbel top-k sampling can guarantee a different output for each element of the beam, but are not easy to parallelize.
We present a framework for sampling according to an arithmetic code book implicitly defined by a large language model.
arXiv Detail & Related papers (2022-10-18T22:19:41Z)
- Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs [3.491202838583993]
Energy-Based Models (EBMs) allow for extremely flexible specifications of probability distributions.
However, they do not provide a mechanism for obtaining exact samples from these distributions.
We propose a new approximate sampling technique, Quasi Rejection Sampling (QRS), that allows for a trade-off between sampling efficiency and sampling quality.
arXiv Detail & Related papers (2021-12-10T17:51:37Z)
- Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by reparameterizing the Markov chains into the latent space of the generator.
Empirically, extensive experiments on synthetic and real datasets demonstrate that our REP-GAN largely improves the sample efficiency and obtains better sample quality simultaneously.
arXiv Detail & Related papers (2021-07-01T10:34:55Z)
- Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings [57.133639209759615]
We interpret masked language models as energy-based sequence models and propose two energy parametrizations derivable from the trained models.
We develop a tractable sampling scheme based on the Metropolis-Hastings Monte Carlo algorithm.
We validate the effectiveness of the proposed parametrizations by exploring the quality of samples drawn from these energy-based models.
arXiv Detail & Related papers (2021-06-04T22:04:30Z)
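For contrast with the block proposals above, the masked-LM approach in the last entry can be sketched as a single-token MH step: mask one position, propose a replacement from the MLM conditional, and accept or reject against the sequence-level energy. This is a minimal illustration under assumed helpers (energy, mlm_conditional), not the code of any cited paper.

```python
import math
import random

def mlm_mh_step(tokens, energy, mlm_conditional):
    """One single-position MH step in the style of masked-LM energy samplers.

    energy(tokens)             -> float; the target is p(tokens) proportional to exp(-energy(tokens))
    mlm_conditional(tokens, i) -> dict mapping token -> probability at position i,
                                  computed with position i masked out
    Both callables are assumed placeholders for a trained MLM and an energy parametrization.
    """
    i = random.randrange(len(tokens))
    dist = mlm_conditional(tokens, i)          # proposal distribution over position i
    old_tok = tokens[i]
    new_tok = random.choices(list(dist), weights=list(dist.values()), k=1)[0]
    proposal = tokens[:i] + [new_tok] + tokens[i + 1:]
    # Position i is masked when computing the conditional, so the forward and
    # reverse proposals share the same distribution and the correction reduces
    # to dist[old_tok] / dist[new_tok].
    log_alpha = (energy(tokens) - energy(proposal)
                 + math.log(dist[old_tok]) - math.log(dist[new_tok]))
    if math.log(random.random()) < min(0.0, log_alpha):
        return proposal   # accept the single-token edit
    return tokens         # keep the current sequence
```

Unlike the block sampler sketched earlier, each step edits a single position, so the sequence length must be fixed in advance.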
This list is automatically generated from the titles and abstracts of the papers on this site.