A Block Metropolis-Hastings Sampler for Controllable Energy-based Text
Generation
- URL: http://arxiv.org/abs/2312.04510v1
- Date: Thu, 7 Dec 2023 18:30:15 GMT
- Title: A Block Metropolis-Hastings Sampler for Controllable Energy-based Text
Generation
- Authors: Jarad Forristal, Niloofar Mireshghallah, Greg Durrett, Taylor
Berg-Kirkpatrick
- Abstract summary: We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
- Score: 78.81021361497311
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has shown that energy-based language modeling is an effective
framework for controllable text generation because it enables flexible
integration of arbitrary discriminators. However, because energy-based LMs are
globally normalized, approximate techniques like Metropolis-Hastings (MH) are
required for inference. Past work has largely explored simple proposal
distributions that modify a single token at a time, like in Gibbs sampling. In
this paper, we develop a novel MH sampler that, in contrast, proposes re-writes
of the entire sequence in each step via iterative prompting of a large language
model. Our new sampler (a) allows for more efficient and accurate sampling from
a target distribution and (b) allows generation length to be determined through
the sampling procedure rather than fixed in advance, as past work has required.
We perform experiments on two controlled generation tasks, showing both
downstream performance gains and more accurate target distribution sampling in
comparison with single-token proposal techniques.
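To make the mechanics concrete, here is a minimal sketch of the block MH step described in the abstract. The helpers `energy`, `propose_rewrite`, and `proposal_logprob` are hypothetical stand-ins for the energy-based model and the LLM rewrite proposal, not the paper's actual API.

```python
import math
import random

def mh_step(seq, energy, propose_rewrite, proposal_logprob):
    """One block Metropolis-Hastings step: propose a full-sequence rewrite
    and accept or reject it against the energy-based target.

    energy(s)                -> float, E(s); target is p(s) proportional to exp(-E(s))
    propose_rewrite(s)       -> new sequence drawn from the LLM proposal q(.|s)
    proposal_logprob(s2, s1) -> float, log q(s2 | s1)
    (All callables are hypothetical stand-ins, not the paper's API.)
    """
    proposal = propose_rewrite(seq)
    # log acceptance ratio: [-E(x') + log q(x|x')] - [-E(x) + log q(x'|x)]
    log_alpha = (-energy(proposal) + proposal_logprob(seq, proposal)) \
              - (-energy(seq) + proposal_logprob(proposal, seq))
    if math.log(random.random()) < min(0.0, log_alpha):
        return proposal  # accept the rewrite
    return seq           # reject: keep the current sequence

def sample(init_seq, energy, propose_rewrite, proposal_logprob, n_steps=100):
    seq = init_seq
    for _ in range(n_steps):
        seq = mh_step(seq, energy, propose_rewrite, proposal_logprob)
    return seq
```

Because each proposal rewrites the entire sequence, the chain can move between sequences of different lengths, which is how generation length ends up determined by the sampling procedure itself.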
Related papers
- Principled Gradient-based Markov Chain Monte Carlo for Text Generation [77.46654898866291]
We propose several faithful gradient-based sampling algorithms to sample from the target energy-based text distribution correctly.
We demonstrate that faithful samplers are able to generate more fluent text while adhering to the control objectives better.
arXiv Detail & Related papers (2023-12-29T18:00:56Z)
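The entry above does not spell out its algorithms, so the following is a hedged sketch of a related gradient-informed sampler, Gibbs-with-Gradients (Grathwohl et al., 2021), which uses a first-order Taylor estimate of energy changes to propose token flips and corrects with an exact MH test; the `energy` callable over one-hot token tensors is an assumed interface, not that paper's code.

```python
import torch
import torch.nn.functional as F

def gwg_step(x, energy, temp=2.0):
    """One Gibbs-with-Gradients-style MH step over a one-hot sequence.
    x: (L, V) one-hot float tensor; energy: differentiable (L, V) -> scalar,
    with target p(x) proportional to exp(-E(x)). A hedged sketch only."""
    V = x.shape[1]

    def flip_logits(y):
        yg = y.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(yg), yg)[0]
        # First-order estimate of E(flip) - E(y) for every (position, token) flip.
        delta = grad - (grad * y.detach()).sum(-1, keepdim=True)
        return (-delta / temp).flatten()

    fwd = flip_logits(x)
    idx = torch.distributions.Categorical(logits=fwd).sample()
    i, v = idx // V, idx % V
    old_v = x[i].argmax()
    x_new = x.clone()
    x_new[i] = 0.0
    x_new[i, v] = 1.0

    rev = flip_logits(x_new)
    log_alpha = (energy(x) - energy(x_new)               # target ratio
                 + F.log_softmax(rev, 0)[i * V + old_v]  # reverse proposal
                 - F.log_softmax(fwd, 0)[idx])           # forward proposal
    if torch.rand(()).log() < log_alpha.clamp(max=0.0):
        return x_new
    return x
```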
- Learning Sampling Distributions for Model Predictive Control [36.82905770866734]
Sampling-based approaches have become a cornerstone of contemporary Model Predictive Control (MPC).
We propose to carry out all operations in the latent space, allowing us to take full advantage of the learned distribution.
Specifically, we frame the learning problem as bi-level optimization and show how to train the controller with backpropagation-through-time.
arXiv Detail & Related papers (2022-12-05T20:35:36Z)
- Arithmetic Sampling: Parallel Diverse Decoding for Large Language Models [65.52639709094963]
Methods such as beam search and Gumbel top-k sampling can guarantee a different output for each element of the beam, but are not easy to parallelize.
We present a framework for sampling according to an arithmetic code book implicitly defined by a large language model.
arXiv Detail & Related papers (2022-10-18T22:19:41Z)
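The arithmetic-sampling entry above maps each sample to a code point in the unit interval; below is a minimal sketch of that idea, with `next_token_probs` as a hypothetical stand-in for the language model's per-step distribution.

```python
import numpy as np

def decode_from_code(u, next_token_probs, eos_id, max_len=50):
    """Decode one sequence from a code point u in [0, 1): at each step the
    unit interval is partitioned by the model's token distribution, the token
    whose sub-interval contains u is emitted, and u is rescaled into that
    sub-interval. `next_token_probs(prefix) -> np.ndarray over the vocab`
    is a hypothetical stand-in for the language model."""
    prefix = []
    for _ in range(max_len):
        probs = next_token_probs(prefix)
        cdf = np.cumsum(probs)
        tok = int(np.searchsorted(cdf, u, side="right"))
        lo = cdf[tok - 1] if tok > 0 else 0.0
        u = (u - lo) / max(probs[tok], 1e-12)  # rescale into the chosen interval
        prefix.append(tok)
        if tok == eos_id:
            break
    return prefix

# Evenly spaced code points give diverse samples and decode independently,
# so the whole batch is trivially parallel.
codes = (np.arange(8) + 0.5) / 8
# samples = [decode_from_code(u, next_token_probs, eos_id) for u in codes]
```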
- Sampling from Discrete Energy-Based Models with Quality/Efficiency Trade-offs [3.491202838583993]
Energy-Based Models (EBMs) allow for extremely flexible specifications of probability distributions.
However, they do not provide a mechanism for obtaining exact samples from these distributions.
We propose a new approximate sampling technique, Quasi Rejection Sampling (QRS), that allows for a trade-off between sampling efficiency and sampling quality.
arXiv Detail & Related papers (2021-12-10T17:51:37Z)
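The QRS entry above trades acceptance rate against fidelity through a single scalar; here is a minimal sketch under that description, with all callables as hypothetical stand-ins. Setting `log_beta` at least as large as the maximum of log p̃(x) - log q(x) recovers exact rejection sampling; smaller values accept more often but sample less faithfully.

```python
import math
import random

def qrs_sample(n, proposal_sample, proposal_logprob, target_logprob_unnorm, log_beta):
    """Quasi Rejection Sampling sketch: draw x from the proposal q and accept
    with probability min(1, p_tilde(x) / (beta * q(x))). A finite beta trades
    sampling efficiency (acceptance rate) against sampling quality."""
    accepted = []
    while len(accepted) < n:
        x = proposal_sample()
        log_accept = target_logprob_unnorm(x) - proposal_logprob(x) - log_beta
        if math.log(random.random()) < min(0.0, log_accept):
            accepted.append(x)
    return accepted
```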
- Controllable and Compositional Generation with Latent-Space Energy-Based Models [60.87740144816278]
Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications.
In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes.
By composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images of resolution 1024x1024.
arXiv Detail & Related papers (2021-10-21T03:31:45Z)
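The latent-space EBM entry above composes attribute energies with logical operators. The sketch below shows the composition rules standard in the EBM literature (product of experts for AND, mixture for OR, tempered negation for NOT); that paper's exact operators may differ in detail.

```python
import torch

# Logical composition of attribute energies (lower energy = more probable,
# p(x) proportional to exp(-E(x))). A hedged sketch of the standard rules.

def e_and(e1, e2):
    # Conjunction: product of experts, p1 * p2  =>  energies add.
    return e1 + e2

def e_or(e1, e2):
    # Disjunction: mixture, p1 + p2  =>  soft-min of energies.
    return -torch.logsumexp(torch.stack([-e1, -e2]), dim=0)

def e_not(e, temperature=1.0):
    # Negation: flip the sign (usually tempered, since -E alone is improper).
    return -e / temperature
```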
- COAST: COntrollable Arbitrary-Sampling NeTwork for Compressive Sensing [27.870537087888334]
We propose a novel arbitrary-sampling network, dubbed COAST, that solves arbitrary-sampling problems (including unseen sampling matrices) with a single model.
COAST achieves state-of-the-art performance at high speed.
arXiv Detail & Related papers (2021-07-15T10:05:00Z)
- Reparameterized Sampling for Generative Adversarial Networks [71.30132908130581]
We propose REP-GAN, a novel sampling method that allows general dependent proposals by reparameterizing the Markov chains into the latent space of the generator.
Empirically, extensive experiments on synthetic and real datasets demonstrate that our REP-GAN largely improves the sample efficiency and obtains better sample quality simultaneously.
arXiv Detail & Related papers (2021-07-01T10:34:55Z)
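REP-GAN's reparameterized dependent proposal is more involved than the summary above reveals, so the sketch below shows only the simpler latent-space MH loop it builds on, with a discriminator logit used as a density-ratio estimate (as in MH-GAN-style samplers); `generator` and `disc_logit` are assumed interfaces.

```python
import torch

def latent_mh(generator, disc_logit, z, n_steps=200, step_size=0.1):
    """MH over a GAN's latent space (hedged sketch in the spirit of
    MH-GAN-style samplers, not REP-GAN's exact proposal). `disc_logit(x)`
    is a hypothetical calibrated discriminator whose logit estimates
    log p_data(x) - log p_g(x); the latent prior is standard Gaussian."""
    for _ in range(n_steps):
        z_new = z + step_size * torch.randn_like(z)  # symmetric random walk
        # Target over z: prior(z) * ratio(G(z)); the proposal is symmetric,
        # so the log acceptance ratio is the ratio estimate plus prior terms.
        log_alpha = (disc_logit(generator(z_new)) - disc_logit(generator(z))
                     - 0.5 * (z_new.pow(2).sum() - z.pow(2).sum()))
        if torch.rand(()).log() < log_alpha.clamp(max=0.0):
            z = z_new
    return generator(z)
```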
- Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis--Hastings [57.133639209759615]
We interpret masked language models as energy-based sequence models and propose two energy parametrizations derivable from the trained models.
We develop a tractable sampling scheme based on the Metropolis-Hastings Monte Carlo algorithm.
We validate the effectiveness of the proposed parametrizations by exploring the quality of samples drawn from these energy-based models.
arXiv Detail & Related papers (2021-06-04T22:04:30Z)
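The last entry derives energies from a trained masked LM and samples with MH. Below is a hedged sketch of a single-token, mask-and-resample MH step of that flavor, which is also exactly the style of single-token proposal that the main paper's block sampler improves on; `energy` and `mlm_conditional` are hypothetical stand-ins.

```python
import math
import random

def mlm_mh_step(seq, energy, mlm_conditional):
    """One single-token MH step targeting p(x) proportional to exp(-E(x)),
    with E derived from a masked LM (e.g., negative pseudo-log-likelihood)
    and a mask-and-resample proposal. `mlm_conditional(s, i)` returns a dict
    {token: prob} over the vocabulary for position i with s[i] masked; both
    callables are hypothetical stand-ins for the paper's scheme."""
    i = random.randrange(len(seq))
    probs = mlm_conditional(seq, i)  # MLM fill-in distribution at position i
    toks = list(probs)
    new_tok = random.choices(toks, weights=[probs[t] for t in toks])[0]
    proposal = list(seq)
    proposal[i] = new_tok
    # seq and proposal share the masked context x_{-i}, so the forward and
    # reverse proposals are both evaluated under the same conditional `probs`.
    log_alpha = (energy(seq) - energy(proposal)
                 + math.log(probs[seq[i]]) - math.log(probs[new_tok]))
    if math.log(random.random()) < min(0.0, log_alpha):
        return proposal
    return seq
```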