Mix and Match: Learning-free Controllable Text Generation using Energy Language Models
- URL: http://arxiv.org/abs/2203.13299v1
- Date: Thu, 24 Mar 2022 18:52:09 GMT
- Title: Mix and Match: Learning-free Controllable Text Generation using Energy Language Models
- Authors: Fatemehsadat Mireshghallah, Kartik Goyal, Taylor Berg-Kirkpatrick
- Abstract summary: We propose Mix and Match LM, a global score-based alternative for controllable text generation.
We interpret the task of controllable generation as drawing samples from an energy-based model.
We use a Metropolis-Hastings sampling scheme to sample from this energy-based model.
- Score: 33.97800741890231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on controlled text generation has either required attribute-based
fine-tuning of the base language model (LM), or has restricted the
parameterization of the attribute discriminator to be compatible with the base
autoregressive LM. In this work, we propose Mix and Match LM, a global
score-based alternative for controllable text generation that combines
arbitrary pre-trained black-box models for achieving the desired attributes in
the generated text without involving any fine-tuning or structural assumptions
about the black-box models. We interpret the task of controllable generation as
drawing samples from an energy-based model whose energy values are a linear
combination of scores from black-box models that are separately responsible for
fluency, the control attribute, and faithfulness to any conditioning context.
We use a Metropolis-Hastings sampling scheme to sample from this energy-based
model using bidirectional context and global attribute features. We validate
the effectiveness of our approach on various controlled generation and
style-based text revision tasks by outperforming recently proposed methods that
involve extra training, fine-tuning, or restrictive assumptions over the form
of models.
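As a rough illustration of the approach described in the abstract, the sketch below implements a generic Metropolis-Hastings loop over an energy of the form E(x) = -(w1*fluency(x) + w2*attribute(x) + w3*faithfulness(x, context)), with samples drawn from p(x) proportional to exp(-E(x)). The scorer callables, the word-level symmetric proposal, and the weights are illustrative stand-ins of my own, not the paper's actual components; the paper itself proposes token-level replacements using a bidirectional masked LM.

```python
import math
import random

def energy(text, context, scorers, weights):
    """Linear combination of black-box scores; lower energy = more desirable.
    Each scorer is a callable (text, context) -> float, higher = better."""
    return -sum(w * s(text, context) for w, s in zip(weights, scorers))

def propose(text, vocabulary):
    """Toy symmetric proposal: swap one word for a random vocabulary word.
    (The paper instead resamples a masked token with a bidirectional MLM.)"""
    words = text.split()
    i = random.randrange(len(words))
    words[i] = random.choice(vocabulary)
    return " ".join(words)

def metropolis_hastings(init_text, context, scorers, weights, vocabulary, steps=2000):
    current = init_text
    current_e = energy(current, context, scorers, weights)
    for _ in range(steps):
        candidate = propose(current, vocabulary)
        candidate_e = energy(candidate, context, scorers, weights)
        # With a symmetric proposal, the MH acceptance probability reduces to
        # min(1, exp(-(E_candidate - E_current))).
        if random.random() < min(1.0, math.exp(current_e - candidate_e)):
            current, current_e = candidate, candidate_e
    return current

# Toy usage with stand-ins for the fluency / attribute / faithfulness scorers.
scorers = [
    lambda t, c: -abs(len(t.split()) - 5),  # "fluency": prefer ~5-word sentences
    lambda t, c: float("great" in t),       # "attribute": reward a target sentiment word
    lambda t, c: float(c in t),             # "faithfulness": keep the conditioning word
]
vocab = ["great", "movie", "truly", "was", "the", "acting", "plot"]
print(metropolis_hastings("the movie was fine overall", "movie",
                          scorers, [1.0, 1.0, 1.0], vocab))
```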
Related papers
- A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation [78.81021361497311]
We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
arXiv Detail & Related papers (2023-12-07T18:30:15Z)
- Controlled Text Generation via Language Model Arithmetic [7.687678490751105]
We introduce model arithmetic, a novel inference framework for composing and biasing Large Language Models.
We show that model arithmetic allows fine-grained control of generated text while outperforming state-of-the-art on the task of toxicity reduction.
arXiv Detail & Related papers (2023-11-24T13:41:12Z)
- Audio Generation with Multiple Conditional Diffusion Model [15.250081484817324]
We propose a novel model that enhances the controllability of existing pre-trained text-to-audio models.
This approach achieves fine-grained control over the temporal order, pitch, and energy of generated audio.
arXiv Detail & Related papers (2023-08-23T06:21:46Z)
- Extrapolating Multilingual Understanding Models as Multilingual Generators [82.1355802012414]
This paper explores methods to endow multilingual understanding models with generation abilities, yielding a unified model.
We propose a Semantic-Guided Alignment-then-Denoising (SGA) approach to adapt an encoder to a multilingual generator with a small number of new parameters.
arXiv Detail & Related papers (2023-05-22T15:33:21Z)
- DiffusER: Discrete Diffusion via Edit-based Reconstruction [88.62707047517914]
DiffusER is an edit-based generative model for text based on denoising diffusion models.
It can rival autoregressive models on several tasks spanning machine translation, summarization, and style transfer.
It can also perform other varieties of generation that standard autoregressive models are not well-suited for.
arXiv Detail & Related papers (2022-10-30T16:55:23Z)
- Controllable Text Generation with Neurally-Decomposed Oracle [91.18959622763055]
We propose a framework to control auto-regressive generation models with a NeurAlly-Decomposed Oracle (NADO).
We present a closed-form optimal solution to incorporate the token-level guidance into the base model for controllable generation.
arXiv Detail & Related papers (2022-05-27T20:17:53Z)
- Controllable and Compositional Generation with Latent-Space Energy-Based Models [60.87740144816278]
Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications.
In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes.
By composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images of resolution 1024x1024.
arXiv Detail & Related papers (2021-10-21T03:31:45Z)
- PluGeN: Multi-Label Conditional Generation From Pre-Trained Models [1.4777718769290524]
PluGeN is a simple yet effective generative technique that can be used as a plugin to pre-trained generative models.
We show that PluGeN preserves the quality of backbone models while adding the ability to control the values of labeled attributes.
arXiv Detail & Related papers (2021-09-18T21:02:24Z)
- Energy-Based Models for Code Generation under Compilability Constraints [2.9176992922046923]
In this work, we pose the problem of learning to generate compilable code as constraint satisfaction.
We define an Energy-Based Model (EBM) representing a pre-trained generative model with an imposed constraint of generating only compilable sequences.
We then use the KL-Adaptive Distributional Policy Gradient algorithm to train a generative model approximating the EBM.
arXiv Detail & Related papers (2021-06-09T11:06:32Z)
- Attribute Alignment: Controlling Text Generation from Pre-trained Language Models [46.19190007510232]
We propose a simple and flexible method for controlling text generation by aligning disentangled attribute representations.
In contrast to recent efforts on training a discriminator to perturb the token level distribution for an attribute, we use the same data to learn an alignment function to guide the pre-trained, non-controlled language model to generate texts with the target attribute without changing the original language model parameters.
arXiv Detail & Related papers (2021-03-20T01:51:32Z)
- Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models [61.768082640087]
We explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders for natural language understanding tasks.
Experiments show that EBM training can help the model reach a better calibration that is competitive to strong baselines.
arXiv Detail & Related papers (2021-01-18T01:41:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.