Mix and Match: Learning-free Controllable Text Generation using Energy Language Models
- URL: http://arxiv.org/abs/2203.13299v1
- Date: Thu, 24 Mar 2022 18:52:09 GMT
- Title: Mix and Match: Learning-free Controllable Text Generation using Energy Language Models
- Authors: Fatemehsadat Mireshghallah, Kartik Goyal, Taylor Berg-Kirkpatrick
- Abstract summary: We propose Mix and Match LM, a global score-based alternative for controllable text generation.
We interpret the task of controllable generation as drawing samples from an energy-based model.
We use a Metropolis-Hastings sampling scheme to sample from this energy-based model.
- Score: 33.97800741890231
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on controlled text generation has either required attribute-based
fine-tuning of the base language model (LM), or has restricted the
parameterization of the attribute discriminator to be compatible with the base
autoregressive LM. In this work, we propose Mix and Match LM, a global
score-based alternative for controllable text generation that combines
arbitrary pre-trained black-box models for achieving the desired attributes in
the generated text without involving any fine-tuning or structural assumptions
about the black-box models. We interpret the task of controllable generation as
drawing samples from an energy-based model whose energy values are a linear
combination of scores from black-box models that are separately responsible for
fluency, the control attribute, and faithfulness to any conditioning context.
We use a Metropolis-Hastings sampling scheme to sample from this energy-based
model using bidirectional context and global attribute features. We validate
the effectiveness of our approach on various controlled generation and
style-based text revision tasks by outperforming recently proposed methods that
involve extra training, fine-tuning, or restrictive assumptions over the form
of models.
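As a rough illustration of the approach described in the abstract, the sketch below implements a generic Metropolis-Hastings loop over an energy of the form E(x) = -(w1*fluency(x) + w2*attribute(x) + w3*faithfulness(x, context)), with samples drawn from p(x) proportional to exp(-E(x)). The scorer callables, the word-level symmetric proposal, and the weights are illustrative stand-ins of my own, not the paper's actual components; the paper itself proposes token-level replacements using a bidirectional masked LM.

```python
import math
import random

def energy(text, context, scorers, weights):
    """Linear combination of black-box scores; lower energy = more desirable.
    Each scorer is a callable (text, context) -> float, higher = better."""
    return -sum(w * s(text, context) for w, s in zip(weights, scorers))

def propose(text, vocabulary):
    """Toy symmetric proposal: swap one word for a random vocabulary word.
    (The paper instead resamples a masked token with a bidirectional MLM.)"""
    words = text.split()
    i = random.randrange(len(words))
    words[i] = random.choice(vocabulary)
    return " ".join(words)

def metropolis_hastings(init_text, context, scorers, weights, vocabulary, steps=2000):
    current = init_text
    current_e = energy(current, context, scorers, weights)
    for _ in range(steps):
        candidate = propose(current, vocabulary)
        candidate_e = energy(candidate, context, scorers, weights)
        # With a symmetric proposal, the MH acceptance probability reduces to
        # min(1, exp(-(E_candidate - E_current))).
        if random.random() < min(1.0, math.exp(current_e - candidate_e)):
            current, current_e = candidate, candidate_e
    return current

# Toy usage with stand-ins for the fluency / attribute / faithfulness scorers.
scorers = [
    lambda t, c: -abs(len(t.split()) - 5),  # "fluency": prefer ~5-word sentences
    lambda t, c: float("great" in t),       # "attribute": reward a target sentiment word
    lambda t, c: float(c in t),             # "faithfulness": keep the conditioning word
]
vocab = ["great", "movie", "truly", "was", "the", "acting", "plot"]
print(metropolis_hastings("the movie was fine overall", "movie",
                          scorers, [1.0, 1.0, 1.0], vocab))
```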
Related papers
- A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation [78.81021361497311]
We develop a novel Metropolis-Hastings (MH) sampler that proposes re-writes of the entire sequence in each step via iterative prompting of a large language model.
Our new sampler (a) allows for more efficient and accurate sampling from a target distribution and (b) allows generation length to be determined through the sampling procedure rather than fixed in advance.
arXiv Detail & Related papers (2023-12-07T18:30:15Z)
- Controlled Text Generation via Language Model Arithmetic [7.687678490751105]
We introduce model arithmetic, a novel inference framework for composing and biasing Large Language Models.
We show that model arithmetic allows fine-grained control of generated text while outperforming state-of-the-art on the task of toxicity reduction.
arXiv Detail & Related papers (2023-11-24T13:41:12Z)
- Audio Generation with Multiple Conditional Diffusion Model [15.250081484817324]
We propose a novel model that enhances the controllability of existing pre-trained text-to-audio models.
This approach achieves fine-grained control over the temporal order, pitch, and energy of generated audio.
arXiv Detail & Related papers (2023-08-23T06:21:46Z)
- Extrapolating Multilingual Understanding Models as Multilingual Generators [82.1355802012414]
This paper explores methods to endow multilingual understanding models with generation abilities, yielding a unified model.
We propose a Semantic-Guided Alignment-then-Denoising (SGA) approach to adapt an encoder to a multilingual generator with a small number of new parameters.
arXiv Detail & Related papers (2023-05-22T15:33:21Z)
- DiffusER: Discrete Diffusion via Edit-based Reconstruction [88.62707047517914]
DiffusER is an edit-based generative model for text based on denoising diffusion models.
It can rival autoregressive models on several tasks spanning machine translation, summarization, and style transfer.
It can also perform other varieties of generation that standard autoregressive models are not well-suited for.
arXiv Detail & Related papers (2022-10-30T16:55:23Z)
- Controllable Text Generation with Neurally-Decomposed Oracle [91.18959622763055]
We propose a framework to control auto-regressive generation models with a NeurAlly-Decomposed Oracle (NADO).
We present a closed-form optimal solution to incorporate the token-level guidance into the base model for controllable generation.
arXiv Detail & Related papers (2022-05-27T20:17:53Z)
- Controllable and Compositional Generation with Latent-Space Energy-Based Models [60.87740144816278]
Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications.
In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes.
By composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images of resolution 1024x1024.
arXiv Detail & Related papers (2021-10-21T03:31:45Z)
- PluGeN: Multi-Label Conditional Generation From Pre-Trained Models [1.4777718769290524]
PluGeN is a simple yet effective generative technique that can be used as a plugin to pre-trained generative models.
We show that PluGeN preserves the quality of backbone models while adding the ability to control the values of labeled attributes.
arXiv Detail & Related papers (2021-09-18T21:02:24Z)
- Energy-Based Models for Code Generation under Compilability Constraints [2.9176992922046923]
In this work, we pose the problem of learning to generate compilable code as constraint satisfaction.
We define an Energy-Based Model (EBM) representing a pre-trained generative model with an imposed constraint of generating only compilable sequences.
We then use the KL-Adaptive Distributional Policy Gradient algorithm to train a generative model approximating the EBM.
arXiv Detail & Related papers (2021-06-09T11:06:32Z)
- Attribute Alignment: Controlling Text Generation from Pre-trained Language Models [46.19190007510232]
We propose a simple and flexible method for controlling text generation by aligning disentangled attribute representations.
In contrast to recent efforts on training a discriminator to perturb the token level distribution for an attribute, we use the same data to learn an alignment function to guide the pre-trained, non-controlled language model to generate texts with the target attribute without changing the original language model parameters.
arXiv Detail & Related papers (2021-03-20T01:51:32Z)
- Joint Energy-based Model Training for Better Calibrated Natural Language Understanding Models [61.768082640087]
We explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders for natural language understanding tasks.
Experiments show that EBM training can help the model reach a better calibration that is competitive to strong baselines.
arXiv Detail & Related papers (2021-01-18T01:41:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.