Attribute Alignment: Controlling Text Generation from Pre-trained
Language Models
- URL: http://arxiv.org/abs/2103.11070v1
- Date: Sat, 20 Mar 2021 01:51:32 GMT
- Title: Attribute Alignment: Controlling Text Generation from Pre-trained
Language Models
- Authors: Dian Yu, Kenji Sagae, Zhou Yu
- Abstract summary: We propose a simple and flexible method for controlling text generation by aligning disentangled attribute representations.
In contrast to recent efforts on training a discriminator to perturb the token level distribution for an attribute, we use the same data to learn an alignment function to guide the pre-trained, non-controlled language model to generate texts with the target attribute without changing the original language model parameters.
- Score: 46.19190007510232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models benefit from training with a large amount of unlabeled
text, which gives them increasingly fluent and diverse generation capabilities.
However, using these models for text generation that takes into account target
attributes, such as sentiment polarity or specific topics, remains a challenge.
We propose a simple and flexible method for controlling text generation by
aligning disentangled attribute representations. In contrast to recent efforts
on training a discriminator to perturb the token level distribution for an
attribute, we use the same data to learn an alignment function to guide the
pre-trained, non-controlled language model to generate texts with the target
attribute without changing the original language model parameters. We evaluate
our method on sentiment- and topic-controlled generation, and show large
performance gains over previous methods while retaining fluency and diversity.
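The abstract describes learning an alignment function that maps attribute representations into the space of a frozen pre-trained language model, so that only the alignment parameters are trained. The sketch below is one plausible reading of that idea, not the paper's exact architecture: a small linear alignment map projects a discrete attribute embedding into the LM's input-embedding space, and the result is prepended to the token embeddings as a soft prefix. All names and dimensions here are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch (NOT the paper's exact architecture): a frozen
# "language model" embedding table plus an alignment matrix W that maps a
# discrete attribute embedding into the LM's input-embedding space.
# Only W would be trained; the LM weights stay fixed, matching the claim
# of controlling generation without changing LM parameters.

rng = np.random.default_rng(0)

d_attr, d_model = 4, 8
lm_embeddings = rng.standard_normal((100, d_model))   # frozen LM token embeddings
attr_embeddings = {
    "positive": rng.standard_normal(d_attr),
    "negative": rng.standard_normal(d_attr),
}

W = rng.standard_normal((d_attr, d_model)) * 0.1      # learned alignment function


def build_prefix(attribute):
    """Map an attribute embedding into the LM space as one soft prefix vector."""
    return attr_embeddings[attribute] @ W


def conditioned_input(attribute, token_ids):
    """Prepend the aligned attribute vector to the token embedding sequence."""
    prefix = build_prefix(attribute)[None, :]          # shape (1, d_model)
    tokens = lm_embeddings[token_ids]                  # shape (len, d_model)
    return np.concatenate([prefix, tokens], axis=0)


seq = conditioned_input("positive", [5, 17, 42])
print(seq.shape)  # (4, 8): one aligned prefix vector + three token embeddings
```

In a real setup the frozen LM would attend over the prefix during generation, and `W` (or a deeper alignment network) would be optimized on attribute-labeled data while the LM gradients are disabled.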
Related papers
- Diffusion Guided Language Modeling [28.819061884362792]
For many applications it is desirable to control attributes, such as sentiment, of the generated language.
For auto-regressive language models, existing guidance methods are prone to decoding errors that cascade during generation and degrade performance.
In this paper we use a guided diffusion model to produce a latent proposal that steers an auto-regressive language model to generate text with desired properties.
arXiv Detail & Related papers (2024-08-08T05:06:22Z)
- Personalized Text Generation with Fine-Grained Linguistic Control [9.668216418094316]
We focus on controlling fine-grained attributes spanning multiple linguistic dimensions.
We introduce a novel benchmark to train generative models and evaluate their ability to generate personalized text.
arXiv Detail & Related papers (2024-02-07T14:41:08Z)
- Harnessing the Plug-and-Play Controller by Prompting [12.705251690623495]
This paper introduces a novel method for flexible attribute control in text generation using pre-trained language models (PLMs).
The proposed approach aims to enhance the fluency of generated text by guiding the generation process with plug-and-play controllers (PPCs).
arXiv Detail & Related papers (2024-02-06T17:18:25Z)
- Pre-trained Language Models Do Not Help Auto-regressive Text-to-Image Generation [82.5217996570387]
We adapt a pre-trained language model for auto-regressive text-to-image generation.
We find that pre-trained language models offer limited help.
arXiv Detail & Related papers (2023-11-27T07:19:26Z)
- Successor Features for Efficient Multisubject Controlled Text Generation [48.37713738712319]
We introduce SF-GEN, which is grounded in two primary concepts: successor features (SFs) and language model rectification.
SF-GEN seamlessly integrates the two to enable dynamic steering of text generation with no need to alter the LLM's parameters.
To the best of our knowledge, our research represents the first application of successor features in text generation.
arXiv Detail & Related papers (2023-11-03T00:17:08Z)
- Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation [56.98033565736974]
We propose Curriculum-Based Self-Training (CBST) to leverage unlabeled data in a rearranged order determined by the difficulty of text generation.
Our method can outperform fine-tuning and task-adaptive pre-training methods, and achieve state-of-the-art performance in the few-shot setting of data-to-text generation.
arXiv Detail & Related papers (2022-06-06T16:11:58Z)
- Few-Shot Text Generation with Pattern-Exploiting Training [12.919486518128734]
In this paper, we show that the underlying idea can also be applied to text generation tasks.
We adapt Pattern-Exploiting Training (PET), a recently proposed few-shot approach, for finetuning generative language models on text generation tasks.
arXiv Detail & Related papers (2020-12-22T10:53:07Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
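The "Dynamic Blocking" decoding algorithm mentioned above is, at its core, a constraint that discourages the model from copying source bigrams verbatim, which pushes it toward genuine paraphrases. The sketch below is a simplified, hedged illustration of that bigram-blocking idea; the actual algorithm blocks bigrams probabilistically and operates over subword vocabularies, neither of which is modeled here.

```python
# Simplified sketch of bigram-blocking decoding in the spirit of Dynamic
# Blocking: if the decoder just emitted a token that appears in the source,
# forbid the token that follows it in the source, so the output cannot
# reproduce that source bigram verbatim. (The real algorithm applies the
# block stochastically rather than always.)

def build_block_map(source_tokens):
    """Map each source token to the set of tokens that follow it in the source."""
    block = {}
    for cur, nxt in zip(source_tokens, source_tokens[1:]):
        block.setdefault(cur, set()).add(nxt)
    return block


def filter_candidates(prev_token, candidates, block_map):
    """Drop candidate next-tokens that would copy a source bigram verbatim."""
    banned = block_map.get(prev_token, set())
    return [t for t in candidates if t not in banned]


src = ["the", "cat", "sat", "on", "the", "mat"]
blocks = build_block_map(src)
# After emitting "the", both "cat" and "mat" are blocked (they follow "the"
# in the source), so only "dog" survives:
print(filter_candidates("the", ["cat", "dog", "mat"], blocks))  # ['dog']
```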
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Controllable Text Generation with Focused Variation [71.07811310799664]
Focused-Variation Network (FVN) is a novel model to control language generation.
FVN learns disjoint discrete latent spaces for each attribute inside codebooks, which allows for both controllability and diversity.
We evaluate FVN on two text generation datasets with annotated content and style, and show state-of-the-art performance as assessed by automatic and human evaluations.
arXiv Detail & Related papers (2020-09-25T06:31:06Z)
- QURIOUS: Question Generation Pretraining for Text Generation [13.595014409069584]
We propose question generation as a pretraining method, which better aligns with the text generation objectives.
Our text generation models pretrained with this method are better at understanding the essence of the input and are better language models for the target task.
arXiv Detail & Related papers (2020-04-23T08:41:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.