Successor Features for Efficient Multisubject Controlled Text Generation
- URL: http://arxiv.org/abs/2311.04921v1
- Date: Fri, 3 Nov 2023 00:17:08 GMT
- Title: Successor Features for Efficient Multisubject Controlled Text Generation
- Authors: Meng Cao, Mehdi Fatemi, Jackie Chi Kit Cheung, Samira Shabanian
- Abstract summary: We introduce SF-GEN, which is grounded in two primary concepts: successor features (SFs) and language model rectification.
SF-GEN seamlessly integrates the two to enable dynamic steering of text generation with no need to alter the LLM's parameters.
To the best of our knowledge, our research represents the first application of successor features in text generation.
- Score: 48.37713738712319
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While large language models (LLMs) have achieved impressive performance in
generating fluent and realistic text, controlling the generated text so that it
exhibits properties such as safety, factuality, and non-toxicity remains
challenging. Existing
decoding-based methods are static in terms of the dimension of control; if the
target subject is changed, they require new training. Moreover, it can quickly
become prohibitive to concurrently control multiple subjects. In this work, we
introduce SF-GEN, which is grounded in two primary concepts: successor features
(SFs) to decouple the LLM's dynamics from task-specific rewards, and language
model rectification to proportionally adjust the probability of selecting a
token based on the likelihood that the finished text becomes undesired. SF-GEN
seamlessly integrates the two to enable dynamic steering of text generation
with no need to alter the LLM's parameters. Thanks to the decoupling effect
induced by successor features, our method proves to be memory-wise and
computationally efficient for training as well as decoding, especially when
dealing with multiple target subjects. To the best of our knowledge, our
research represents the first application of successor features in text
generation. In addition to its computational efficiency, the resultant language
produced by our method is comparable to the SOTA (and outperforms baselines) in
both control measures and language quality, which we demonstrate through
a series of experiments in various controllable text generation tasks.
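
The abstract describes two mechanisms: successor features (SFs) that decouple the LLM's dynamics from task-specific rewards, and rectification that downweights a token in proportion to the estimated probability that the finished text becomes undesired. A minimal PyTorch sketch of how these could combine at decoding time follows; the function name, the linear SF-times-weights estimator, and the toy dimensions are illustrative assumptions, not the authors' implementation.

import torch

def rectified_next_token_probs(base_probs, sf_matrix, reward_weights):
    """Reweight next-token probabilities in the spirit of SF-GEN (toy sketch).

    base_probs:     (V,) next-token probabilities from the frozen LLM.
    sf_matrix:      (V, d) successor features psi(s, a), one row per
                    candidate token a in the current context s.
    reward_weights: (d,) task-specific weights w, so psi(s, a) @ w
                    approximates how likely the finished text is to
                    end up undesired after choosing token a.
    """
    # Estimated probability that the completed text becomes undesired.
    p_undesired = torch.clamp(sf_matrix @ reward_weights, 0.0, 1.0)
    # Rectification: keep each token in proportion to the chance the
    # continuation stays desirable, then renormalize.
    adjusted = base_probs * (1.0 - p_undesired)
    return adjusted / adjusted.sum()

# Toy example: vocabulary of 5 tokens, 3-dimensional features.
base = torch.softmax(torch.randn(5), dim=0)
psi = torch.rand(5, 3)                        # stand-in successor features
w_toxicity = torch.tensor([0.9, 0.0, 0.1])    # one subject's weight vector
print(rectified_next_token_probs(base, psi, w_toxicity))

Because the successor features do not depend on the reward weights, steering toward a new subject only requires a new weight vector, and several subjects can share one SF model; this is the source of the memory and compute savings the abstract claims.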
Related papers
- Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
- Harnessing the Plug-and-Play Controller by Prompting [12.705251690623495]
This paper introduces a novel method for flexible attribute control in text generation using pre-trained language models (PLMs).
The proposed approach aims to enhance the fluency of generated text by guiding the generation process with plug-and-play controllers (PPCs).
arXiv Detail & Related papers (2024-02-06T17:18:25Z)
- A Simple yet Efficient Ensemble Approach for AI-generated Text Detection [0.5840089113969194]
Large Language Models (LLMs) have demonstrated remarkable capabilities in generating text that closely resembles human writing.
It is essential to build automated approaches capable of distinguishing between artificially generated text and human-authored text.
We propose a simple yet efficient solution by ensembling predictions from multiple constituent LLMs.
arXiv Detail & Related papers (2023-11-06T13:11:02Z)
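
A minimal sketch of the ensembling idea from the detection paper above: average the machine-generated-text probabilities produced by several constituent detectors and threshold the mean. The detectors here are placeholder callables, not the paper's actual models.

from typing import Callable, List

def ensemble_ai_text_score(text: str,
                           detectors: List[Callable[[str], float]]) -> float:
    """Average the probabilities that `text` is machine-generated,
    as judged by several constituent detectors (illustrative)."""
    scores = [d(text) for d in detectors]
    return sum(scores) / len(scores)

# Hypothetical detectors, each returning P(machine-generated).
detectors = [
    lambda t: 0.8,   # stand-in for an LLM-based perplexity detector
    lambda t: 0.6,   # stand-in for a fine-tuned classifier
]
label = "machine" if ensemble_ai_text_score("some text", detectors) > 0.5 else "human"
print(label)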
- KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation [24.47531522553703]
We propose KEST, a novel and efficient self-training framework to handle these problems.
KEST utilizes a kernel-based loss, rather than standard cross entropy, to learn from the soft pseudo text produced by a shared non-autoregressive generator.
Experiments on three controllable generation tasks demonstrate that KEST significantly improves control accuracy while maintaining comparable text fluency and generation diversity against several strong baselines.
arXiv Detail & Related papers (2023-06-17T19:40:57Z)
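
The KEST summary above replaces standard cross-entropy with a kernel-based distance to the generator's soft pseudo-text. The paper's exact loss is not reproduced here; as a stand-in, the sketch below computes a squared maximum mean discrepancy (MMD) with an RBF kernel between batches of predicted and pseudo-label distributions.

import torch

def rbf_kernel(x, y, gamma=1.0):
    # k(x, y) = exp(-gamma * ||x - y||^2), computed pairwise.
    d2 = torch.cdist(x, y).pow(2)
    return torch.exp(-gamma * d2)

def mmd_loss(pred, soft_target, gamma=1.0):
    """Squared MMD between predicted distributions and soft pseudo-labels.
    pred, soft_target: (N, V) rows of probability vectors."""
    k_pp = rbf_kernel(pred, pred, gamma).mean()
    k_tt = rbf_kernel(soft_target, soft_target, gamma).mean()
    k_pt = rbf_kernel(pred, soft_target, gamma).mean()
    return k_pp + k_tt - 2.0 * k_pt

N, V = 8, 100
pred = torch.softmax(torch.randn(N, V), dim=-1)
soft = torch.softmax(torch.randn(N, V), dim=-1)  # toy soft pseudo text
print(mmd_loss(pred, soft))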
- Controlled Text Generation with Hidden Representation Transformations [12.576140288264835]
CHRT steers large language models to generate text pertaining to certain attributes (such as toxicity).
We employ a contrastive-learning framework to learn these transformations, which can be combined to gain multi-attribute control.
CHRT outperforms all the baselines in the tasks of detoxification, positive sentiment steering, and text simplification.
arXiv Detail & Related papers (2023-05-30T17:21:17Z)
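
A rough sketch of the CHRT mechanism as summarized above: small learned transformations applied to a frozen model's hidden states, with several attribute-specific blocks combined for multi-attribute control. The residual MLP block and the averaging rule are assumptions for illustration, not the paper's architecture.

import torch
import torch.nn as nn

class AttributeTransform(nn.Module):
    """One learned hidden-representation transformation (illustrative)."""
    def __init__(self, hidden_size: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.Linear(hidden_size, hidden_size),
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.block(h)  # residual, so identity is easy to learn

def apply_multi_attribute(h, transforms):
    # Combine attribute controls by averaging their transformed states.
    return torch.stack([t(h) for t in transforms]).mean(dim=0)

h = torch.randn(1, 10, 768)  # (batch, seq, hidden) from a frozen LM layer
controls = [AttributeTransform(768), AttributeTransform(768)]
print(apply_multi_attribute(h, controls).shape)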
- The Whole Truth and Nothing But the Truth: Faithful and Controllable Dialogue Response Generation with Dataflow Transduction and Constrained Decoding [65.34601470417967]
We describe a hybrid architecture for dialogue response generation that combines the strengths of neural language modeling and rule-based generation.
Our experiments show that this system outperforms both rule-based and learned approaches in human evaluations of fluency, relevance, and truthfulness.
arXiv Detail & Related papers (2022-09-16T09:00:49Z)
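
The constrained-decoding half of the hybrid system above can be illustrated with logit masking: at each step the rule-based component supplies the set of admissible next tokens, and all other tokens are suppressed before selection. The admissible set below is a placeholder for what a dataflow grammar would license.

import torch

def constrained_step(logits: torch.Tensor, allowed_ids: set) -> int:
    """Pick the next token only from tokens a rule-based system permits.
    logits: (V,) raw next-token scores (illustrative)."""
    mask = torch.full_like(logits, float("-inf"))
    idx = torch.tensor(sorted(allowed_ids))
    mask[idx] = 0.0
    return int(torch.argmax(logits + mask))

logits = torch.randn(50)
allowed = {3, 17, 42}   # e.g., tokens licensed by the rule-based grammar
print(constrained_step(logits, allowed))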
- Composable Text Controls in Latent Space with ODEs [97.12426987887021]
This paper proposes a new efficient approach for composable text operations in the compact latent space of text.
By connecting pretrained LMs to the latent space through efficient adaptation, we decode the sampled vectors into desired text sequences.
Experiments show that composing those operators within our approach manages to generate or edit high-quality text.
arXiv Detail & Related papers (2022-08-01T06:51:45Z)
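
Setting aside the ODE-based sampler, the composable-operator idea above reduces to applying edit operators to a sentence's latent code before decoding. The sketch below uses toy vector operators purely to show the composition pattern; the paper's operators are learned and sampled, not hand-written shifts.

import torch

def compose_latent_ops(z: torch.Tensor, operators) -> torch.Tensor:
    """Apply a sequence of latent-space edit operators to a text code z.
    Each operator is any callable on latent vectors (toy stand-ins)."""
    for op in operators:
        z = op(z)
    return z

z = torch.randn(64)                                     # latent code of a sentence
more_positive = lambda v: v + 0.5 * torch.ones_like(v)  # toy attribute shift
shorter = lambda v: 0.9 * v                             # toy length edit
z_edited = compose_latent_ops(z, [more_positive, shorter])
# z_edited would then be decoded back to text by the adapted LM decoder.
print(z_edited.norm())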
- Detecting Text Formality: A Study of Text Classification Approaches [78.11745751651708]
This work proposes what is, to our knowledge, the first systematic study of formality detection methods based on statistical, neural-based, and Transformer-based machine learning methods.
We conducted three types of experiments -- monolingual, multilingual, and cross-lingual.
The study shows that the Char BiLSTM model outperforms Transformer-based ones on the monolingual and multilingual formality classification tasks.
arXiv Detail & Related papers (2022-04-19T16:23:07Z)
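
For concreteness, here is a standard character-level BiLSTM classifier of the kind the formality study above found to beat Transformer-based models; the vocabulary size, dimensions, and mean-pooling are illustrative choices, not the study's exact configuration.

import torch
import torch.nn as nn

class CharBiLSTMClassifier(nn.Module):
    """Character-level BiLSTM formality classifier (illustrative sizes)."""
    def __init__(self, n_chars=256, emb=32, hidden=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(n_chars, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, char_ids: torch.Tensor) -> torch.Tensor:
        x = self.emb(char_ids)             # (B, T, emb)
        out, _ = self.lstm(x)              # (B, T, 2*hidden)
        return self.head(out.mean(dim=1))  # pool over characters

model = CharBiLSTMClassifier()
batch = torch.randint(0, 256, (4, 120))    # 4 texts, 120 chars each
print(model(batch).shape)                  # (4, 2) formal/informal logits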
- Controllable Text Generation with Focused Variation [71.07811310799664]
Focused-Variation Network (FVN) is a novel model to control language generation.
FVN learns disjoint discrete latent spaces for each attribute inside codebooks, which allows for both controllability and diversity.
We evaluate FVN on two text generation datasets with annotated content and style, and show state-of-the-art performance as assessed by automatic and human evaluations.
arXiv Detail & Related papers (2020-09-25T06:31:06Z)
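
The FVN summary above hinges on disjoint discrete latent spaces held in per-attribute codebooks. A minimal sketch of that data structure follows: one embedding table per attribute, with chosen codes concatenated into a control vector a decoder could condition on. All sizes and the concatenation rule are assumptions, not the paper's design.

import torch
import torch.nn as nn

class AttributeCodebooks(nn.Module):
    """Disjoint discrete codebooks, one per attribute (illustrative)."""
    def __init__(self, attrs, codes_per_attr=16, dim=64):
        super().__init__()
        self.books = nn.ModuleDict(
            {a: nn.Embedding(codes_per_attr, dim) for a in attrs}
        )

    def forward(self, choices: dict) -> torch.Tensor:
        # Pick one code per attribute and concatenate into a control
        # vector for the generator to condition on.
        parts = [self.books[a](torch.tensor(i)) for a, i in choices.items()]
        return torch.cat(parts)

cb = AttributeCodebooks(["content", "style"])
ctrl = cb({"content": 3, "style": 7})
print(ctrl.shape)   # (128,) control vector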
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.