Controlled Text Generation with Hidden Representation Transformations
- URL: http://arxiv.org/abs/2305.19230v2
- Date: Wed, 31 May 2023 17:27:03 GMT
- Title: Controlled Text Generation with Hidden Representation Transformations
- Authors: Vaibhav Kumar, Hana Koorehdavoudi, Masud Moshtaghi, Amita Misra, Ankit
Chadha, Emilio Ferrara
- Abstract summary: CHRT steers large language models to generate text pertaining to certain attributes (such as toxicity).
We employ a contrastive-learning framework to learn these transformations that can be combined to gain multi-attribute control.
CHRT outperforms all the baselines in the tasks of detoxification, positive sentiment steering, and text simplification.
- Score: 12.576140288264835
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We propose CHRT (Control Hidden Representation Transformation) - a controlled
language generation framework that steers large language models to generate
text pertaining to certain attributes (such as toxicity). CHRT gains attribute
control by modifying the hidden representation of the base model through
learned transformations. We employ a contrastive-learning framework to learn
these transformations that can be combined to gain multi-attribute control. The
effectiveness of CHRT is experimentally shown by comparing it with seven
baselines over three attributes. CHRT outperforms all the baselines in the tasks
of detoxification, positive sentiment steering, and text simplification while
minimizing the loss in linguistic qualities. Further, our approach adds the
lowest inference latency, only 0.01 seconds more than the base model, making
it the most suitable for high-performance production environments. We
open-source our code and release two novel datasets to further propel
controlled language generation research.
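The released code is not reproduced here; as a rough illustration of the idea described above (a learned transformation applied to the base model's hidden states, trained with a contrastive objective and composable for multi-attribute control), the following is a minimal PyTorch-style sketch. The module name HiddenTransform, the residual-MLP form, and the triplet-margin loss are assumptions made for illustration, not the paper's actual architecture or objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HiddenTransform(nn.Module):
    # Hypothetical transformation block: a residual MLP over the base LM's
    # hidden states, so the steered representation stays close to the original.
    def __init__(self, hidden_size: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.Linear(hidden_size, hidden_size),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.mlp(hidden_states)

def contrastive_attribute_loss(h_steered: torch.Tensor,
                               h_positive: torch.Tensor,
                               h_negative: torch.Tensor,
                               margin: float = 1.0) -> torch.Tensor:
    # Triplet-style objective: pull the transformed states toward
    # attribute-positive reference states (e.g., non-toxic continuations)
    # and push them away from attribute-negative ones.
    d_pos = F.pairwise_distance(h_steered, h_positive)
    d_neg = F.pairwise_distance(h_steered, h_negative)
    return F.relu(d_pos - d_neg + margin).mean()

# Multi-attribute control would compose several learned transforms, e.g.:
#   h = detox_transform(h)
#   h = sentiment_transform(h)
# before passing h to the (frozen) language-model head.
```

In such a setup, h_positive and h_negative would be pooled hidden states from attribute-labeled text, and only the small transform would be trained while the base model stays frozen, which is consistent with the low added inference latency reported in the abstract.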
Related papers
- Detecting Machine-Generated Long-Form Content with Latent-Space Variables [54.07946647012579]
Existing zero-shot detectors primarily focus on token-level distributions, which are vulnerable to real-world domain shifts.
We propose a more robust method that incorporates abstract elements, such as event transitions, as key deciding factors to detect machine versus human texts.
arXiv Detail & Related papers (2024-10-04T18:42:09Z)
- Successor Features for Efficient Multisubject Controlled Text Generation [48.37713738712319]
We introduce SF-GEN, which is grounded in two primary concepts: successor features (SFs) and language model rectification.
SF-GEN seamlessly integrates the two to enable dynamic steering of text generation with no need to alter the LLM's parameters.
To the best of our knowledge, our research represents the first application of successor features in text generation.
arXiv Detail & Related papers (2023-11-03T00:17:08Z)
- Controllable Data Augmentation for Few-Shot Text Mining with Chain-of-Thought Attribute Manipulation [35.33340453046864]
Chain-of-Thought Attribute Manipulation (CoTAM) is a novel approach that generates new data from existing examples.
We leverage chain-of-thought prompting to directly edit the text in three steps: (1) attribute decomposition, (2) manipulation proposal, and (3) sentence reconstruction.
arXiv Detail & Related papers (2023-07-14T00:10:03Z)
- Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning [69.35360098882606]
We introduce Click for controllable text generation, which needs no modification to the model architecture.
It employs a contrastive loss on sequence likelihood, which fundamentally decreases the generation probability of negative samples.
It also adopts a novel likelihood ranking-based strategy to construct contrastive samples from model generations; a rough sketch of such a sequence-likelihood contrastive loss appears after this list.
arXiv Detail & Related papers (2023-06-06T01:56:44Z)
- Code-Switching Text Generation and Injection in Mandarin-English ASR [57.57570417273262]
We investigate text generation and injection to improve the performance of a streaming model commonly used in industry, the Transformer-Transducer (T-T).
We first propose a strategy to generate code-switching text data and then investigate injecting the generated text into the T-T model explicitly by Text-To-Speech (TTS) conversion or implicitly by tying speech and text latent spaces.
Experimental results on the T-T model trained with a dataset containing 1,800 hours of real Mandarin-English code-switched speech show that our approaches to inject generated code-switching text significantly boost the performance of T-T models.
arXiv Detail & Related papers (2023-03-20T09:13:27Z)
- FAST: Improving Controllability for Text Generation with Feedback Aware Self-Training [25.75982440355576]
Controllable text generation systems often leverage control codes to direct various properties of the output like style and length.
Inspired by recent work on causal inference for NLP, this paper reveals a previously overlooked flaw in these control code-based conditional text generation algorithms: spurious correlations between the control codes and other properties of the training text.
We propose two simple techniques to reduce these correlations in training sets.
arXiv Detail & Related papers (2022-10-06T19:00:51Z)
- XDBERT: Distilling Visual Information to BERT from Cross-Modal Systems to Improve Language Understanding [73.24847320536813]
This study explores distilling visual information from pretrained multimodal transformers to pretrained language encoders.
Our framework is inspired by the success of cross-modal encoders in visual-language tasks, and we alter the learning objective to cater to the language-heavy characteristics of NLU.
arXiv Detail & Related papers (2022-04-15T03:44:00Z)
- Incorporating Linguistic Knowledge for Abstractive Multi-document Summarization [20.572283625521784]
We develop a neural network based abstractive multi-document summarization (MDS) model.
We incorporate dependency information into a linguistic-guided attention mechanism.
With the help of linguistic signals, sentence-level relations can be correctly captured.
arXiv Detail & Related papers (2021-09-23T08:13:35Z)
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
- Control, Generate, Augment: A Scalable Framework for Multi-Attribute Text Generation [22.70189685469752]
We introduce CGA, a conditional VAE architecture, to control, generate, and augment text.
We show the value of the individual model components in an ablation study.
We show high quality, diversity and attribute control in the generated sentences through a series of automatic and human assessments.
arXiv Detail & Related papers (2020-04-30T17:31:16Z)
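As referenced in the Click entry above, here is a minimal PyTorch-style sketch of a contrastive loss on sequence likelihood. The function names, the max-margin formulation, and the margin value are assumptions made for illustration; they are not taken from the Click paper's released code.

```python
import torch
import torch.nn.functional as F

def sequence_log_likelihood(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    # Sum of per-token log-probabilities of `tokens` under `logits`.
    # logits: (batch, seq_len, vocab_size); tokens: (batch, seq_len), dtype long.
    log_probs = F.log_softmax(logits, dim=-1)
    token_log_probs = log_probs.gather(-1, tokens.unsqueeze(-1)).squeeze(-1)
    return token_log_probs.sum(dim=-1)

def sequence_contrastive_loss(pos_logits, pos_tokens, neg_logits, neg_tokens,
                              margin: float = 3.0) -> torch.Tensor:
    # Margin loss on whole-sequence likelihood: the positive (attribute-keeping)
    # sequence should score at least `margin` log-likelihood above the negative
    # one, which lowers the generation probability of negative samples.
    ll_pos = sequence_log_likelihood(pos_logits, pos_tokens)
    ll_neg = sequence_log_likelihood(neg_logits, neg_tokens)
    return F.relu(margin - (ll_pos - ll_neg)).mean()
```

In practice the negative samples would be drawn from the model's own generations and ranked by likelihood, in line with the ranking-based construction strategy the entry describes.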