Prefix-Tuning Based Unsupervised Text Style Transfer
- URL: http://arxiv.org/abs/2310.14599v1
- Date: Mon, 23 Oct 2023 06:13:08 GMT
- Title: Prefix-Tuning Based Unsupervised Text Style Transfer
- Authors: Huiyu Mai, Wenhao Jiang, Zhihong Deng
- Abstract summary: Unsupervised text style transfer aims at training a generative model that can alter the style of the input sentence while preserving its content.
In this paper, we employ powerful pre-trained large language models and present a new prefix-tuning-based method for unsupervised text style transfer.
- Score: 29.86587278794342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised text style transfer aims at training a generative model that can
alter the style of the input sentence while preserving its content without
using any parallel data. In this paper, we employ powerful pre-trained large
language models and present a new prefix-tuning-based method for unsupervised
text style transfer. We construct three different kinds of prefixes, i.e.,
shared prefix, style prefix, and content prefix, to encode task-specific
information, target style, and the content information of the input sentence,
respectively. Compared to embeddings used by previous works, the proposed
prefixes can provide richer information for the model.
Furthermore, we adopt a recursive way of using the language model in the
process of style transfer. This strategy enables more effective interaction
between the input sentence and GPT-2, helps the model construct more
informative prefixes, and thus improves performance. Evaluations on
well-known datasets show that our method outperforms the state-of-the-art
baselines. Results, ablation analyses, and human evaluations are also
provided for a deeper understanding of the proposed method.
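To make the setup concrete, here is a minimal sketch of three-part prefix conditioning on a frozen GPT-2, assuming the Hugging Face transformers API. The class name, prefix lengths, and the mean-pooled content encoder are illustrative placeholders: the paper builds the content prefix by applying GPT-2 recursively (the mean-pool below merely stands in for that step), and real prefix-tuning typically injects learned key/value pairs per layer rather than prepending soft embeddings.

```python
# A minimal sketch, not the authors' code: condition a frozen GPT-2 on a
# shared prefix (task), a style prefix (target style), and a content prefix
# (input sentence). Prepending soft embeddings is the simplest runnable
# approximation of prefix-tuning.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer

class ThreePartPrefixModel(nn.Module):
    def __init__(self, model_name="gpt2", prefix_len=10, num_styles=2):
        super().__init__()
        self.gpt2 = GPT2LMHeadModel.from_pretrained(model_name)
        for p in self.gpt2.parameters():          # freeze the backbone
            p.requires_grad_(False)
        d = self.gpt2.config.n_embd
        # Shared prefix: task-level information, identical for every input.
        self.shared_prefix = nn.Parameter(torch.randn(prefix_len, d) * 0.02)
        # Style prefix: one learned prefix per target style.
        self.style_prefix = nn.Parameter(torch.randn(num_styles, prefix_len, d) * 0.02)
        # Content prefix: derived from the input sentence (placeholder encoder;
        # the paper derives it recursively with GPT-2 itself).
        self.content_proj = nn.Linear(d, d)

    def forward(self, input_ids, style_id):
        tok_emb = self.gpt2.transformer.wte(input_ids)                   # (B, T, d)
        content = self.content_proj(tok_emb.mean(dim=1, keepdim=True))   # (B, 1, d)
        B = input_ids.size(0)
        shared = self.shared_prefix.unsqueeze(0).expand(B, -1, -1)
        style = self.style_prefix[style_id].unsqueeze(0).expand(B, -1, -1)
        inputs = torch.cat([shared, style, content, tok_emb], dim=1)
        return self.gpt2(inputs_embeds=inputs)

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = ThreePartPrefixModel()
ids = tok("the service was slow", return_tensors="pt").input_ids
logits = model(ids, style_id=1).logits    # next-token logits conditioned on style 1
```

Only the prefix parameters and the small content projection receive gradients, which is the parameter-efficiency argument for prefix-tuning over fine-tuning the full model.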
Related papers
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
arXiv Detail & Related papers (2024-06-06T15:17:51Z)
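The summary names the objective but not its form; as one hypothetical reading, segment ordering can be posed as each shuffled segment pointing at its original position slot. The head, slot embeddings, and sizes below are placeholders, not the paper's architecture.

```python
# Hypothetical sketch of a segment-ordering (SO) objective: given encodings of
# shuffled segments, a pointer-style head predicts each segment's original slot.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegmentOrderingHead(nn.Module):
    def __init__(self, d_model=768, max_segments=32):
        super().__init__()
        self.query = nn.Linear(d_model, d_model)
        # One learned embedding per original-position slot.
        self.slots = nn.Parameter(torch.randn(max_segments, d_model) * 0.02)

    def forward(self, seg_repr):
        # seg_repr: (B, S, d) encodings of the shuffled segments.
        return self.query(seg_repr) @ self.slots.T    # (B, S, max_segments)

def ordering_loss(head, seg_repr, true_pos):
    # true_pos[b, i] = original index of shuffled segment i.
    logits = head(seg_repr)
    return F.cross_entropy(logits.flatten(0, 1), true_pos.flatten())

head = SegmentOrderingHead()
reps = torch.randn(4, 8, 768)                      # 4 documents, 8 segments each
perm = torch.stack([torch.randperm(8) for _ in range(4)])
print(ordering_loss(head, reps, perm))             # scalar training loss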
- Specializing Small Language Models towards Complex Style Transfer via Latent Attribute Pre-Training [29.143887057933327]
We introduce the concept of complex text style transfer tasks and construct complex text datasets based on two widely applicable scenarios.
Our dataset is the first large-scale one of its kind, comprising 700 rephrased sentences and 1,000 sentences from the game Genshin Impact.
arXiv Detail & Related papers (2023-09-19T21:01:40Z)
- ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning.
We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles.
We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z)
- Stylized Data-to-Text Generation: A Case Study in the E-Commerce Domain [53.22419717434372]
We propose a new task, namely stylized data-to-text generation, whose aim is to generate coherent text according to a specific style.
This task is non-trivial due to three challenges: preserving the logic of the generated text, handling an unstructured style reference, and coping with biased training samples.
We propose a novel stylized data-to-text generation model, named StyleD2T, comprising three components: logic planning-enhanced data embedding, mask-based style embedding, and unbiased stylized text generation.
arXiv Detail & Related papers (2023-05-05T03:02:41Z)
- Prompt-Based Editing for Text Style Transfer [25.863546922455498]
We present a prompt-based editing approach for text style transfer.
We transform a prompt-based generation problem into a classification one, which is a training-free process.
Our approach largely outperforms the state-of-the-art systems that have 20 times more parameters.
arXiv Detail & Related papers (2023-01-27T21:31:14Z)
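The classification view above can be illustrated with a small, training-free sketch: score each candidate word-level edit with a masked-LM prompt and keep the best. The RoBERTa backbone, prompt template, and verbalizer words are guesses for illustration, not the paper's exact setup.

```python
# Rough sketch of casting style transfer as training-free classification:
# an MLM prompt scores how strongly a candidate edit carries the target style.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("roberta-base")
mlm = AutoModelForMaskedLM.from_pretrained("roberta-base").eval()

def style_score(sentence, target_word="great", opposite_word="terrible"):
    # Higher score = the MLM reads the sentence as closer to the target style.
    prompt = f"{sentence} Overall, it was {tok.mask_token}."
    enc = tok(prompt, return_tensors="pt")
    mask_pos = (enc.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logits = mlm(**enc).logits[0, mask_pos]
    probs = logits.softmax(-1)
    t = tok.convert_tokens_to_ids(tok.tokenize(" " + target_word))[0]
    o = tok.convert_tokens_to_ids(tok.tokenize(" " + opposite_word))[0]
    return (probs[t] / (probs[t] + probs[o])).item()

def edit_once(sentence, position, candidates):
    # Greedy word-level editing: keep the replacement that scores highest.
    words = sentence.split()
    scored = []
    for c in candidates:
        trial = words.copy()
        trial[position] = c
        scored.append((style_score(" ".join(trial)), " ".join(trial)))
    return max(scored)[1]

print(edit_once("the food was awful", 3, ["awful", "decent", "delicious"]))
```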
- Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate text revision tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves performance that is competitive with, and sometimes better than, state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z)
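A speculative sketch of the in-place editing idea above: treat the input's token embeddings as free variables, take a few gradient steps on a scoring objective, then project back to the nearest vocabulary items. The GPT-2 backbone and fluency-only objective are stand-ins; a real revision objective (e.g., for simplification) would differ.

```python
# Speculative sketch of inference-time, in-place revision by optimizing token
# representations; no parallel data is involved.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()
for p in lm.parameters():
    p.requires_grad_(False)
emb_table = lm.transformer.wte.weight              # (V, d)

def revise(sentence, steps=10, lr=1.0):
    ids = tok(sentence, return_tensors="pt").input_ids
    x = emb_table[ids].clone().detach().requires_grad_(True)   # soft tokens
    for _ in range(steps):
        logits = lm(inputs_embeds=x).logits
        # Placeholder objective: sequence fluency under the LM; the actual
        # method would optimize a task score (simplicity, style, etc.).
        logp = logits[:, :-1].log_softmax(-1)
        nll = -logp.gather(-1, ids[:, 1:, None]).squeeze(-1).mean()
        nll.backward()
        with torch.no_grad():
            x -= lr * x.grad
            x.grad.zero_()
    with torch.no_grad():
        # Project each optimized representation to its nearest token embedding.
        new_ids = torch.cdist(x[0], emb_table).argmin(-1)
    return tok.decode(new_ids)

print(revise("The utilization of this method facilitates comprehension."))
```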
- Grounded Compositional Outputs for Adaptive Language Modeling [59.02706635250856]
A language model's vocabulary, typically selected before training and permanently fixed later, affects its size.
We propose a fully compositional output embedding layer for language models.
To our knowledge, the result is the first word-level language model with a size that does not depend on the training vocabulary.
arXiv Detail & Related papers (2020-09-24T07:21:14Z)
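One way to get the vocabulary-size-independent output layer described above, sketched under assumptions (a character-level GRU composer; the paper's composition function may differ): compute each candidate word's output embedding from its spelling and take dot products with the hidden states.

```python
# Sketch of a compositional output layer: each word's output embedding is
# composed from its characters, so no (vocab x d) matrix is stored.
import torch
import torch.nn as nn

class CompositionalOutput(nn.Module):
    def __init__(self, d_model=512, n_chars=128, d_char=64):
        super().__init__()
        self.char_emb = nn.Embedding(n_chars, d_char)
        self.compose = nn.GRU(d_char, d_model, batch_first=True)

    def word_embedding(self, char_ids):            # (W, L) chars per word
        _, h = self.compose(self.char_emb(char_ids))
        return h.squeeze(0)                         # (W, d_model)

    def forward(self, hidden, candidate_char_ids):
        # hidden: (B, T, d); candidates: (W, L) spellings of candidate words.
        E = self.word_embedding(candidate_char_ids)         # (W, d)
        return hidden @ E.T                                 # logits over candidates

# Usage: logits over any candidate set, even words unseen at training time.
layer = CompositionalOutput()
hidden = torch.randn(2, 5, 512)
cands = torch.randint(0, 128, (1000, 12))   # 1000 candidate words, 12 chars each
print(layer(hidden, cands).shape)           # torch.Size([2, 5, 1000])
```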
- Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer [60.07283363509065]
Unsupervised style transfer aims to change the style of an input sentence while preserving its original content.
We propose a novel attentional sequence-to-sequence model that exploits the relevance of each output word to the target style.
Experimental results show that our proposed model achieves state-of-the-art performance in terms of both transfer accuracy and content preservation.
arXiv Detail & Related papers (2020-05-05T10:24:28Z)
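As a hedged illustration of word-level style relevance: one plausible instantiation derives per-word weights from an attention-based style classifier, so a transfer model knows which words carry style and which to copy. Names and sizes here are placeholders, not the paper's model.

```python
# Illustrative sketch: an attention-based style classifier whose attention
# weights double as word-level style-relevance scores.
import torch
import torch.nn as nn

class StyleRelevance(nn.Module):
    def __init__(self, vocab=10000, d=256, n_styles=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, d)
        self.attn = nn.Linear(d, 1)
        self.cls = nn.Linear(d, n_styles)

    def forward(self, ids):
        h = self.emb(ids)                               # (B, T, d)
        a = self.attn(h).squeeze(-1).softmax(-1)        # (B, T) relevance weights
        pooled = (a.unsqueeze(-1) * h).sum(1)           # attention-pooled sentence
        return self.cls(pooled), a                      # style logits + per-word relevance

model = StyleRelevance()
logits, relevance = model(torch.randint(0, 10000, (3, 12)))
# After training, `relevance` highlights style-bearing words, which a seq2seq
# decoder can preferentially rewrite while preserving the rest of the content.
```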
- Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset, written in the same style as the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.