Unsupervised Text Style Transfer via LLMs and Attention Masking with
Multi-way Interactions
- URL: http://arxiv.org/abs/2402.13647v1
- Date: Wed, 21 Feb 2024 09:28:02 GMT
- Title: Unsupervised Text Style Transfer via LLMs and Attention Masking with
Multi-way Interactions
- Authors: Lei Pan, Yunshi Lan, Yang Li, Weining Qian
- Abstract summary: Unsupervised Text Style Transfer (UTST) has emerged as a critical task within the domain of Natural Language Processing (NLP)
We propose four ways of interactions, that are pipeline framework with tuned orders; knowledge distillation from Large Language Models (LLMs) to attention masking model; in-context learning with constructed parallel examples.
We empirically show these multi-way interactions can improve the baselines in certain perspective of style strength, content preservation and text fluency.
- Score: 18.64326057581588
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised Text Style Transfer (UTST) has emerged as a critical task within
the domain of Natural Language Processing (NLP), aiming to transfer one
stylistic aspect of a sentence into another style without changing its
semantics, syntax, or other attributes. This task is especially challenging
given the intrinsic lack of parallel text pairings. Among existing methods for
UTST tasks, attention masking approach and Large Language Models (LLMs) are
deemed as two pioneering methods. However, they have shortcomings in generating
unsmooth sentences and changing the original contents, respectively. In this
paper, we investigate if we can combine these two methods effectively. We
propose four ways of interactions, that are pipeline framework with tuned
orders; knowledge distillation from LLMs to attention masking model; in-context
learning with constructed parallel examples. We empirically show these
multi-way interactions can improve the baselines in certain perspective of
style strength, content preservation and text fluency. Experiments also
demonstrate that simply conducting prompting followed by attention
masking-based revision can consistently surpass the other systems, including
supervised text style transfer systems. On Yelp-clean and Amazon-clean
datasets, it improves the previously best mean metric by 0.5 and 3.0 absolute
percentages respectively, and achieves new SOTA results.
Related papers
- Mixture of Prompt Learning for Vision Language Models [12.828490399811376]
We propose a mixture of soft prompt learning method incorporating a routing module.
This module is able to capture a dataset's varied styles and dynamically selects the most suitable prompts for each instance.
We also implement semantically grouped text-level supervision, initializing each soft prompt with the token embeddings of manually designed templates from its group.
arXiv Detail & Related papers (2024-09-18T14:25:02Z) - Advancing Prompt Learning through an External Layer [24.77977865016954]
We propose a paradigm called EnPrompt with a novel External Layer (EnLa)
The learnable external layer is built upon valid embeddings of pre-trained CLIP.
Four experiments demonstrate that our method outperforms the existing prompt learning method.
arXiv Detail & Related papers (2024-07-29T03:30:09Z) - Successor Features for Efficient Multisubject Controlled Text Generation [48.37713738712319]
We introduce SF-GEN, which is grounded in two primary concepts: successor features (SFs) and language model rectification.
SF-GEN seamlessly integrates the two to enable dynamic steering of text generation with no need to alter the LLM's parameters.
To the best of our knowledge, our research represents the first application of successor features in text generation.
arXiv Detail & Related papers (2023-11-03T00:17:08Z) - Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard
Parameter Sharing [72.56219471145232]
We propose a ST/MT multi-tasking framework with hard parameter sharing.
Our method reduces the speech-text modality gap via a pre-processing stage.
We show that our framework improves attentional encoder-decoder, Connectionist Temporal Classification (CTC), transducer, and joint CTC/attention models by an average of +0.5 BLEU.
arXiv Detail & Related papers (2023-09-27T17:48:14Z) - MaPLe: Multi-modal Prompt Learning [54.96069171726668]
We propose Multi-modal Prompt Learning (MaPLe) for both vision and language branches to improve alignment between the vision and language representations.
Compared with the state-of-the-art method Co-CoOp, MaPLe exhibits favorable performance and achieves an absolute gain of 3.45% on novel classes.
arXiv Detail & Related papers (2022-10-06T17:59:56Z) - Text Revision by On-the-Fly Representation Optimization [76.11035270753757]
Current state-of-the-art methods formulate these tasks as sequence-to-sequence learning problems.
We present an iterative in-place editing approach for text revision, which requires no parallel data.
It achieves competitive and even better performance than state-of-the-art supervised methods on text simplification.
arXiv Detail & Related papers (2022-04-15T07:38:08Z) - Contextual Text Style Transfer [73.66285813595616]
Contextual Text Style Transfer aims to translate a sentence into a desired style with its surrounding context taken into account.
We propose a Context-Aware Style Transfer (CAST) model, which uses two separate encoders for each input sentence and its surrounding context.
Two new benchmarks, Enron-Context and Reddit-Context, are introduced for formality and offensiveness style transfer.
arXiv Detail & Related papers (2020-04-30T23:01:12Z) - Learning to Select Bi-Aspect Information for Document-Scale Text Content
Manipulation [50.01708049531156]
We focus on a new practical task, document-scale text content manipulation, which is the opposite of text style transfer.
In detail, the input is a set of structured records and a reference text for describing another recordset.
The output is a summary that accurately describes the partial content in the source recordset with the same writing style of the reference.
arXiv Detail & Related papers (2020-02-24T12:52:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.