Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?
- URL: http://arxiv.org/abs/2212.08986v3
- Date: Mon, 04 Nov 2024 18:53:29 GMT
- Title: Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?
- Authors: Ajay Patel, Nicholas Andrews, Chris Callison-Burch
- Abstract summary: Authorship style transfer involves altering text to match the style of a target author whilst preserving the original meaning.
We introduce the low-resource authorship style transfer task, where only a limited amount of text in the target author's style may exist.
In our experiments, we specifically choose source and target authors from Reddit and style transfer their Reddit posts, limiting ourselves to just 16 posts (on average 500 words) of the target author's style.
- Score: 41.365967145680116
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Authorship style transfer involves altering text to match the style of a target author whilst preserving the original meaning. Existing unsupervised approaches like STRAP have largely focused on style transfer to target authors with many examples of their writing style in books, speeches, or other published works. This high-resource training data requirement (often greater than 100,000 words) makes these approaches primarily useful for style transfer to published authors, politicians, or other well-known figures and authorship styles, while style transfer to non-famous authors has not been well-studied. We introduce the low-resource authorship style transfer task, a more challenging class of authorship style transfer where only a limited amount of text in the target author's style may exist. In our experiments, we specifically choose source and target authors from Reddit and style transfer their Reddit posts, limiting ourselves to just 16 posts (on average ~500 words) of the target author's style. Style transfer accuracy is typically measured by how often a classifier or human judge will classify an output as written by the target author. Recent authorship representation models excel at authorship identification even with just a few writing samples, making automatic evaluation of this task possible for the first time through evaluation metrics we propose. Our results establish an in-context learning technique we develop as the strongest baseline, though we find current approaches do not yet achieve mastery of this challenging task. We release our data and implementations to encourage further investigation.
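The automatic evaluation described above checks whether a transferred text looks more like the target author's writing than the source author's. As a minimal sketch of that idea: the snippet below substitutes a crude character trigram profile for the learned authorship embeddings the paper relies on, and the function names (`char_ngrams`, `style_transfer_accuracy`) are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter
import math

def char_ngrams(text, n=3):
    """Character n-gram counts: a toy stand-in for a learned authorship embedding."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def style_transfer_accuracy(outputs, target_posts, source_posts):
    """Fraction of outputs whose style profile sits closer to the target
    author's profile than to the source author's (a simplified version of
    classifier-based style transfer accuracy)."""
    target_profile = sum((char_ngrams(p) for p in target_posts), Counter())
    source_profile = sum((char_ngrams(p) for p in source_posts), Counter())
    hits = sum(
        1 for o in outputs
        if cosine(char_ngrams(o), target_profile) > cosine(char_ngrams(o), source_profile)
    )
    return hits / len(outputs)
```

In the paper's setting, the target profile would be built from only ~16 Reddit posts, and the n-gram profiles would be replaced by embeddings from an authorship representation model.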
Related papers
- TinyStyler: Efficient Few-Shot Text Style Transfer with Authorship Embeddings [51.30454130214374]
We introduce TinyStyler, a lightweight but effective approach to perform efficient, few-shot text style transfer.
We evaluate TinyStyler's ability to perform text attribute style transfer with automatic and human evaluations.
Our model has been made publicly available at https://huggingface.co/tinystyler/tinystyler.
arXiv Detail & Related papers (2024-06-21T18:41:22Z)
- Authorship Style Transfer with Policy Optimization [26.34892894935038]
Authorship style transfer aims to rewrite a given text in a specified target author's style while preserving the original meaning of the source.
Existing approaches rely on the availability of a large number of target style exemplars for model training.
arXiv Detail & Related papers (2024-03-12T19:34:54Z)
- STEER: Unified Style Transfer with Expert Reinforcement [71.3995732115262]
STEER (Unified Style Transfer with Expert Reinforcement) is a unified framework developed to overcome the challenge of limited parallel data for style transfer.
We show STEER is robust, maintaining its style transfer capabilities on out-of-domain data, and surpassing nearly all baselines across various styles.
arXiv Detail & Related papers (2023-11-13T09:02:30Z)
- ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning.
We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles.
We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z)
- Can Authorship Representation Learning Capture Stylistic Features? [5.812943049068866]
We show that representations learned for a surrogate authorship prediction task are indeed sensitive to writing style.
As a consequence, authorship representations may be expected to be robust to certain kinds of data shift, such as topic drift over time.
Our findings may open the door to downstream applications that require stylistic representations, such as style transfer.
arXiv Detail & Related papers (2023-08-22T15:10:45Z)
- PART: Pre-trained Authorship Representation Transformer [64.78260098263489]
Authors writing documents imprint identifying information within their texts: vocabulary, register, punctuation, misspellings, or even emoji usage.
Previous works use hand-crafted features or classification tasks to train their authorship models, leading to poor performance on out-of-domain authors.
We propose a contrastively trained model designed to learn authorship embeddings instead of semantics.
arXiv Detail & Related papers (2022-09-30T11:08:39Z)
- From Theories on Styles to their Transfer in Text: Bridging the Gap with a Hierarchical Survey [10.822011920177408]
Style transfer aims at re-writing existing texts and creating paraphrases that exhibit desired stylistic attributes.
A handful of surveys give a methodological overview of the field, but they do not help researchers focus on specific styles.
We organize styles into a hierarchy, highlighting the challenges in defining each of them and pointing out gaps in the current research landscape.
arXiv Detail & Related papers (2021-10-29T15:53:06Z)
- Few-shot Controllable Style Transfer for Low-Resource Settings: A Study in Indian Languages [13.980482277351523]
Style transfer is the task of rewriting an input sentence into a target style while preserving its content.
We push the state-of-the-art for few-shot style transfer with a new method modeling the stylistic difference between paraphrases.
Our model achieves 2-3x better performance and output diversity in formality transfer and code-mixing addition across five Indian languages.
arXiv Detail & Related papers (2021-10-14T14:16:39Z)
- DeepStyle: User Style Embedding for Authorship Attribution of Short Texts [57.503904346336384]
Authorship attribution (AA) is an important and widely studied research topic with many applications.
Recent works have shown that deep learning methods could achieve significant accuracy improvement for the AA task.
We propose DeepStyle, a novel embedding-based framework that learns the representations of users' salient writing styles.
arXiv Detail & Related papers (2021-03-14T15:56:37Z)
- DRAG: Director-Generator Language Modelling Framework for Non-Parallel Author Stylized Rewriting [9.275464023441227]
Author stylized rewriting is the task of rewriting an input text in a particular author's style.
We propose a Director-Generator framework to rewrite content in the target author's style.
arXiv Detail & Related papers (2021-01-28T06:52:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.