Learning to Generate Text in Arbitrary Writing Styles
- URL: http://arxiv.org/abs/2312.17242v2
- Date: Mon, 4 Mar 2024 16:00:23 GMT
- Title: Learning to Generate Text in Arbitrary Writing Styles
- Authors: Aleem Khan, Andrew Wang, Sophia Hager, Nicholas Andrews
- Abstract summary: It is desirable for language models to produce text in an author-specific style on the basis of a potentially small writing sample.
We propose to guide a language model to generate text in a target style using contrastively-trained representations that capture stylometric features.
- Score: 6.7308816341849695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prior work in style-controlled text generation has focused on tasks such as
emulating the style of prolific literary authors, producing formal or informal
text, and mitigating toxicity of generated text. Plentiful demonstrations of
these styles are available, and as a result modern language models are often
able to emulate them, either via prompting or discriminative control. However,
in applications such as writing assistants, it is desirable for language models
to produce text in an author-specific style on the basis of a potentially small
writing sample. For example, someone writing in a particular dialect may prefer
writing suggestions that retain the same dialect. We find that
instruction-tuned language models can struggle to reproduce author-specific
style demonstrated in a prompt. Instead, we propose to guide a language model
to generate text in a target style using contrastively-trained representations
that capture stylometric features. Our approach (StyleMC) combines an
author-adapted language model with sequence-level inference to improve
stylistic consistency, and is found to be effective in a variety of conditions,
including unconditional generation and style transfer. Additionally, we find
that the proposed approach can serve as an effective anonymization method, by
editing a document to mask authorship while preserving the original meaning.
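The sequence-level inference step described above can be illustrated with a minimal sketch. This is not the authors' StyleMC implementation: here a character n-gram frequency vector stands in for the contrastively-trained style encoder, and candidate continuations are reranked by cosine similarity to the target author's writing sample. The function names are hypothetical.

```python
from collections import Counter
import math

def style_embedding(text, n=3):
    """Toy stand-in for a contrastively-trained style encoder:
    a normalized character n-gram frequency vector, a simple
    stylometric feature."""
    grams = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    total = sum(grams.values()) or 1
    return {g: c / total for g, c in grams.items()}

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(v * b.get(g, 0.0) for g, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rerank_by_style(candidates, author_sample):
    """Sequence-level inference: score each candidate generation by its
    stylistic similarity to the target author's writing sample and
    return the closest match."""
    target = style_embedding(author_sample)
    return max(candidates, key=lambda c: cosine(style_embedding(c), target))
```

In the actual approach the candidates would come from an author-adapted language model and the embedding from a learned stylometric representation; the reranking logic, however, has this general shape.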
Related papers
- Capturing Style in Author and Document Representation [4.323709559692927]
We propose a new architecture that learns embeddings for both authors and documents with a stylistic constraint.
We evaluate our method on three datasets: a literary corpus extracted from the Gutenberg Project, the Blog Authorship Corpus, and IMDb62.
arXiv Detail & Related papers (2024-07-18T10:01:09Z) - TinyStyler: Efficient Few-Shot Text Style Transfer with Authorship Embeddings [51.30454130214374]
We introduce TinyStyler, a lightweight but effective approach to perform efficient, few-shot text style transfer.
We evaluate TinyStyler's ability to perform text attribute style transfer with automatic and human evaluations.
Our model has been made publicly available at https://huggingface.co/tinystyler/tinystyler.
arXiv Detail & Related papers (2024-06-21T18:41:22Z) - ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style
Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning.
We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles.
We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z) - Visual Captioning at Will: Describing Images and Videos Guided by a Few
Stylized Sentences [49.66987347397398]
Few-Shot Stylized Visual Captioning aims to generate captions in any desired style, using only a few examples as guidance during inference.
We propose a framework called FS-StyleCap for this task, which utilizes a conditional encoder-decoder language model and a visual projection module.
arXiv Detail & Related papers (2023-07-31T04:26:01Z) - WordStylist: Styled Verbatim Handwritten Text Generation with Latent
Diffusion Models [8.334487584550185]
We present a latent diffusion-based method for word-level generation of styled text-content images.
Our proposed method is able to generate realistic word image samples from different writer styles.
We show that the proposed model produces samples that are aesthetically pleasing, boost text recognition performance, and achieve writer retrieval scores similar to real data.
arXiv Detail & Related papers (2023-03-29T10:19:26Z) - Handwritten Text Generation from Visual Archetypes [25.951540903019467]
We devise a Transformer-based model for Few-Shot styled handwritten text generation.
We obtain a robust representation of unseen writers' calligraphy by exploiting specific pre-training on a large synthetic dataset.
arXiv Detail & Related papers (2023-03-27T14:58:20Z) - StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized
Tokenizer of a Large-Scale Generative Model [64.26721402514957]
We propose StylerDALLE, a style transfer method that uses natural language to describe abstract art styles.
Specifically, we formulate the language-guided style transfer task as a non-autoregressive token sequence translation.
To incorporate style information, we propose a Reinforcement Learning strategy with CLIP-based language supervision.
arXiv Detail & Related papers (2023-03-16T12:44:44Z) - StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse
Representations and Content Enhancing [73.81778485157234]
Long texts usually involve more complicated author linguistic preferences, such as discourse structures, than single sentences do.
We formulate the task of non-parallel story author-style transfer, which requires transferring an input story into a specified author style.
We use an additional training objective to disentangle stylistic features from the learned discourse representation to prevent the model from degenerating to an auto-encoder.
arXiv Detail & Related papers (2022-08-29T08:47:49Z) - Incorporating Stylistic Lexical Preferences in Generative Language
Models [10.62343151429147]
We present an approach to induce certain target-author attributes by incorporating continuous multi-dimensional lexical preferences of an author into generative language models.
Our experiments demonstrate that the proposed approach can generate text that distinctively aligns with a given target author's lexical style.
arXiv Detail & Related papers (2020-10-22T09:24:05Z) - Stylized Dialogue Response Generation Using Stylized Unpaired Texts [63.69880979112312]
This paper proposes a stylized dialogue generation method that can capture stylistic features embedded in unpaired texts.
Our method can produce dialogue responses that are both coherent to the given context and conform to the target style.
arXiv Detail & Related papers (2020-09-27T01:04:06Z)
This list is automatically generated from the titles and abstracts of the papers on this site.