Personalized Text Generation with Fine-Grained Linguistic Control
- URL: http://arxiv.org/abs/2402.04914v1
- Date: Wed, 7 Feb 2024 14:41:08 GMT
- Title: Personalized Text Generation with Fine-Grained Linguistic Control
- Authors: Bashar Alhafni, Vivek Kulkarni, Dhruv Kumar, Vipul Raheja
- Abstract summary: We focus on controlling fine-grained attributes spanning multiple linguistic dimensions.
We introduce a novel benchmark to train generative models and evaluate their ability to generate personalized text.
- Score: 9.668216418094316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As the text generation capabilities of large language models become
increasingly prominent, recent studies have focused on controlling particular
aspects of the generated text to make it more personalized. However, most
research on controllable text generation focuses on controlling the content or
modeling specific high-level/coarse-grained attributes that reflect authors'
writing styles, such as formality, domain, or sentiment. In this paper, we
focus on controlling fine-grained attributes spanning multiple linguistic
dimensions, such as lexical and syntactic attributes. We introduce a novel
benchmark to train generative models and evaluate their ability to generate
personalized text based on multiple fine-grained linguistic attributes. We
systematically investigate the performance of various large language models on
our benchmark and draw insights from the factors that impact their performance.
We make our code, data, and pretrained models publicly available.
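To make the setup concrete, here is a minimal Python sketch of attribute-conditioned prompting in the spirit of the abstract. The attribute names, the profiling heuristics, and the control-prefix format below are illustrative assumptions, not the benchmark's actual specification:

```python
import re

def lexical_syntactic_profile(text: str) -> dict:
    """Compute a few illustrative fine-grained attributes of a text.
    These are stand-ins; the paper's benchmark defines its own attribute set."""
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "avg_word_length": round(sum(len(w) for w in words) / max(len(words), 1), 2),
        "type_token_ratio": round(len({w.lower() for w in words}) / max(len(words), 1), 2),
        "avg_sentence_length": round(len(words) / max(len(sentences), 1), 2),
    }

def build_controlled_prompt(attributes: dict, topic: str) -> str:
    """Serialize target attribute values into a control prefix for a generative model."""
    control = " ".join(f"<{k}={v}>" for k, v in sorted(attributes.items()))
    return f"{control} Write about {topic}."

author_sample = "I love short words. They are easy to read. Clarity wins."
target = lexical_syntactic_profile(author_sample)
print(build_controlled_prompt(target, "personalized text generation"))
```

A model fine-tuned on such control-prefixed inputs can then be evaluated on how closely its outputs match the requested attribute values.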
Related papers
- Language Models for Text Classification: Is In-Context Learning Enough? [54.869097980761595]
Recent foundational language models have shown state-of-the-art performance in many NLP tasks in zero- and few-shot settings.
An advantage of these models over more standard approaches is their ability to understand instructions written in natural language (prompts).
This makes them suitable for addressing text classification problems for domains with limited amounts of annotated instances.
arXiv Detail & Related papers (2024-03-26T12:47:39Z)
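As a generic illustration of the prompting setup the paper above evaluates (the prompt template here is an assumption, not the paper's), a few-shot classification prompt can be assembled as follows; passing an empty example list yields the zero-shot, instruction-only setting:

```python
def build_icl_prompt(labels, examples, query):
    """Assemble a few-shot in-context-learning prompt for text classification."""
    lines = [f"Classify each text as one of: {', '.join(labels)}.", ""]
    for text, label in examples:
        lines += [f"Text: {text}", f"Label: {label}", ""]
    lines += [f"Text: {query}", "Label:"]
    return "\n".join(lines)

prompt = build_icl_prompt(
    labels=["positive", "negative"],
    examples=[("Great battery life.", "positive"),
              ("Screen died in a week.", "negative")],
    query="Fast shipping and works perfectly.",
)
print(prompt)  # send to any instruction-following LLM; the completion is the predicted label
```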
- Retrieval is Accurate Generation [99.24267226311157]
We introduce a novel method that selects context-aware phrases from a collection of supporting documents.
Our model achieves the best performance and the lowest latency among several retrieval-augmented baselines.
arXiv Detail & Related papers (2024-02-27T14:16:19Z)
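A toy sketch of the generation-as-retrieval idea described above. The actual model learns dense, context-aware phrase representations; this stand-in only scores candidate phrases from the supporting documents by lexical overlap with the current context:

```python
def select_phrase(context, supporting_docs, n=3):
    """Pick the n-gram from the supporting documents that best overlaps
    (lexically) with the current context -- a crude retrieval step."""
    ctx = set(context.lower().split())
    best, best_score = None, -1.0
    for doc in supporting_docs:
        tokens = doc.split()
        for i in range(len(tokens) - n + 1):
            phrase = tokens[i:i + n]
            score = len(ctx & {t.lower() for t in phrase}) / n
            if score > best_score:
                best, best_score = " ".join(phrase), score
    return best

docs = ["retrieval augmented generation reduces hallucination",
        "phrase level retrieval can replace token level sampling"]
print(select_phrase("how retrieval helps generation", docs))
```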
- TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering [118.30923824681642]
TextDiffuser-2 aims to unleash the power of language models for text rendering.
We utilize the language model within the diffusion model to encode positions and texts at the line level.
We conduct extensive experiments and incorporate user studies involving human participants as well as GPT-4V.
arXiv Detail & Related papers (2023-11-28T04:02:40Z)
- GujiBERT and GujiGPT: Construction of Intelligent Information Processing Foundation Language Models for Ancient Texts [11.289265479095956]
GujiBERT and GujiGPT language models are foundational models specifically designed for intelligent information processing of ancient texts.
These models have been trained on an extensive dataset that encompasses both simplified and traditional Chinese characters.
These models have exhibited exceptional performance across a range of validation tasks using publicly available datasets.
arXiv Detail & Related papers (2023-07-11T15:44:01Z)
- Controllable Text Generation for Open-Domain Creativity and Fairness [36.744208990024575]
We present our recent work on controllable text generation to enhance the creativity and fairness of language generation models.
We explore hierarchical generation and constrained decoding, with applications to creative language generation including stories, poetry, and figurative language (a generic constrained-decoding sketch follows this entry).
arXiv Detail & Related papers (2022-09-24T22:40:01Z)
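Constrained decoding in general can be illustrated with Hugging Face's constrained beam search, which forces given words to appear somewhere in the output. This is a generic example of the technique, not the authors' own system:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Once upon a time", return_tensors="pt")
# Token ids of words that must appear in the generated continuation.
force_words_ids = [tokenizer(" moon", add_special_tokens=False).input_ids,
                   tokenizer(" river", add_special_tokens=False).input_ids]

outputs = model.generate(
    **inputs,
    force_words_ids=force_words_ids,  # constrained beam search enforces these
    num_beams=4,                      # constrained decoding requires beam search
    max_new_tokens=40,
    no_repeat_ngram_size=2,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```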
- RSTGen: Imbuing Fine-Grained Interpretable Control into Long-Form Text Generators [26.27412809287025]
RSTGen is a framework that controls the discourse structure, semantics and topics of generated text.
We demonstrate our model's ability to control structural discourse and semantic features of generated text in open generation evaluation.
arXiv Detail & Related papers (2022-05-25T09:06:04Z)
- How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN [63.79300884115027]
Current language models can generate high-quality text.
Are they simply copying text they have seen before, or have they learned generalizable linguistic abstractions?
We introduce RAVEN, a suite of analyses for assessing the novelty of generated text.
arXiv Detail & Related papers (2021-11-18T04:07:09Z)
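A simplified version of the kind of novelty analysis RAVEN performs: measure the fraction of generated n-grams that never occur in the training corpus. RAVEN itself covers many more linguistic levels; this sketch shows only the basic n-gram case:

```python
def ngram_novelty(generated, training_corpus, n=4):
    """Fraction of n-grams in the generated text absent from the training corpus."""
    def ngrams(tokens, n):
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
    train = set()
    for doc in training_corpus:
        train |= ngrams(doc.lower().split(), n)
    gen = ngrams(generated.lower().split(), n)
    if not gen:
        return 0.0
    return len(gen - train) / len(gen)

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
print(ngram_novelty("the cat sat on the rug today", corpus, n=4))  # 0.25
```

A high score suggests the model composed new n-grams rather than copying training text verbatim.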
- Attribute Alignment: Controlling Text Generation from Pre-trained Language Models [46.19190007510232]
We propose a simple and flexible method for controlling text generation by aligning disentangled attribute representations.
In contrast to recent efforts that train a discriminator to perturb the token-level distribution for an attribute, we use the same data to learn an alignment function that guides the pre-trained, non-controlled language model to generate text with the target attribute, without changing the original language model parameters (a toy illustration of this guidance idea follows this entry).
arXiv Detail & Related papers (2021-03-20T01:51:32Z)
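A numerical toy of that guidance idea: bias a frozen model's next-token logits toward tokens whose embeddings align with an attribute representation, leaving the model's own parameters untouched. The alignment function here is a random stand-in; the paper learns it from attribute-labeled data:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 8, 4
token_emb = rng.normal(size=(vocab_size, dim))   # frozen LM's token embeddings
base_logits = rng.normal(size=vocab_size)        # frozen LM's next-token logits

# Stand-in for a learned attribute representation in the LM's embedding space.
attribute_vec = rng.normal(size=dim)

strength = 2.0  # how strongly the attribute steers decoding
guided_logits = base_logits + strength * token_emb @ attribute_vec
probs = np.exp(guided_logits - guided_logits.max())
probs /= probs.sum()
print("shifted next-token distribution:", probs.round(3))
```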
- Incorporating Stylistic Lexical Preferences in Generative Language Models [10.62343151429147]
We present an approach to induce certain target-author attributes by incorporating continuous multi-dimensional lexical preferences of an author into generative language models.
Our experiments demonstrate that the proposed approach can generate text that distinctively aligns with a given target author's lexical style.
arXiv Detail & Related papers (2020-10-22T09:24:05Z)
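A minimal stand-in for extracting an author's continuous lexical preferences, here just relative word frequencies over a fixed vocabulary; the paper's multi-dimensional preference model is richer than this:

```python
from collections import Counter

def lexical_preference_vector(texts, vocabulary):
    """Continuous lexical-preference features for an author: the relative
    frequency of each vocabulary item across the author's texts."""
    counts = Counter(w.lower() for t in texts for w in t.split())
    total = sum(counts.values()) or 1
    return [counts[w] / total for w in vocabulary]

vocab = ["luminous", "said", "whispered", "very"]
author_texts = ["The luminous sky whispered softly",
                "She whispered again, luminous and calm"]
print(dict(zip(vocab, lexical_preference_vector(author_texts, vocab))))
```

Such a vector can then condition a generative model so its outputs favor the author's characteristic word choices.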
- Exemplar-Controllable Paraphrasing and Translation using Bitext [57.92051459102902]
We adapt models from prior work to learn solely from bilingual text (bitext).
Our single proposed model can perform four tasks: controlled paraphrase generation in both languages and controlled machine translation in both language directions.
arXiv Detail & Related papers (2020-10-12T17:02:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.