Dr Genre: Reinforcement Learning from Decoupled LLM Feedback for Generic Text Rewriting
- URL: http://arxiv.org/abs/2503.06781v1
- Date: Sun, 09 Mar 2025 21:23:52 GMT
- Title: Dr Genre: Reinforcement Learning from Decoupled LLM Feedback for Generic Text Rewriting
- Authors: Yufei Li, John Nham, Ganesh Jawahar, Lei Shu, David Uthus, Yun-Hsuan Sung, Chengrun Yang, Itai Rolnick, Yi Qiao, Cong Liu,
- Abstract summary: We introduce a generic model proficient in factuality, stylistic, and conversational rewriting tasks. To simulate real-world user rewrite requests, we construct a conversational rewrite dataset, ChatRewrite, that presents "natural"-sounding instructions. To align with task-specific objectives, we propose Dr Genre, a Decoupled-reward learning framework for Generic rewriting.
- Score: 15.796381427671681
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Generic text rewriting is a prevalent large language model (LLM) application that covers diverse real-world tasks, such as style transfer, fact correction, and email editing. These tasks vary in rewriting objectives (e.g., factual consistency vs. semantic preservation), making it challenging to develop a unified model that excels across all dimensions. Existing methods often specialize in either a single task or a specific objective, limiting their generalizability. In this work, we introduce a generic model proficient in factuality, stylistic, and conversational rewriting tasks. To simulate real-world user rewrite requests, we construct a conversational rewrite dataset, ChatRewrite, that presents "natural"-sounding instructions, from raw emails using LLMs. Combined with other popular rewrite datasets, including LongFact for the factuality rewrite task and RewriteLM for the stylistic rewrite task, this forms a broad benchmark for training and evaluating generic rewrite models. To align with task-specific objectives, we propose Dr Genre, a Decoupled-reward learning framework for Generic rewriting, that utilizes objective-oriented reward models with a task-specific weighting. Evaluation shows that Dr Genre delivers higher-quality rewrites across all targeted tasks, improving objectives including instruction following (agreement), internal consistency (coherence), and minimal unnecessary edits (conciseness).
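The abstract describes combining decoupled, objective-oriented reward models via task-specific weights. The sketch below is an illustrative assumption of how such an aggregation could look, not the paper's actual implementation; the objective names mirror the abstract (agreement, coherence, conciseness), while the weight values and function names are hypothetical.

```python
# Hypothetical sketch of decoupled-reward aggregation with task-specific
# weighting, in the spirit of Dr Genre's abstract. Weights are invented
# for illustration; the paper's actual values and reward models may differ.

TASK_WEIGHTS = {
    "factuality": {"agreement": 0.5, "coherence": 0.3, "conciseness": 0.2},
    "stylistic": {"agreement": 0.4, "coherence": 0.4, "conciseness": 0.2},
    "conversational": {"agreement": 0.4, "coherence": 0.3, "conciseness": 0.3},
}

def aggregate_reward(objective_scores: dict[str, float], task: str) -> float:
    """Combine per-objective reward-model scores using the task's weight profile."""
    weights = TASK_WEIGHTS[task]
    return sum(weights[obj] * objective_scores[obj] for obj in weights)

# Example: a stylistic rewrite scored by three separate reward models.
scores = {"agreement": 0.9, "coherence": 0.8, "conciseness": 0.6}
print(aggregate_reward(scores, "stylistic"))  # weighted sum, approximately 0.8
```

Decoupling the rewards this way would let each objective be modeled and evaluated independently, with only the weighting changing per task.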
Related papers
- RaFe: Ranking Feedback Improves Query Rewriting for RAG [83.24385658573198]
We propose a framework for training query rewriting models free of annotations.
By leveraging a publicly available reranker, our framework provides feedback well aligned with the rewriting objectives.
arXiv Detail & Related papers (2024-05-23T11:00:19Z) - Eliciting Human Preferences with Language Models [56.68637202313052]
Language models (LMs) can be directed to perform target tasks by using labeled examples or natural language prompts.
We propose to use *LMs themselves* to guide the task specification process.
We study GATE in three domains: email validation, content recommendation, and moral reasoning.
arXiv Detail & Related papers (2023-10-17T21:11:21Z) - Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting [42.35788605017555]
We propose utilizing large language models (LLMs) as query rewriters.
We define four essential properties for well-formed rewrites and incorporate all of them into the instruction.
We introduce the role of rewrite editors for LLMs when initial query rewrites are available, forming a "rewrite-then-edit" process.
arXiv Detail & Related papers (2023-10-15T03:04:17Z) - Interactive Editing for Text Summarization [30.46273082913698]
REVISE is a framework designed to facilitate iterative editing and refinement of draft summaries by human writers.
At its core, REVISE incorporates a modified fill-in-the-middle model with the encoder-decoder architecture.
arXiv Detail & Related papers (2023-06-05T17:43:53Z) - RewriteLM: An Instruction-Tuned Large Language Model for Text Rewriting [11.306772273707253]
Large Language Models (LLMs) have demonstrated impressive capabilities in creative tasks such as storytelling and E-mail generation.
We develop new strategies for instruction tuning and reinforcement learning to better align LLMs for cross-sentence rewriting tasks.
We introduce OpenRewriteEval, a novel benchmark that covers a wide variety of rewriting types expressed through natural language instructions.
arXiv Detail & Related papers (2023-05-25T03:26:26Z) - Improving Cross-task Generalization of Unified Table-to-text Models with Compositional Task Configurations [63.04466647849211]
Methods typically encode task information with a simple dataset name as a prefix to the encoder.
We propose compositional task configurations, a set of prompts prepended to the encoder to improve cross-task generalization.
We show this not only allows the model to better learn shared knowledge across different tasks at training, but also allows us to control the model by composing new configurations.
arXiv Detail & Related papers (2022-12-17T02:20:14Z) - PEER: A Collaborative Language Model [70.11876901409906]
We introduce PEER, a collaborative language model that imitates the entire writing process itself.
PEER can write drafts, add suggestions, propose edits and provide explanations for its actions.
We show that PEER achieves strong performance across various domains and editing tasks.
arXiv Detail & Related papers (2022-08-24T16:56:47Z) - Letter-level Online Writer Identification [86.13203975836556]
We focus on a novel problem, letter-level online writer identification, which requires only a few trajectories of written letters as identification cues.
A main challenge is that a person often writes a letter in different styles from time to time.
We refer to this problem as the variance of online writing styles (Var-O-Styles).
arXiv Detail & Related papers (2021-12-06T07:21:53Z) - Substance over Style: Document-Level Targeted Content Transfer [42.18770674148932]
We introduce the task of document-level targeted content transfer and address it in the recipe domain.
We propose a novel model for this task based on the generative pre-trained language model GPT-2.
Both automatic and human evaluations show that our model outperforms existing methods.
arXiv Detail & Related papers (2020-10-16T20:26:10Z) - Pre-training via Paraphrasing [96.79972492585112]
We introduce MARGE, a pre-trained sequence-to-sequence model learned with an unsupervised multi-lingual paraphrasing objective.
We show it is possible to jointly learn to do retrieval and reconstruction, given only a random initialization.
For example, with no additional task-specific training we achieve BLEU scores of up to 35.8 for document translation.
arXiv Detail & Related papers (2020-06-26T14:43:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.