Prompt-Based Length Controlled Generation with Reinforcement Learning
- URL: http://arxiv.org/abs/2308.12030v2
- Date: Sat, 30 Sep 2023 07:54:22 GMT
- Title: Prompt-Based Length Controlled Generation with Reinforcement Learning
- Authors: Renlong Jie, Xiaojun Meng, Lifeng Shang, Xin Jiang, Qun Liu
- Abstract summary: We propose a prompt-based length control method to achieve high-accuracy length controlled generation.
We adopt reinforcement learning with the reward signal given by either trainable or rule-based reward models.
Our method significantly improves the accuracy of prompt-based length control for the summarization task on popular datasets like CNNDM and NYT.
- Score: 48.49553921757085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) like ChatGPT and GPT-4 have attracted great
attention given their surprising performance on a wide range of NLP tasks.
Length controlled generation of LLMs emerges as an important topic, which
enables users to fully leverage the capability of LLMs in more real-world
scenarios like generating a proper answer or essay of a desired length. In
addition, the autoregressive generation in LLMs is extremely time-consuming,
while the ability to control the generated length can reduce the inference
cost by limiting the length. Therefore, we propose a prompt-based length
control method to achieve high-accuracy length controlled generation. In
particular, we adopt reinforcement learning with the reward signal given by
either trainable or rule-based reward models, which further enhances the
length-control ability of LLMs by rewarding outputs that follow pre-defined
control instructions. To enable rule-based inference, we also introduce a standard
prompt extractor to collect the standard control information from users' input.
Experiments show that our method significantly improves the accuracy of
prompt-based length control for the summarization task on popular datasets like
CNNDM and NYT. Both the standard prompt extractor and the RL-tuned model
show strong generalization ability to unseen control prompt templates.
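The abstract describes a rule-based reward computed from control information extracted from the user's prompt. As a rough illustration only, the sketch below pairs a regex-based extractor with a reward equal to the negative gap between requested and actual length. Everything here is an assumption, not the paper's method: the names `extract_target_length` and `length_reward`, the regex pattern, the word-level length measure, and the "equal to" control type are all hypothetical, and the regex merely stands in for whatever standard prompt extractor the authors use.

```python
import re
from typing import Optional

# Hypothetical pattern: matches phrases such as "in 50 words" or "30 tokens".
_LENGTH_PATTERN = re.compile(r"(\d+)\s*(?:words?|tokens?)", re.IGNORECASE)


def extract_target_length(prompt: str) -> Optional[int]:
    """Return the first requested length found in the prompt, or None."""
    match = _LENGTH_PATTERN.search(prompt)
    return int(match.group(1)) if match else None


def length_reward(prompt: str, output: str) -> float:
    """Negative absolute gap between requested and actual word count.

    Assumptions (not from the paper): length is counted in
    whitespace-separated words, and the instruction means "exactly N".
    """
    target = extract_target_length(prompt)
    if target is None:
        return 0.0  # no length instruction found, so no length signal
    actual = len(output.split())
    return float(-abs(actual - target))
```

In an RL fine-tuning loop, a score like this would be attached to each sampled output so that generations closer to the requested length receive a higher reward. A trainable reward model, which the abstract also mentions, would replace the hand-written rule with a learned scorer.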
Related papers
- Adaptable Logical Control for Large Language Models [68.27725600175013]
Ctrl-G is an adaptable framework that facilitates tractable and flexible control of model generation at inference time.
We show that Ctrl-G, when applied to a TULU2-7B model, outperforms GPT3.5 and GPT4 on the task of interactive text editing.
arXiv Detail & Related papers (2024-06-19T23:47:59Z)
- InstructCMP: Length Control in Sentence Compression through Instruction-based Large Language Models [27.26285945442178]
InstructCMP is an approach to the sentence compression task that can consider the length constraint through instructions.
We show that applying the length priming significantly improves performances of InstructCMP in both zero-shot and fine-tuning settings.
arXiv Detail & Related papers (2024-06-16T23:00:47Z)
- Prompt-Based Length Controlled Generation with Multiple Control Types [45.202705040391734]
We propose a prompt-based method to achieve length controlled generation under different control types with high accuracy.
In particular, we adopt reinforcement learning (RL) and sample filtering with the reward signal given by rule-based reward models.
Experiments show that our method significantly improves the accuracy of prompt-based length control on popular summarization datasets like CNNDM and NYT.
arXiv Detail & Related papers (2024-06-12T01:49:54Z)
- One Token Can Help! Learning Scalable and Pluggable Virtual Tokens for Retrieval-Augmented Large Language Models [67.49462724595445]
Retrieval-augmented generation (RAG) is a promising way to improve large language models (LLMs).
We propose a novel method that involves learning scalable and pluggable virtual tokens for RAG.
arXiv Detail & Related papers (2024-05-30T03:44:54Z)
- Prompt Highlighter: Interactive Control for Multi-Modal LLMs [50.830448437285355]
This study targets a critical aspect of multi-modal LLMs' (LLMs&VLMs) inference: explicit controllable text generation.
We introduce a novel inference method, Prompt Highlighter, which enables users to highlight specific prompt spans to interactively control the focus during generation.
We find that, during inference, guiding the models with highlighted tokens through the attention weights leads to more desired outputs.
arXiv Detail & Related papers (2023-12-07T13:53:29Z)
- Integrating Summarization and Retrieval for Enhanced Personalization via Large Language Models [11.950478880423733]
Personalization is an essential factor in user experience with natural language processing (NLP) systems.
With the emergence of Large Language Models (LLMs), a key question is how to leverage these models to better personalize user experiences.
We propose a novel summary-augmented personalization with task-aware user summaries generated by LLMs.
arXiv Detail & Related papers (2023-10-30T23:40:41Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)