Prompt-Based Length Controlled Generation with Reinforcement Learning
- URL: http://arxiv.org/abs/2308.12030v2
- Date: Sat, 30 Sep 2023 07:54:22 GMT
- Title: Prompt-Based Length Controlled Generation with Reinforcement Learning
- Authors: Renlong Jie, Xiaojun Meng, Lifeng Shang, Xin Jiang, Qun Liu
- Abstract summary: We propose a prompt-based length control method to achieve high-accuracy length-controlled generation.
We adopt reinforcement learning with a reward signal given by either trainable or rule-based reward models.
Our method significantly improves the accuracy of prompt-based length control for summarization tasks on popular datasets like CNNDM and NYT.
- Score: 48.49553921757085
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) like ChatGPT and GPT-4 have attracted great
attention given their surprising performance on a wide range of NLP tasks.
Length-controlled generation with LLMs has emerged as an important topic,
enabling users to fully leverage the capability of LLMs in more real-world
scenarios, such as generating an answer or essay of a desired length. In
addition, autoregressive generation in LLMs is extremely time-consuming,
and the ability to control the generated length can reduce inference cost
by limiting output length. Therefore, we propose a prompt-based length
control method to achieve high-accuracy length-controlled generation. In
particular, we adopt reinforcement learning with a reward signal given by
either trainable or rule-based reward models, which further enhances the
length-control ability of LLMs by rewarding outputs that follow a pre-defined
control instruction. To enable rule-based inference, we also introduce a
standard prompt extractor that collects the standard control information from
users' input. Experiments show that our method significantly improves the
accuracy of prompt-based length control for summarization tasks on popular
datasets like CNNDM and NYT. Both the standard prompt extractor and the
RL-tuned model show strong generalization to unseen control prompt templates.
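
The two rule-based components named above, the standard prompt extractor and the rule-based reward model, are simple enough to sketch. Below is a minimal Python illustration assuming prompt templates such as "at most N words"; the names `extract_length_constraint` and `length_reward`, the regex patterns, and the linear-decay reward shape are assumptions for illustration, not the authors' implementation.

```python
import re

# Assumed templates standing in for the paper's "standard prompt extractor";
# each maps a user phrasing to a regex and a relation. Illustrative only.
_PATTERNS = [
    (r"at most (\d+) words", "max"),
    (r"at least (\d+) words", "min"),
    (r"(?:exactly |in |of )(\d+) words", "equal"),
]

def extract_length_constraint(prompt: str):
    """Return (relation, target_word_count) parsed from the prompt, or None."""
    for pattern, relation in _PATTERNS:
        match = re.search(pattern, prompt.lower())
        if match:
            return relation, int(match.group(1))
    return None

def length_reward(output: str, relation: str, target: int) -> float:
    """Rule-based reward: 1.0 when the length constraint is met, decaying
    linearly with the size of the violation (normalized by the target)."""
    n = len(output.split())
    if relation == "max":
        violation = max(0, n - target)
    elif relation == "min":
        violation = max(0, target - n)
    else:  # "equal"
        violation = abs(n - target)
    return 1.0 - min(1.0, violation / max(target, 1))

# Example usage:
constraint = extract_length_constraint("Summarize the article in at most 50 words.")
if constraint is not None:
    relation, target = constraint
    print(length_reward("word " * 48, relation, target))  # 1.0: constraint met
    print(length_reward("word " * 60, relation, target))  # 0.8: 10-word overshoot
```

In the paper's setting, a reward of this kind supplies the signal for RL fine-tuning of the LLM; a trainable reward model plays the same role when fixed patterns cannot cover the variety of user phrasings.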
Related papers
- Zero-Shot Strategies for Length-Controllable Summarization [56.15356055672189]
Large language models (LLMs) struggle with precise length control, particularly in zero-shot settings.
We conduct a comprehensive study evaluating LLMs' length control capabilities across multiple measures and propose practical methods to improve controllability.
Our experiments with LLaMA 3 reveal stark differences in length adherence across measures and highlight inherent biases of the model.
arXiv Detail & Related papers (2024-12-31T02:53:27Z)
- Length Controlled Generation for Black-box LLMs [70.57649832433451]
Large language models (LLMs) have demonstrated impressive instruction following capabilities, but struggle to accurately manage the length of generated text.
We propose a novel iterative sampling framework for text length control, integrating the Metropolis-Hastings algorithm with an importance sampling acceleration strategy (a minimal sketch of the acceptance step appears after this list).
Our framework achieves nearly 100% success rates for length control on Llama3.1 on tasks such as length-controlled abstractive summarization.
arXiv Detail & Related papers (2024-12-19T09:07:38Z)
- Precise Length Control in Large Language Models [1.3654846342364308]
Large Language Models (LLMs) are increasingly used in production systems.
We propose a method to adapt pre-trained decoder-only LLMs for precise control of response length.
arXiv Detail & Related papers (2024-12-16T16:22:27Z)
- Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models [14.175953642749649]
Large language models often struggle to generate responses of a specific length.
We introduce a novel, model-agnostic approach called Ruler to enhance the instruction-following ability of large language models under length-constrained instructions.
arXiv Detail & Related papers (2024-09-27T17:44:58Z)
- Prompt-Based Length Controlled Generation with Multiple Control Types [45.202705040391734]
We propose a prompt-based method to achieve length controlled generation under different control types with high accuracy.
In particular, we adopt reinforcement learning (RL) and sample filtering with the reward signal given by rule-based reward models.
Experiments show that our method significantly improves the accuracy of prompt-based length control on popular summarization datasets like CNNDM and NYT.
arXiv Detail & Related papers (2024-06-12T01:49:54Z)
- Prompt Highlighter: Interactive Control for Multi-Modal LLMs [50.830448437285355]
This study targets a critical aspect of inference in multi-modal LLMs (LLMs & VLMs): explicit controllable text generation.
We introduce a novel inference method, Prompt Highlighter, which enables users to highlight specific prompt spans to interactively control the focus during generation.
We find that, during inference, guiding the models with highlighted tokens through the attention weights leads to more desired outputs.
arXiv Detail & Related papers (2023-12-07T13:53:29Z)
- Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs.
Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)
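
Among the related papers above, the Metropolis-Hastings framework for black-box LLMs is the most self-contained to sketch. The toy Python below treats repeated calls to a black-box `sample_response` function as the proposal distribution and accepts candidates by their length score; the Gaussian score, the step count, and the `sample_response` stub are assumptions for illustration, not that paper's actual method (which also adds an importance sampling acceleration).

```python
import math
import random

def length_score(text: str, target: int, sigma: float = 5.0) -> float:
    """Unnormalized Gaussian score peaking at the target word count
    (an assumed shaping for illustration)."""
    n = len(text.split())
    return math.exp(-((n - target) ** 2) / (2 * sigma ** 2))

def mh_length_control(sample_response, prompt: str, target: int, steps: int = 50) -> str:
    """Toy independence Metropolis-Hastings over black-box LLM samples.

    Proposals are drawn i.i.d. from the LLM at every step; targeting the LLM
    distribution reweighted by length_score, the proposal terms cancel and the
    acceptance probability reduces to the ratio of length scores.
    """
    current = sample_response(prompt)
    for _ in range(steps):
        candidate = sample_response(prompt)
        ratio = length_score(candidate, target) / max(length_score(current, target), 1e-12)
        if random.random() < min(1.0, ratio):
            current = candidate
    return current

# Usage with a stub in place of a real LLM call:
def sample_response(prompt: str) -> str:
    return "word " * random.randint(30, 90)  # random-length dummy response

result = mh_length_control(sample_response, "Summarize the article.", target=50)
print(len(result.split()))  # word counts concentrate near 50 over the chain
```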