Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models
- URL: http://arxiv.org/abs/2409.18943v2
- Date: Tue, 1 Oct 2024 09:20:58 GMT
- Title: Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models
- Authors: Jiaming Li, Lei Zhang, Yunshui Li, Ziqiang Liu, Yuelin Bai, Run Luo, Longze Chen, Min Yang
- Abstract summary: Large language models often struggle to generate responses of a specific length.
We introduce a novel, model-agnostic approach called Ruler to enhance the instruction-following ability of large language models under length-constrained instructions.
- Score: 14.175953642749649
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The instruction-following ability of large language models enables humans to interact with AI agents in a natural way. However, when required to generate responses of a specific length, large language models often struggle to meet users' needs due to their inherent difficulty in accurately perceiving numerical constraints. To explore the ability of large language models to control the length of generated responses, we propose the Target Length Generation Task (TLG) and design two metrics, Precise Match (PM) and Flexible Match (FM), to evaluate the model's performance in adhering to specified response lengths. Furthermore, we introduce a novel, model-agnostic approach called Ruler, which employs Meta Length Tokens (MLTs) to enhance the instruction-following ability of large language models under length-constrained instructions. Specifically, Ruler equips LLMs with the ability to generate responses of a specified length based on length constraints within the instructions. Moreover, Ruler can automatically generate an appropriate MLT when length constraints are not explicitly provided, demonstrating excellent versatility and generalization. Comprehensive experiments show the effectiveness of Ruler across different LLMs on the Target Length Generation Task, e.g., an average gain of 27.97 on PM and 29.57 on FM at the All level. In addition, we conduct extensive ablation experiments to further substantiate the efficacy and generalization of Ruler. Our code and data are available at https://github.com/Geaming2002/Ruler.
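As a rough illustration of how the PM and FM metrics could be scored, here is a minimal sketch in Python. The level-to-range mapping and the 10% FM slack are assumptions made for illustration; the exact level ranges and matching rules are defined in the paper.

```python
# Hedged sketch of Precise Match (PM) and Flexible Match (FM) scoring.
# The level-to-range mapping and FM slack below are illustrative assumptions,
# not the paper's exact definitions.

TARGET_RANGES = {
    "Level:0": (0, 10),     # assumed word-count ranges per target level
    "Level:1": (10, 80),
    "Level:2": (80, 200),
}

def word_count(text: str) -> int:
    return len(text.split())

def precise_match(response: str, level: str) -> bool:
    # PM: the response length must fall inside the target level's exact range.
    lo, hi = TARGET_RANGES[level]
    return lo <= word_count(response) <= hi

def flexible_match(response: str, level: str, slack: float = 0.1) -> bool:
    # FM: the same check, but the range boundaries are relaxed by a tolerance.
    lo, hi = TARGET_RANGES[level]
    n = word_count(response)
    return lo * (1 - slack) <= n <= hi * (1 + slack)
```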
Related papers
- Disentangling Length Bias In Preference Learning Via Response-Conditioned Modeling [87.17041933863041]
We introduce a Response-conditioned Bradley-Terry (Rc-BT) model that enhances the reward model's ability to mitigate length bias and follow length instructions.
We also propose the Rc-DPO algorithm to leverage the Rc-BT model for direct policy optimization (DPO) of large language models.
arXiv Detail & Related papers (2025-02-02T14:50:25Z)
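A minimal sketch of a response-conditioned Bradley-Terry objective as described above, assuming a hypothetical `reward_model` callable that scores a (length-conditioned prompt, response) pair; the Rc-BT paper's exact formulation may differ.

```python
import torch.nn.functional as F

def rc_bt_loss(reward_model, prompt_with_length, chosen, rejected):
    """Illustrative response-conditioned Bradley-Terry loss.

    The (chosen, rejected) pair is assumed to differ in whether the
    response satisfies the length instruction in the prompt, so the
    reward model is pushed to credit compliance, not raw length.
    `reward_model` is a hypothetical callable returning reward tensors.
    """
    r_chosen = reward_model(prompt_with_length, chosen)
    r_rejected = reward_model(prompt_with_length, rejected)
    return -F.logsigmoid(r_chosen - r_rejected).mean()
```
- Length Controlled Generation for Black-box LLMs [70.57649832433451]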
Large language models (LLMs) have demonstrated impressive instruction following capabilities, but struggle to accurately manage the length of generated text.
We propose a novel iterative sampling framework for text length control, integrating the Metropolis-Hastings algorithm with an importance sampling acceleration strategy.
Our framework achieves almost 100% success rates of length control on Llama3.1 for tasks such as length-controlled abstractive summarization.
arXiv Detail & Related papers (2024-12-19T09:07:38Z)
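A simplified sketch of the Metropolis-Hastings-style resampling loop described above, assuming a hypothetical black-box `generate` function; the paper's actual framework also corrects for the proposal distribution and adds an importance sampling acceleration strategy.

```python
import math
import random

def target_density(length: int, target: int, tau: float = 10.0) -> float:
    # Unnormalized density that peaks when the length hits the target.
    return math.exp(-abs(length - target) / tau)

def mh_length_control(generate, prompt, target_len, steps=20):
    """Accept/reject fresh samples so the kept response drifts toward target_len."""
    current = generate(prompt)
    for _ in range(steps):
        proposal = generate(prompt)  # independent resampling as the proposal
        ratio = (target_density(len(proposal.split()), target_len)
                 / target_density(len(current.split()), target_len))
        if random.random() < min(1.0, ratio):  # simplified acceptance rule
            current = proposal
        if len(current.split()) == target_len:
            break
    return current
```
- Precise Length Control in Large Language Models [1.3654846342364308]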
Large Language Models (LLMs) are increasingly used in production systems.
We propose a method to adapt pre-trained decoder-only LLMs for precise control of response length.
arXiv Detail & Related papers (2024-12-16T16:22:27Z)
- Language Models can Self-Lengthen to Generate Long Texts [74.96074422345806]
This paper introduces an innovative iterative training framework called Self-Lengthen.
It leverages only the intrinsic knowledge and skills of Large Language Models without the need for auxiliary data or proprietary models.
Experiments on benchmarks and human evaluations show that Self-Lengthen outperforms existing methods in long-text generation.
arXiv Detail & Related papers (2024-10-31T13:47:10Z)
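A highly simplified sketch of one Self-Lengthen-style round, assuming a hypothetical `generate` sampler for the model itself; the extension prompt is invented for illustration.

```python
def self_lengthen_round(generate, instructions):
    """One illustrative round: the model extends its own drafts, and the
    longer outputs become fine-tuning data for the next round."""
    training_pairs = []
    for instruction in instructions:
        draft = generate(instruction)
        # Ask the same model to extend its own draft (hypothetical prompt).
        longer = generate(
            f"{instruction}\n\nDraft answer:\n{draft}\n\n"
            "Rewrite the draft as a longer, more detailed response."
        )
        if len(longer.split()) > len(draft.split()):
            training_pairs.append((instruction, longer))
    return training_pairs  # fine-tune on these pairs, then repeat
```
- LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs [4.4965596747053]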
LongGenBench is a novel benchmark designed to rigorously evaluate large language models' ability to generate long text.
It evaluates model performance across four distinct scenarios, three instruction types, and two generation lengths (16K and 32K tokens).
Our evaluation of ten state-of-the-art LLMs reveals that, despite strong results on the Ruler long-context benchmark, all models struggled with long text generation on LongGenBench.
arXiv Detail & Related papers (2024-09-03T17:25:54Z)
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
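A rough sketch of the self-synthesis loop described above, with invented prompt templates and a stand-in quality filter; the paper's multi-stage pipeline is more involved.

```python
def self_guide_synthesis(student_generate, task_prompt, seed_examples, n=100):
    """The student LLM proposes new task inputs, answers them itself, and the
    surviving pairs are used to fine-tune the same student (illustrative)."""
    synthetic = []
    for _ in range(n):
        new_input = student_generate(
            f"{task_prompt}\nExamples:\n{seed_examples}\nPropose one new input:"
        )
        new_output = student_generate(
            f"{task_prompt}\nInput: {new_input}\nOutput:"
        )
        if new_output.strip():  # stand-in for real filtering/validation
            synthetic.append((new_input, new_output))
    return synthetic
```
- InstructCMP: Length Control in Sentence Compression through Instruction-based Large Language Models [27.26285945442178]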
InstructCMP is an approach to the sentence compression task that can account for the length constraint through instructions.
We show that applying length priming significantly improves the performance of InstructCMP in both zero-shot and fine-tuning settings.
arXiv Detail & Related papers (2024-06-16T23:00:47Z)
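A minimal sketch of length priming as described above: the instruction itself carries explicit source and target lengths. The template wording is assumed, not the paper's.

```python
def compression_prompt(sentence: str, target_words: int) -> str:
    """Builds a compression instruction with explicit length priming
    (illustrative template; the paper's exact wording differs)."""
    n_src = len(sentence.split())
    return (
        f"Compress the following {n_src}-word sentence into "
        f"{target_words} words, keeping its key information.\n"
        f"Sentence: {sentence}\nCompressed:"
    )
```
- Prompt-Based Length Controlled Generation with Reinforcement Learning [48.49553921757085]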
We propose a prompt-based length control method to achieve high-accuracy length controlled generation.
We adopt reinforcement learning with the reward signal given by either trainable or rule-based reward models.
Our method significantly improves the accuracy of prompt-based length control for the summarization task on popular datasets such as CNNDM and NYT.
arXiv Detail & Related papers (2023-08-23T09:43:10Z)
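A sketch of what a rule-based length reward for the RL setup above might look like; the tolerance and linear decay are assumptions, not the paper's reward design.

```python
def length_reward(response: str, target_len: int, tolerance: int = 5) -> float:
    """Rule-based reward: full credit near the target length, linear decay
    with deviation (illustrative; the paper's reward models may differ)."""
    deviation = abs(len(response.split()) - target_len)
    if deviation <= tolerance:
        return 1.0
    return max(0.0, 1.0 - deviation / max(target_len, 1))
```
- Augmented Language Models: a Survey [55.965967655575454]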
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools.
We refer to them as Augmented Language Models (ALMs).
The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks.
arXiv Detail & Related papers (2023-02-15T18:25:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.