Related papers: Precise Length Control in Large Language Models

URL: http://arxiv.org/abs/2412.11937v1
Date: Mon, 16 Dec 2024 16:22:27 GMT
Title: Precise Length Control in Large Language Models
Authors: Bradley Butcher, Michael O'Keefe, James Titchener
Abstract summary: Large Language Models (LLMs) are increasingly used in production systems. We propose a method to adapt pre-trained decoder-only LLMs for precise control of response length.
Score: 1.3654846342364308
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large Language Models (LLMs) are increasingly used in production systems, powering applications such as chatbots, summarization, and question answering. Despite their success, controlling the length of their response remains a significant challenge, particularly for tasks requiring structured outputs or specific levels of detail. In this work, we propose a method to adapt pre-trained decoder-only LLMs for precise control of response length. Our approach incorporates a secondary length-difference positional encoding (LDPE) into the input embeddings, which counts down to a user-set response termination length. Fine-tuning with LDPE allows the model to learn to terminate responses coherently at the desired length, achieving mean token errors of less than 3 tokens. We also introduce Max New Tokens++, an extension that enables flexible upper-bound length control, rather than an exact target. Experimental results on tasks such as question answering and document summarization demonstrate that our method enables precise length control without compromising response quality.
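
The countdown idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sinusoidal form of the encoding, the choice to count down from the target length, and the function names `sinusoidal`, `remaining_tokens`, and `add_length_countdown` are all assumptions made for illustration. The abstract only states that a secondary encoding counting down to the target length is incorporated into the input embeddings.

```python
import math


def sinusoidal(value, dim):
    """Encode a scalar as `dim` sinusoidal features (dim must be even).

    Assumption: a standard sinusoidal encoding; the paper's exact
    encoding is not given in the abstract.
    """
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(value * freq))
        enc.append(math.cos(value * freq))
    return enc


def remaining_tokens(target_len, num_positions):
    """Countdown values: target_len, target_len - 1, ... per response position."""
    return [target_len - pos for pos in range(num_positions)]


def add_length_countdown(response_embeddings, target_len):
    """Add an encoding of the remaining token budget to each embedding.

    `response_embeddings` is a list of equal-length float lists, one per
    response position (hypothetical stand-in for real model embeddings).
    """
    dim = len(response_embeddings[0])
    countdown = remaining_tokens(target_len, len(response_embeddings))
    out = []
    for emb, remaining in zip(response_embeddings, countdown):
        code = sinusoidal(remaining, dim)
        out.append([e + c for e, c in zip(emb, code)])
    return out
```

In this sketch the model would see, at every response position, how many tokens remain before the requested length, which is the signal that fine-tuning could learn to use for terminating on time.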

Related papers

Zero-Shot Strategies for Length-Controllable Summarization [56.15356055672189]
Large language models (LLMs) struggle with precise length control, particularly in zero-shot settings. We conduct a comprehensive study evaluating LLMs' length control capabilities across multiple measures and propose practical methods to improve controllability. Our experiments with LLaMA 3 reveal stark differences in length adherence across measures and highlight inherent biases of the model.
arXiv Detail & Related papers (2024-12-31T02:53:27Z)
Length Controlled Generation for Black-box LLMs [70.57649832433451]
Large language models (LLMs) have demonstrated impressive instruction following capabilities, but struggle to accurately manage the length of generated text. We propose a novel iterative sampling framework for text length control, integrating the Metropolis-Hastings algorithm with an importance sampling acceleration strategy. Our framework achieves almost 100% success rates of length control on Llama3.1 for tasks such as length-controlled abstractive summarization.
arXiv Detail & Related papers (2024-12-19T09:07:38Z)
PositionID: LLMs can Control Lengths, Copy and Paste with Explicit Positional Awareness [41.87219806677628]
Large Language Models (LLMs) demonstrate impressive capabilities across various domains. Despite these advancements, LLMs still encounter challenges with length control. We propose novel approaches to address this issue, including PositionID Prompting and PositionID Fine-Tuning.
arXiv Detail & Related papers (2024-10-09T16:15:36Z)
Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models [14.175953642749649]
Large language models often struggle to generate responses of a specific length. We introduce a novel, model-agnostic approach called Ruler to enhance the instruction-following ability of large language models under length-constrained instructions.
arXiv Detail & Related papers (2024-09-27T17:44:58Z)
SirLLM: Streaming Infinite Retentive LLM [74.40196814292426]
Large Language Models (LLMs) process inputs of any length and maintain a degree of memory. Recent efforts have employed streaming inputs to alleviate the pressure of excessively long text inputs. We introduce Streaming Infinite Retentive LLM (SirLLM), which allows LLMs to maintain longer memory during infinite-length dialogues.
arXiv Detail & Related papers (2024-05-21T06:37:03Z)
LongHeads: Multi-Head Attention is Secretly a Long Context Processor [49.1661870007655]
LongHeads is a training-free framework that enhances large language models' long context ability. Instead of allowing each head to attend to the full sentence, we allow each head to process in-distribution length by selecting and attending to context chunks. LongHeads achieves 100% accuracy at the 128k length on the passkey retrieval task.
arXiv Detail & Related papers (2024-02-16T13:39:34Z)
LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models [83.98062659664785]
Large language models (LLMs) typically train on short text segments (e.g., 4K tokens) due to the quadratic complexity of their Transformer architectures. This work identifies three major factors contributing to this length generalization failure. We propose LM-Infinite, a simple and effective method for enhancing LLMs' capabilities of handling long contexts.
arXiv Detail & Related papers (2023-08-30T16:47:51Z)
Prompt-Based Length Controlled Generation with Reinforcement Learning [48.49553921757085]
We propose a prompt-based length control method to achieve high-accuracy length-controlled generation. We adopt reinforcement learning with the reward signal given by either trainable or rule-based reward models. Our method significantly improves the accuracy of prompt-based length control for the summarization task on popular datasets like CNNDM and NYT.
arXiv Detail & Related papers (2023-08-23T09:43:10Z)
PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents [78.27865456183397]
We propose PEARL, a prompting framework to improve reasoning over long documents. Each stage of PEARL is implemented via zero-shot or few-shot prompting with minimal human input. We evaluate PEARL on a challenging subset of the QuALITY dataset, which contains questions that require complex reasoning over long narrative texts.
arXiv Detail & Related papers (2023-05-23T23:06:04Z)
LenAtten: An Effective Length Controlling Unit For Text Summarization [5.554982420311913]
Fixed-length summarization aims at generating summaries with a preset number of words or characters. Most recent research incorporates length information with word embeddings as the input to the recurrent decoding unit. We present an effective length controlling unit, Length Attention (LenAtten), to break this trade-off.
arXiv Detail & Related papers (2021-06-01T08:45:41Z)

This list is automatically generated from the titles and abstracts of the papers on this site.