LenAtten: An Effective Length Controlling Unit For Text Summarization
- URL: http://arxiv.org/abs/2106.00316v1
- Date: Tue, 1 Jun 2021 08:45:41 GMT
- Title: LenAtten: An Effective Length Controlling Unit For Text Summarization
- Authors: Zhongyi Yu, Zhenghao Wu, Hao Zheng, Zhe XuanYuan, Jefferson Fong,
Weifeng Su
- Abstract summary: Fixed-length summarization aims at generating summaries with a preset number of words or characters.
Most recent studies incorporate length information with word embeddings as the input to the recurrent decoding unit, trading summary quality for length controllability.
We present an effective length-controlling unit, Length Attention (LenAtten), to break this trade-off.
- Score: 5.554982420311913
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fixed-length summarization aims at generating summaries with a preset number
of words or characters. Most recent studies incorporate length information
with word embeddings as the input to the recurrent decoding unit, causing a
compromise between length controllability and summary quality. In this work, we
present an effective length-controlling unit, Length Attention (LenAtten), to
break this trade-off. Experimental results show that LenAtten not only brings
improvements in length controllability and ROUGE scores but also has great
generalization ability. In the task of generating a summary with the target
length, our model is 732 times better than the best-performing length-controllable
summarizer in length controllability on the CNN/Daily Mail
dataset.
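The abstract does not spell out LenAtten's internals, but its stated contrast (attending over length information rather than mixing a length embedding into the decoder input) suggests a unit along the following lines. This is a minimal PyTorch sketch under that assumption; `LengthAttentionUnit`, its masking rule, and all parameter names are hypothetical illustrations, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LengthAttentionUnit(nn.Module):
    """Hypothetical length-attention unit: attend over embeddings of the
    remaining length budget instead of concatenating a single length
    embedding to the decoder input."""

    def __init__(self, hidden_size: int, max_len: int = 128):
        super().__init__()
        self.len_emb = nn.Embedding(max_len + 1, hidden_size)  # one key per remaining-length value
        self.query = nn.Linear(hidden_size, hidden_size)

    def forward(self, dec_state: torch.Tensor, remaining: torch.Tensor) -> torch.Tensor:
        # dec_state: (batch, hidden); remaining: (batch,) tokens still allowed
        keys = self.len_emb.weight                               # (max_len + 1, hidden)
        scores = self.query(dec_state) @ keys.T                  # (batch, max_len + 1)
        positions = torch.arange(keys.size(0), device=dec_state.device)
        # assumption: only length values within the remaining budget are attendable
        scores = scores.masked_fill(positions[None, :] > remaining[:, None], float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        return weights @ keys                                    # (batch, hidden) length context
```

In this sketch the length budget steers every decoding step through an attention context combined with the decoder state before the vocabulary projection, leaving the word-embedding input untouched; that separation is what the abstract credits for breaking the quality trade-off.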
Related papers
- LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation [79.90766312484489]
Long Context Pre-training with Restoration Distillation (LongReD)
LongReD distills the hidden states of selected layers from the original model on short texts.
Experiments on common text benchmarks demonstrate that LongReD effectively preserves the model's short-text performance.
arXiv Detail & Related papers (2025-02-11T08:37:16Z)
- A Decoding Algorithm for Length-Control Summarization Based on Directed Acyclic Transformers [32.53051395472311]
Length-control summarization aims to condense a long text into a short one that satisfies a given length limit.
Previous approaches often use autoregressive (AR) models and treat the length requirement as a soft constraint.
Our approach allows for multiple plausible sequence fragments and predicts a path to connect them; a minimal sketch of length-constrained path selection follows below.
arXiv Detail & Related papers (2025-02-06T22:12:55Z)
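The blurb above only says the model scores fragments and connects them with a path; the paper's exact decoding algorithm is not given here. A generic way to make such path selection length-exact is dynamic programming over (node, remaining length), sketched below with an assumed DAG encoding (fragment lengths and log-scores on edges); all names are hypothetical.

```python
from functools import lru_cache

def best_length_path(dag, src, dst, target_len):
    """Score of the highest-scoring src -> dst path whose fragment lengths
    sum exactly to target_len. dag: node -> list of (next_node, frag_len,
    log_score) edges."""
    @lru_cache(maxsize=None)
    def best(node, remaining):
        if node == dst:
            return 0.0 if remaining == 0 else float("-inf")
        candidates = [score + best(nxt, remaining - flen)
                      for nxt, flen, score in dag.get(node, ())
                      if flen <= remaining]
        return max(candidates, default=float("-inf"))

    return best(src, target_len)

# toy usage: two fragment paths, only one sums to the 5-token budget
dag = {"s": [("a", 2, -0.1), ("b", 3, -0.2)],
       "a": [("t", 3, -0.3)],
       "b": [("t", 3, -0.1)]}
print(best_length_path(dag, "s", "t", 5))  # -0.4, via s -> a -> t
```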
- Zero-Shot Strategies for Length-Controllable Summarization [56.15356055672189]
Large language models (LLMs) struggle with precise length control, particularly in zero-shot settings.
We conduct a comprehensive study evaluating LLMs' length control capabilities across multiple measures and propose practical methods to improve controllability.
Our experiments with LLaMA 3 reveal stark differences in length adherence across measures and highlight inherent biases of the model.
arXiv Detail & Related papers (2024-12-31T02:53:27Z)
- Length Controlled Generation for Black-box LLMs [70.57649832433451]
Large language models (LLMs) have demonstrated impressive instruction following capabilities, but struggle to accurately manage the length of generated text.
We propose a novel iterative sampling framework for text length control, integrating the Metropolis-Hastings algorithm with an importance sampling acceleration strategy; a minimal sketch of the sampling loop follows below.
Our framework achieves almost 100% success rates of length control on Llama3.1 for tasks such as length-controlled abstractive summarization.
arXiv Detail & Related papers (2024-12-19T09:07:38Z)
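As a rough illustration of the idea (not the paper's exact algorithm, and without its importance-sampling acceleration), the skeleton below treats the black-box LLM as an independence proposal and accepts candidates that score better under a length-based target distribution; `generate` and `length_score` are hypothetical stand-ins.

```python
import math
import random

def length_score(text: str, target_words: int, tau: float = 2.0) -> float:
    # unnormalized log-score peaking when the word count hits the target (assumption)
    return -abs(len(text.split()) - target_words) / tau

def mh_length_control(generate, target_words: int, steps: int = 20) -> str:
    """Metropolis-Hastings with an independence proposal: resample from the
    model and accept moves toward the target length. For target
    p(x) ~ q(x) * exp(length_score(x)) and proposal q, the acceptance
    ratio reduces to exp(score(x') - score(x))."""
    current = generate()  # generate() is any black-box sampler, e.g. an LLM call
    for _ in range(steps):
        proposal = generate()
        accept = math.exp(min(0.0, length_score(proposal, target_words)
                                   - length_score(current, target_words)))
        if random.random() < accept:
            current = proposal
    return current
```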
- Precise Length Control in Large Language Models [1.3654846342364308]
Large Language Models (LLMs) are increasingly used in production systems.
We propose a method to adapt pre-trained decoder-only LLMs for precise control of response length.
arXiv Detail & Related papers (2024-12-16T16:22:27Z)
- LongAlign: A Recipe for Long Context Alignment of Large Language Models [61.85923382850057]
LongAlign is a recipe covering instruction data construction, training, and evaluation for long-context alignment.
We construct a long instruction-following dataset using Self-Instruct.
We adopt packing and sorted batching strategies to speed up supervised fine-tuning on data with varied length distributions; a generic packing sketch follows below.
arXiv Detail & Related papers (2024-01-31T18:29:39Z)
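Packing is a standard trick for the varied-length fine-tuning data mentioned above: concatenate several short examples into one near-full training sequence so few pad tokens are wasted. The first-fit-decreasing sketch below is a generic version under assumed names; LongAlign's actual recipe (e.g. its loss weighting for packed batches) may differ.

```python
def pack_examples(lengths: list[int], capacity: int = 8192) -> list[list[int]]:
    """Greedy first-fit-decreasing packing: returns groups of example
    indices whose token lengths fit within one training sequence."""
    bins = []  # each bin: [used_tokens, [example indices]]
    for idx in sorted(range(len(lengths)), key=lambda i: -lengths[i]):
        for b in bins:
            if b[0] + lengths[idx] <= capacity:
                b[0] += lengths[idx]
                b[1].append(idx)
                break
        else:  # no existing bin had room: open a new training sequence
            bins.append([lengths[idx], [idx]])
    return [b[1] for b in bins]
```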
- Effective Long-Context Scaling of Foundation Models [90.57254298730923]
We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens.
Our models achieve consistent improvements on most regular tasks and significant improvements on long-context tasks over Llama 2.
arXiv Detail & Related papers (2023-09-27T21:41:49Z)
- Prompt-Based Length Controlled Generation with Reinforcement Learning [48.49553921757085]
We propose a prompt-based length control method to achieve high-accuracy length controlled generation.
We adopt reinforcement learning with the reward signal given by either trainable or rule-based reward models; a toy rule-based reward is sketched below.
Our method significantly improves the accuracy of prompt-based length control for the summarization task on popular datasets such as CNNDM and NYT.
arXiv Detail & Related papers (2023-08-23T09:43:10Z)
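The entry above mentions rule-based reward models as one option; a toy version is simply a function of the length error. The sketch below is an assumption about what such a rule could look like, not the paper's reward.

```python
def length_reward(summary: str, target_len: int, alpha: float = 0.05) -> float:
    """Hypothetical rule-based reward: 1.0 at the exact target word count,
    decaying linearly to 0 as the length error grows."""
    error = abs(len(summary.split()) - target_len)
    return max(0.0, 1.0 - alpha * error)
```

A trainable reward model would replace this rule with a learned scorer, at the cost of extra training and potential reward hacking.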
- Summarization with Precise Length Control [23.688834410051]
We present a framework to generate summaries with precisely the specified number of tokens or sentences.
We jointly train the model to predict lengths, so it can generate summaries of optimal length.
arXiv Detail & Related papers (2023-05-09T04:45:24Z)
- Reinforced Abstractive Summarization with Adaptive Length Controlling [12.793451906532223]
Controllable summarization, especially of length, is important for several practical applications.
We propose an Adaptive Length Controlling Optimization (ALCO) method that leverages a two-stage abstractive summarization model.
arXiv Detail & Related papers (2021-12-14T16:48:47Z)
- Length-controllable Abstractive Summarization by Guiding with Summary Prototype [27.094797760775297]
We propose a new length-controllable abstractive summarization model.
Our model generates a summary in two steps.
Experiments with the CNN/Daily Mail dataset and the NEWSROOM dataset show that our model outperformed previous models in length-controlled settings.
arXiv Detail & Related papers (2020-01-21T04:01:58Z)