Related papers: Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization

URL: http://arxiv.org/abs/2410.02741v2
Date: Mon, 02 Dec 2024 21:06:29 GMT
Title: Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization
Authors: Lei Xu, Mohammed Asad Karim, Saket Dingliwal, Aparna Elangovan,
Abstract summary: Large language models (LLMs) can generate fluent summaries across domains using prompting techniques. We show that adding keyphrases in prompts can improve ROUGE F1 and recall. We introduce Keyphrase Signal Extractor (SigExt), a lightweight model that can be finetuned to extract salient keyphrases.
Score: 4.9201947803787744
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) can generate fluent summaries across domains using prompting techniques, reducing the need to train models for summarization applications. However, crafting effective prompts that guide LLMs to generate summaries with the appropriate level of detail and writing style remains a challenge. In this paper, we explore the use of salient information extracted from the source document to enhance summarization prompts. We show that adding keyphrases in prompts can improve ROUGE F1 and recall, making the generated summaries more similar to the reference and more complete. The number of keyphrases can control the precision-recall trade-off. Furthermore, our analysis reveals that incorporating phrase-level salient information is superior to word- or sentence-level. However, the impact on hallucination is not universally positive across LLMs. To conduct this analysis, we introduce Keyphrase Signal Extractor (SigExt), a lightweight model that can be finetuned to extract salient keyphrases. By using SigExt, we achieve consistent ROUGE improvements across datasets and open-weight and proprietary LLMs without any LLM customization. Our findings provide insights into leveraging salient information in building prompt-based summarization systems. We release our code at https://github.com/amazon-science/SigExt
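The idea of adding keyphrases to a summarization prompt, and of using the number of keyphrases to shift the precision-recall balance, can be illustrated with a minimal Python sketch. This is not the SigExt implementation: the prompt wording, the function names `build_prompt` and `rouge1`, and the `max_phrases` parameter are all assumptions made for illustration, and `rouge1` is a simplified whitespace-tokenized unigram overlap rather than the official ROUGE scorer.

```python
from collections import Counter


def build_prompt(document: str, keyphrases: list[str], max_phrases: int) -> str:
    """Build a summarization prompt that lists up to max_phrases keyphrases.

    Hypothetical prompt template; the paper's actual wording may differ.
    """
    selected = keyphrases[:max_phrases]
    prompt = f"Summarize the following document.\n\nDocument:\n{document}\n"
    if selected:
        # More phrases ask the model to cover more content, which the
        # abstract associates with higher recall.
        prompt += "\nMake sure the summary covers: " + "; ".join(selected) + "\n"
    prompt += "\nSummary:"
    return prompt


def rouge1(candidate: str, reference: str) -> tuple[float, float, float]:
    """Return unigram-overlap precision, recall, and F1 (simplified ROUGE-1)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

In such a setup, `max_phrases` would be the knob to vary: the returned prompt is sent to any LLM, and the resulting summary is scored against the reference with `rouge1` to observe how precision and recall move.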

Related papers

SelfRACG: Enabling LLMs to Self-Express and Retrieve for Code Generation [63.4105693174085]
We propose SelfRACG, a novel paradigm that enables large language models (LLMs) to express their information needs to enhance RACG. SelfRACG includes an information need expression module and a two-stage information need-guided training strategy, which encourages LLMs to express their information need. Extensive experiments demonstrate that SelfRACG can retrieve external knowledge that better aligns with the LLM's own information needs, resulting in superior generation performance compared to vanilla RACG.
arXiv Detail & Related papers (2025-07-25T07:42:01Z)
RALLRec+: Retrieval Augmented Large Language Model Recommendation with Reasoning [22.495874056980824]
We propose Representation learning and Reasoning empowered retrieval-Augmented Large Language model Recommendation (RALLRec+).
arXiv Detail & Related papers (2025-03-26T11:03:34Z)
Zero-Shot Keyphrase Generation: Investigating Specialized Instructions and Multi-Sample Aggregation on Large Language Models [52.829293635314194]
Keyphrase generation is a long-standing NLP task for automatically generating keyphrases for a given document. We focus on the zero-shot capabilities of open-source instruction-tuned LLMs (Phi-3, Llama-3) and the closed-source GPT-4o for this task.
arXiv Detail & Related papers (2025-03-01T19:38:57Z)
Unstructured Evidence Attribution for Long Context Query Focused Summarization [53.08341620504465]
We propose to extract unstructured (i.e., spans of any length) evidence in order to acquire more relevant and consistent evidence than in the fixed granularity case. We show how existing systems struggle to copy and properly cite unstructured evidence, which also tends to be "lost-in-the-middle".
arXiv Detail & Related papers (2025-02-20T09:57:42Z)
Idiosyncrasies in Large Language Models [54.26923012617675]
We unveil and study idiosyncrasies in Large Language Models (LLMs). We find that fine-tuning existing text embedding models on LLM-generated texts yields excellent classification accuracy. We leverage LLM as judges to generate detailed, open-ended descriptions of each model's idiosyncrasies.
arXiv Detail & Related papers (2025-02-17T18:59:02Z)
Scaling Up Summarization: Leveraging Large Language Models for Long Text Extractive Summarization [0.27624021966289597]
This paper introduces EYEGLAXS, a framework that leverages Large Language Models (LLMs) for extractive summarization. EYEGLAXS focuses on extractive summarization to ensure factual and grammatical integrity. The system sets new performance benchmarks on well-known datasets like PubMed and ArXiv.
arXiv Detail & Related papers (2024-08-28T13:52:19Z)
Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation [51.8188846284153]
RAG has been widely adopted to enhance Large Language Models (LLMs). Attributed Text Generation (ATG) has attracted growing attention, which provides citations to support the model's responses in RAG. This paper proposes a fine-grained ATG method called ReClaim (Refer & Claim), which alternates the generation of references and answers step by step.
arXiv Detail & Related papers (2024-07-01T20:47:47Z)
Peering into the Mind of Language Models: An Approach for Attribution in Contextual Question Answering [9.86691461253151]
We introduce a novel method for attribution in contextual question answering, leveraging the hidden state representations of large language models (LLMs). Our approach bypasses the need for extensive model retraining and retrieval model overhead, offering granular attributions and preserving the quality of generated answers. We present Verifiability-granular, an attribution dataset which has token level annotations for LLM generations in the contextual question answering setup.
arXiv Detail & Related papers (2024-05-28T09:12:44Z)
Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation [128.01050030936028]
We propose an information refinement training method named InFO-RAG. InFO-RAG is low-cost and general across various tasks. It improves the performance of LLaMA2 by an average of 9.39% relative points.
arXiv Detail & Related papers (2024-02-28T08:24:38Z)
Large Language Model with Graph Convolution for Recommendation [21.145230388035277]
Text information can sometimes be of low quality, hindering its effectiveness for real-world applications. With knowledge and reasoning capabilities capsuled in Large Language Models, utilizing LLMs emerges as a promising way for description improvement. We propose a Graph-aware Convolutional LLM method to elicit LLMs to capture high-order relations in the user-item graph.
arXiv Detail & Related papers (2024-02-14T00:04:33Z)
Learning to Prompt with Text Only Supervision for Vision-Language Models [107.282881515667]
One branch of methods adapts CLIP by learning prompts using visual information. An alternative approach resorts to training-free methods by generating class descriptions from large language models. We propose to combine the strengths of both streams by learning prompts using only text data.
arXiv Detail & Related papers (2024-01-04T18:59:49Z)
Effective Large Language Model Adaptation for Improved Grounding and Citation Generation [48.07830615309543]
This paper focuses on improving large language models (LLMs) by grounding their responses in retrieved passages and by providing citations. We propose a new framework, AGREE, that improves the grounding from a holistic perspective. Our framework tunes LLMs to self-ground the claims in their responses and provide accurate citations to retrieved documents.
arXiv Detail & Related papers (2023-11-16T03:22:25Z)
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning [51.90524745663737]
A key innovation is our use of explanations as features, which can be used to boost GNN performance on downstream tasks. Our method achieves state-of-the-art results on well-established TAG datasets. Our method significantly speeds up training, achieving a 2.88 times improvement over the closest baseline on ogbn-arxiv.
arXiv Detail & Related papers (2023-05-31T03:18:03Z)
Element-aware Summarization with Large Language Models: Expert-aligned Evaluation and Chain-of-Thought Method [35.181659789684545]
Automatic summarization generates concise summaries that contain key ideas of source documents. References from CNN/DailyMail and BBC XSum are noisy, mainly in terms of factual hallucination and information redundancy. We propose a Summary Chain-of-Thought (SumCoT) technique to elicit LLMs to generate summaries step by step. Experimental results show our method outperforms state-of-the-art fine-tuned PLMs and zero-shot LLMs by +4.33/+4.77 in ROUGE-L.
arXiv Detail & Related papers (2023-05-22T18:54:35Z)
Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback [127.75419038610455]
Large language models (LLMs) are able to generate human-like, fluent responses for many downstream tasks. This paper proposes a LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules.
arXiv Detail & Related papers (2023-02-24T18:48:43Z)
Guiding Large Language Models via Directional Stimulus Prompting [114.84930073977672]
We introduce Directional Stimulus Prompting, a novel framework for guiding black-box large language models (LLMs) toward specific desired outputs. Instead of directly adjusting LLMs, our method employs a small tunable policy model to generate an auxiliary directional stimulus prompt for each input instance.
arXiv Detail & Related papers (2023-02-22T17:44:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.