Correction with Backtracking Reduces Hallucination in Summarization
- URL: http://arxiv.org/abs/2310.16176v3
- Date: Tue, 3 Sep 2024 19:08:18 GMT
- Title: Correction with Backtracking Reduces Hallucination in Summarization
- Authors: Zhenzhen Liu, Chao Wan, Varsha Kishore, Jin Peng Zhou, Minmin Chen, Kilian Q. Weinberger
- Abstract summary: Abstractive summarization aims at generating natural language summaries of a source document that are succinct while preserving the important elements.
Despite recent advances, neural text summarization models are known to be susceptible to hallucinating.
We introduce a simple yet efficient technique, CoBa, to reduce hallucination in abstractive summarization.
- Score: 29.093092115901694
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Abstractive summarization aims at generating natural language summaries of a source document that are succinct while preserving the important elements. Despite recent advances, neural text summarization models are known to be susceptible to hallucinating (or more correctly confabulating), that is, producing summaries with details that are not grounded in the source document. In this paper, we introduce a simple yet efficient technique, CoBa, to reduce hallucination in abstractive summarization. The approach is based on two steps: hallucination detection and mitigation. We show that the former can be achieved by measuring simple statistics about conditional word probabilities and distance to context words. Further, we demonstrate that straightforward backtracking is surprisingly effective at mitigation. We thoroughly evaluate the proposed method against prior art on three benchmark datasets for text summarization. The results show that CoBa is effective and efficient in reducing hallucination, and offers great adaptability and flexibility. Code can be found at https://github.com/zhenzhel/CoBa.
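As a concrete illustration, below is a minimal sketch of detect-then-backtrack decoding in the spirit of CoBa. It keeps only the conditional-probability signal (the paper's detector also uses distance to context words) and a one-step backtracking rule; the threshold and loop structure are illustrative assumptions, not the authors' implementation, which is available in the linked repository.

```python
# Minimal detect-then-backtrack decoding sketch (illustrative only).
# Assumes a decoder-only Hugging Face model where the source document
# is provided as the prompt; prob_threshold is a made-up value.
import torch

def coba_style_decode(model, tokenizer, source, max_new_tokens=64, prob_threshold=0.1):
    input_ids = tokenizer(source, return_tensors="pt").input_ids
    banned = {}                       # position -> token ids already rejected there
    generated, step, attempts = [], 0, 0
    while len(generated) < max_new_tokens and attempts < 4 * max_new_tokens:
        attempts += 1
        ids = input_ids if not generated else torch.cat(
            [input_ids, torch.tensor([generated])], dim=-1)
        with torch.no_grad():
            logits = model(ids).logits[0, -1]
        probs = torch.softmax(logits, dim=-1)
        for tok in banned.get(step, ()):   # mask tokens rejected at this position
            probs[tok] = 0.0
        tok = int(probs.argmax())
        if probs[tok] < prob_threshold and generated:
            # A low conditional probability flags a potential hallucination:
            # backtrack one step and forbid the token that was chosen there.
            step -= 1
            banned.setdefault(step, set()).add(generated.pop())
            banned = {k: v for k, v in banned.items() if k <= step}  # drop stale bans
            continue
        generated.append(tok)
        step += 1
        if tok == tokenizer.eos_token_id:
            break
    return tokenizer.decode(generated, skip_special_tokens=True)
```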
Related papers
- Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models [58.952782707682815]
COFT is a novel method that focuses on key texts at different levels of granularity, thereby avoiding getting lost in lengthy contexts.
Experiments on the knowledge hallucination benchmark demonstrate the effectiveness of COFT, with gains of over 30% in the F1 score metric.
arXiv Detail & Related papers (2024-10-19T13:59:48Z)
- Source Identification in Abstractive Summarization [0.8883733362171033]
We define input sentences that contain essential information in the generated summary as "source sentences" and study how abstractive summaries are made by analyzing the source sentences.
We formulate automatic source sentence detection and compare multiple methods to establish a strong baseline for the task.
Experimental results show that the perplexity-based method performs well in highly abstractive settings, while similarity-based methods perform robustly in relatively extractive settings.
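For the similarity-based family, a minimal sketch is to rank source sentences by TF-IDF cosine similarity to the generated summary; TF-IDF is an illustrative stand-in for the similarity measures actually compared in the paper:

```python
# Rank candidate source sentences by similarity to the summary
# (TF-IDF cosine similarity as an illustrative measure).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def rank_source_sentences(source_sentences, summary, top_k=3):
    vectorizer = TfidfVectorizer().fit(source_sentences + [summary])
    sent_vecs = vectorizer.transform(source_sentences)
    summary_vec = vectorizer.transform([summary])
    scores = cosine_similarity(sent_vecs, summary_vec).ravel()
    ranked = sorted(zip(source_sentences, scores), key=lambda pair: -pair[1])
    return ranked[:top_k]  # most likely source sentences, highest first
```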
arXiv Detail & Related papers (2024-02-07T09:09:09Z)
- NapSS: Paragraph-level Medical Text Simplification via Narrative Prompting and Sentence-matching Summarization [46.772517928718216]
We propose a summarize-then-simplify two-stage strategy, which we call NapSS.
NapSS identifies the relevant content to simplify while ensuring that the original narrative flow is preserved.
Our model performs significantly better than the seq2seq baseline on an English medical corpus.
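A minimal sketch of such a summarize-then-simplify pipeline is shown below; the model checkpoints are illustrative placeholders, not the components used by NapSS:

```python
# Two-stage summarize-then-simplify sketch with placeholder checkpoints.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
simplifier = pipeline("text2text-generation", model="google/flan-t5-base")

def summarize_then_simplify(paragraph):
    # Stage 1: extract the salient content as a summary.
    summary = summarizer(paragraph, max_length=96, min_length=24)[0]["summary_text"]
    # Stage 2: rewrite the summary in plain language.
    prompt = "Simplify this medical text for a lay reader: " + summary
    return simplifier(prompt, max_length=96)[0]["generated_text"]
```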
arXiv Detail & Related papers (2023-02-11T02:20:25Z)
- Revisiting text decomposition methods for NLI-based factuality scoring of summaries [9.044665059626958]
We show that fine-grained decomposition is not always a winning strategy for factuality scoring.
We also show that small changes to previously proposed entailment-based scoring methods can result in better performance.
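A minimal sketch of coarse (sentence-level) entailment-based scoring, with the checkpoint and the mean aggregation as illustrative assumptions:

```python
# Score each summary sentence by the probability that the source
# entails it, then average (sentence-level NLI factuality sketch).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-large-mnli")
model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
ENTAILMENT = model.config.label2id["ENTAILMENT"]

def factuality_score(source, summary_sentences):
    scores = []
    for sentence in summary_sentences:
        inputs = tokenizer(source, sentence, return_tensors="pt", truncation=True)
        with torch.no_grad():
            probs = torch.softmax(model(**inputs).logits[0], dim=-1)
        scores.append(probs[ENTAILMENT].item())
    return sum(scores) / len(scores)  # mean entailment probability
```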
arXiv Detail & Related papers (2022-11-30T09:54:37Z)
- Mutual Information Alleviates Hallucinations in Abstractive Summarization [73.48162198041884]
We find a simple criterion under which models are significantly more likely to assign more probability to hallucinated content during generation: high model uncertainty.
This finding offers a potential explanation for hallucinations: models default to favoring text with high marginal probability, when uncertain about a continuation.
We propose a decoding strategy that switches to optimizing for pointwise mutual information of the source and target token, rather than purely the probability of the target token, when the model exhibits uncertainty.
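A minimal sketch of this uncertainty-triggered switch for one decoding step; the entropy threshold and the use of a source-free forward pass to estimate the marginal are illustrative assumptions:

```python
# One decoding step: use plain likelihood when confident, PMI
# (conditional minus marginal log-probability) when uncertain.
import torch

def pmi_next_token(model, source_ids, prefix_ids, entropy_threshold=3.5):
    # prefix_ids is assumed non-empty (e.g., it begins with BOS).
    with torch.no_grad():
        cond = model(torch.cat([source_ids, prefix_ids], dim=-1)).logits[0, -1]
    log_p_cond = torch.log_softmax(cond, dim=-1)
    entropy = -(log_p_cond.exp() * log_p_cond).sum()
    if entropy < entropy_threshold:
        return int(log_p_cond.argmax())         # confident: plain likelihood
    with torch.no_grad():                       # uncertain: subtract the
        marg = model(prefix_ids).logits[0, -1]  # source-free log-probability
    log_p_marg = torch.log_softmax(marg, dim=-1)
    return int((log_p_cond - log_p_marg).argmax())
```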
arXiv Detail & Related papers (2022-10-24T13:30:54Z)
- Inspecting the Factuality of Hallucinated Entities in Abstractive Summarization [36.052622624166894]
State-of-the-art abstractive summarization systems often generate hallucinations, i.e., content that is not directly inferable from the source text.
We propose a novel detection approach that separates factual from non-factual hallucinations of entities.
arXiv Detail & Related papers (2021-08-30T15:40:52Z)
- Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input).
We also introduce a method for learning to detect hallucinations using pretrained language models fine-tuned on synthetic data.
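As an illustration of the synthetic-data idea, one crude recipe is to corrupt a reference target and label the corrupted positions; real pipelines use more realistic corruptions than this random word swap:

```python
# Build a token-labeled training example by injecting one
# hallucinated token into a clean reference (crude illustration).
import random

def make_synthetic_example(target_tokens, vocabulary):
    tokens = list(target_tokens)
    labels = [0] * len(tokens)          # 0 = supported, 1 = hallucinated
    i = random.randrange(len(tokens))
    tokens[i] = random.choice(vocabulary)
    labels[i] = 1
    return tokens, labels               # inputs for a token classifier
```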
arXiv Detail & Related papers (2020-11-05T00:18:53Z)
- Reducing Quantity Hallucinations in Abstractive Summarization [32.89486074807331]
Herman learns to recognize and verify quantity entities (dates, numbers, sums of money, etc.) in a beam-worth of abstractive summaries.
Experimental results demonstrate that up-ranked summaries achieve higher ROUGE Precision than summaries that have not been up-ranked.
Preliminary human evaluation of up-ranked vs. original summaries shows people's preference for the former.
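A minimal sketch of quantity-based up-ranking; the regex is a crude stand-in for proper quantity-entity recognition and verification:

```python
# Move beam candidates whose numeric tokens all occur in the source
# ahead of candidates containing unsupported quantities.
import re

QUANTITY = re.compile(r"\d[\d,.]*")

def quantities(text):
    return {m.group().rstrip(",.") for m in QUANTITY.finditer(text)}

def uprank(beam_candidates, source):
    supported = quantities(source)
    verified = [c for c in beam_candidates if quantities(c) <= supported]
    unverified = [c for c in beam_candidates if not quantities(c) <= supported]
    return verified + unverified  # verified summaries are up-ranked
```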
arXiv Detail & Related papers (2020-09-28T13:32:59Z)
- Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
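A minimal sketch of the pair-construction step; word dropout stands in for the paper's richer noising schemes:

```python
# Sample one review as a pseudo-summary and create noisy variants
# of it as the pseudo-reviews the model must learn to denoise.
import random

def make_pseudo_pair(reviews, n_inputs=8, drop_prob=0.3):
    summary = random.choice(reviews)
    words = summary.split()
    noisy_inputs = []
    for _ in range(n_inputs):
        kept = [w for w in words if random.random() > drop_prob] or words[:1]
        noisy_inputs.append(" ".join(kept))
    return noisy_inputs, summary  # (model inputs, denoising target)
```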
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq-based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
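As an illustration of this family of objectives, the sketch below builds a self-supervised example that asks the model to reinstate a masked span of the source document; it is a plausible instance, not necessarily one of the paper's three objectives:

```python
# Mask a contiguous span of the document; the model must reinstate it.
import random

def masked_reinstatement_example(doc_tokens, mask_ratio=0.3, mask_token="<mask>"):
    n = max(1, int(len(doc_tokens) * mask_ratio))
    start = random.randrange(len(doc_tokens) - n + 1)
    target = doc_tokens[start:start + n]                      # span to reinstate
    source = doc_tokens[:start] + [mask_token] + doc_tokens[start + n:]
    return source, target  # (encoder input, decoder target)
```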
arXiv Detail & Related papers (2020-04-04T05:06:26Z)