Correction with Backtracking Reduces Hallucination in Summarization
- URL: http://arxiv.org/abs/2310.16176v2
- Date: Tue, 31 Oct 2023 14:48:14 GMT
- Title: Correction with Backtracking Reduces Hallucination in Summarization
- Authors: Zhenzhen Liu, Chao Wan, Varsha Kishore, Jin Peng Zhou, Minmin Chen,
Kilian Q. Weinberger
- Abstract summary: We introduce a simple yet efficient technique, CoBa, to reduce hallucination in abstractive summarization.
The approach is based on two steps: hallucination detection and mitigation.
The results show that CoBa is effective and efficient in reducing hallucination, and offers great adaptability and flexibility.
- Score: 30.827500697135118
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Abstractive summarization aims at generating natural language summaries of a
source document that are succinct while preserving the important elements.
Despite recent advances, neural text summarization models are known to be
susceptible to hallucinating (or, more accurately, confabulating), that is, to
producing summaries with details that are not grounded in the source document. In
this paper, we introduce a simple yet efficient technique, CoBa, to reduce
hallucination in abstractive summarization. The approach is based on two steps:
hallucination detection and mitigation. We show that the former can be achieved
through measuring simple statistics about conditional word probabilities and
distance to context words. Further, we demonstrate that straightforward
backtracking is surprisingly effective at mitigation. We thoroughly evaluate
the proposed method against prior art on three benchmark datasets for text
summarization. The results show that CoBa is effective and efficient in
reducing hallucination, and offers great adaptability and flexibility.
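The detect-then-backtrack loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the probability threshold `p_min` and the "token appears in the source" check stand in for the paper's conditional-word-probability and context-distance statistics, and `toy_lm` is a hypothetical stand-in for a real summarization model.

```python
# Minimal sketch of CoBa-style decoding with backtracking (illustrative
# assumptions only; see lead-in above).

def coba_decode(next_token_probs, source_tokens, max_len=20, p_min=0.2):
    """Greedy decoding; rewind and resample when a token looks hallucinated."""
    context = set(source_tokens)
    prefix = []
    banned = {}  # banned[step] = tokens already rejected at that position
    while len(prefix) < max_len:
        step = len(prefix)
        probs = {t: p for t, p in next_token_probs(prefix).items()
                 if t not in banned.get(step, set())}
        if not probs:            # every candidate rejected here: backtrack further
            prefix.pop()
            continue
        tok = max(probs, key=probs.get)
        # Detection: low conditional probability AND no support in the source.
        if tok != "<eos>" and probs[tok] < p_min and tok not in context:
            banned.setdefault(step, set()).add(tok)   # mitigation: backtrack
            continue
        if tok == "<eos>":
            break
        prefix.append(tok)
    return prefix

def toy_lm(prefix):
    # Hypothetical toy model: at step 1 the top token "mars" is both
    # low-probability and absent from the source, so it gets rejected.
    table = {0: {"cats": 0.9, "mars": 0.1},
             1: {"mars": 0.19, "sleep": 0.18, "<eos>": 0.05},
             2: {"<eos>": 1.0}}
    return table[len(prefix)]

print(coba_decode(toy_lm, ["cats", "sleep", "a", "lot"]))  # ['cats', 'sleep']
```

Banning a rejected token and re-entering the loop is the simplest form of backtracking; rewinding multiple tokens, which the empty-candidate branch above approximates, is closer to what the paper describes.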
Related papers
- Hallucination Reduction in Long Input Text Summarization [2.6745438139282283]
Hallucination in text summarization poses significant obstacles to the accuracy and reliability of the generated summaries.
We have incorporated the techniques of data filtering and joint entity and summary generation (JAENS) in the fine-tuning of the Longformer-Decoder (LED) model.
Our experiments show that the fine-tuned LED model performs well in generating the paper abstract.
arXiv Detail & Related papers (2023-09-28T18:22:16Z)
- NapSS: Paragraph-level Medical Text Simplification via Narrative Prompting and Sentence-matching Summarization [46.772517928718216]
We propose a summarize-then-simplify two-stage strategy, which we call NapSS.
NapSS identifies the relevant content to simplify while ensuring that the original narrative flow is preserved.
Our model achieves significantly better results than the seq2seq baseline on an English medical corpus.
arXiv Detail & Related papers (2023-02-11T02:20:25Z)
- Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences [38.919090721583075]
We show that factual inconsistency can be caused by irrelevant parts of the input text, which act as confounders.
We design a simple multi-task model to control such confounding by leveraging human-annotated relevant sentences when available.
Our approach improves faithfulness scores by 20% over strong baselines on the AnswerSumm (Fabbri et al., 2021) dataset.
arXiv Detail & Related papers (2022-12-19T18:51:06Z)
- Mutual Information Alleviates Hallucinations in Abstractive Summarization [73.48162198041884]
We find a simple criterion under which models are significantly more likely to assign more probability to hallucinated content during generation: high model uncertainty.
This finding offers a potential explanation for hallucinations: when uncertain about a continuation, models default to favoring text with high marginal probability.
We propose a decoding strategy that switches to optimizing for the pointwise mutual information of the source and target token, rather than purely the probability of the target token, when the model exhibits uncertainty.
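The uncertainty-triggered switch to PMI scoring described in this entry can be sketched as below. The entropy trigger `h_max` and the example conditional/marginal distributions are assumptions for illustration, not the paper's exact formulation.

```python
import math

def entropy(probs):
    """Shannon entropy (nats) of a token distribution."""
    return -sum(p * math.log(p) for p in probs.values() if p > 0)

def pick_token(cond_probs, marg_probs, h_max=1.0):
    """Greedy choice that switches to PMI scoring under high uncertainty."""
    if entropy(cond_probs) <= h_max:
        return max(cond_probs, key=cond_probs.get)  # model is confident
    # PMI(y; x) = log p(y|x) - log p(y): discount tokens that are merely
    # frequent in general, which is where hallucinated content tends to win.
    return max(cond_probs,
               key=lambda t: math.log(cond_probs[t]) - math.log(marg_probs[t]))
```

For example, under a flat (high-entropy) conditional distribution, a generic token like "the" with a large marginal probability is penalized relative to a source-specific token with a small marginal probability.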
arXiv Detail & Related papers (2022-10-24T13:30:54Z)
- Don't Say What You Don't Know: Improving the Consistency of Abstractive Summarization by Constraining Beam Search [54.286450484332505]
We analyze the connection between hallucinations and training data, and find evidence that models hallucinate because they train on target summaries that are unsupported by the source.
We present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
arXiv Detail & Related papers (2022-03-16T07:13:52Z)
- Inspecting the Factuality of Hallucinated Entities in Abstractive Summarization [36.052622624166894]
State-of-the-art abstractive summarization systems often generate hallucinations, i.e., content that is not directly inferable from the source text.
We propose a novel detection approach that separates factual from non-factual hallucinations of entities.
arXiv Detail & Related papers (2021-08-30T15:40:52Z)
- Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (i.e., not contained in the input).
We also introduce a method for learning to detect hallucinations using pretrained language models fine-tuned on synthetic data.
arXiv Detail & Related papers (2020-11-05T00:18:53Z)
- Reducing Quantity Hallucinations in Abstractive Summarization [32.89486074807331]
Herman learns to recognize and verify quantity entities (dates, numbers, sums of money, etc.) in a beam-worth of abstractive summaries.
Experimental results demonstrate that up-ranked summaries achieve higher ROUGE Precision than summaries that have not been up-ranked.
A preliminary human evaluation of up-ranked vs. original summaries shows a preference for the former.
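The verify-and-up-rank idea in this entry can be sketched with a simple re-ranker: extract quantity-like spans from each beam candidate and prefer candidates whose quantities all occur in the source. The regex and the unsupported-count score are simplifying assumptions; the actual Herman system learns to recognize and verify quantity entities.

```python
import re

# Illustrative quantity-verification re-ranking (assumptions only; see above).
QUANT = re.compile(r"\$?\d+(?:[.,]\d+)*%?")

def quantities(text):
    """Extract number-like spans (money, percents, years) from text."""
    return set(QUANT.findall(text))

def rerank(source, beam):
    """Up-rank candidates whose quantities are all supported by the source."""
    src_q = quantities(source)
    def unsupported(summary):
        return len(quantities(summary) - src_q)
    return sorted(beam, key=unsupported)  # sorted() is stable: ties keep beam order
```

For instance, given the source "Revenue rose 12% to $3.4 billion in 2021.", a candidate claiming "15%" is ranked below one that keeps "12%", while candidates with equal scores retain their original beam order.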
arXiv Detail & Related papers (2020-09-28T13:32:59Z)
- Unsupervised Opinion Summarization with Noising and Denoising [85.49169453434554]
We create a synthetic dataset from a corpus of user reviews by sampling a review, pretending it is a summary, and generating noisy versions thereof.
At test time, the model accepts genuine reviews and generates a summary containing salient opinions, treating those that do not reach consensus as noise.
arXiv Detail & Related papers (2020-04-21T16:54:57Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.