Don't Say What You Don't Know: Improving the Consistency of Abstractive
Summarization by Constraining Beam Search
- URL: http://arxiv.org/abs/2203.08436v2
- Date: Fri, 17 Nov 2023 16:46:38 GMT
- Title: Don't Say What You Don't Know: Improving the Consistency of Abstractive
Summarization by Constraining Beam Search
- Authors: Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz
Beltagy, Doug Downey
- Abstract summary: We analyze the connection between hallucinations and training data, and find evidence that models hallucinate because they train on target summaries that are unsupported by the source.
We present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
- Score: 54.286450484332505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Abstractive summarization systems today produce fluent and relevant output,
but often "hallucinate" statements not supported by the source text. We analyze
the connection between hallucinations and training data, and find evidence that
models hallucinate because they train on target summaries that are unsupported
by the source. Based on our findings, we present PINOCCHIO, a new decoding
method that improves the consistency of a transformer-based abstractive
summarizer by constraining beam search to avoid hallucinations. Given the model
states and outputs at a given step, PINOCCHIO detects likely model
hallucinations based on various measures of attribution to the source text.
PINOCCHIO backtracks to find more consistent output, and can opt to produce no
summary at all when no consistent generation can be found. In experiments, we
find that PINOCCHIO improves the consistency of generation (in terms of F1) by
an average of 67% on two abstractive summarization datasets.
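To make the description above concrete, here is a minimal, hypothetical sketch of constrained decoding with backtracking in the spirit of PINOCCHIO; it is not the authors' algorithm. The callables `next_candidates` and `attribution_score` are placeholders for the summarizer's top-k next-token proposals and a source-attribution check (which the paper derives from model states and outputs), the threshold, backtrack budget, and EOS handling are assumed, and greedy search with backtracking stands in for the paper's constrained beam search.

```python
# Hedged sketch only: constrained decoding with backtracking, loosely in the
# spirit of PINOCCHIO. All names below (next_candidates, attribution_score,
# source_threshold, max_backtracks, EOS) are placeholders, not the paper's API.
from typing import Callable, List, Optional, Sequence, Tuple

EOS = "</s>"  # assumed end-of-summary token

def constrained_decode(
    next_candidates: Callable[[Sequence[str]], List[Tuple[str, float]]],
    attribution_score: Callable[[Sequence[str], str], float],
    source_threshold: float = 0.5,
    max_len: int = 64,
    max_backtracks: int = 10,
) -> Optional[List[str]]:
    """Greedy decoding with backtracking; returns None if no consistent output is found."""
    prefix: List[str] = []
    alternatives: List[List[Tuple[str, float]]] = []  # untried candidates per position
    backtracks = 0
    while len(prefix) < max_len:
        if prefix and prefix[-1] == EOS:
            return prefix[:-1]                         # finished a consistent summary
        candidates = next_candidates(prefix)           # (token, log-prob) pairs
        # Keep only candidates judged attributable to the source text.
        viable = sorted(
            [(t, lp) for t, lp in candidates
             if t == EOS or attribution_score(prefix, t) >= source_threshold],
            key=lambda x: x[1], reverse=True)
        if viable:
            tok, _ = viable[0]
            prefix.append(tok)
            alternatives.append(viable[1:])
            continue
        # Dead end: no candidate is attributable to the source, so back up.
        backtracks += 1
        if backtracks > max_backtracks:
            return None                                # opt out: produce no summary at all
        while alternatives and not alternatives[-1]:
            alternatives.pop()
            prefix.pop()
        if not alternatives:
            return None
        tok, _ = alternatives[-1].pop(0)
        prefix[-1] = tok                               # try the next-best earlier choice
    return prefix
```

The key behaviors mirror the abstract: unattributable candidates are pruned at each step, the search backs up when a step has no consistent continuation, and the decoder can return None, i.e., decline to produce a summary.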
Related papers
- Alleviating Hallucinations of Large Language Models through Induced
Hallucinations [67.35512483340837]
Large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information.
We propose a simple Induce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations.
arXiv Detail & Related papers (2023-12-25T12:32:49Z)
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [102.56792377624927]
Hallucinations inherent in machine-generated data remain under-explored.
We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm.
Our method mitigates 44.6% of hallucinations (a relative reduction) and maintains competitive performance compared to LLaVA.
arXiv Detail & Related papers (2023-11-22T04:52:58Z)
- Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization [37.55557353462219]
Pruning is a technique that reduces model size by removing redundant weights, enabling more efficient sparse inference.
This paper provides an empirical study across five summarization datasets, two state-of-the-art pruning methods, and five instruction-tuned LLMs.
Surprisingly, we find that hallucinations are less prevalent in summaries from pruned LLMs than in those from the original models.
arXiv Detail & Related papers (2023-11-15T19:49:24Z)
- Hallucination Reduction in Long Input Text Summarization [2.6745438139282283]
Hallucination in text summarization poses significant obstacles to the accuracy and reliability of the generated summaries.
We have incorporated the techniques of data filtering and joint entity and summary generation (JAENS) in the fine-tuning of the Longformer Encoder-Decoder (LED) model.
Our experiments show that the fine-tuned LED model performs well in generating the paper abstract.
arXiv Detail & Related papers (2023-09-28T18:22:16Z)
- Improved Beam Search for Hallucination Mitigation in Abstractive
Summarization [1.2328446298523066]
In this paper, we investigate the use of the Natural Language Inference (NLI) entailment metric to detect and prevent hallucinations in summary generation.
We propose an NLI-assisted beam re-ranking mechanism by computing entailment probability scores between the input context and summarization model-generated beams.
Our proposed algorithm significantly outperforms vanilla beam decoding on the XSum and CNN/DM datasets (a minimal sketch of this re-ranking step appears after the list below).
arXiv Detail & Related papers (2022-12-06T02:33:47Z)
- Mutual Information Alleviates Hallucinations in Abstractive
Summarization [73.48162198041884]
We find a simple criterion under which models are significantly more likely to assign more probability to hallucinated content during generation: high model uncertainty.
This finding offers a potential explanation for hallucinations: when uncertain about a continuation, models default to favoring text with high marginal probability.
We propose a decoding strategy that switches to optimizing for the pointwise mutual information of the source and target token, rather than purely the probability of the target token, when the model exhibits uncertainty (a rough sketch of this switch appears after the list below).
arXiv Detail & Related papers (2022-10-24T13:30:54Z)
- Inspecting the Factuality of Hallucinated Entities in Abstractive
Summarization [36.052622624166894]
State-of-the-art abstractive summarization systems often generate hallucinations, i.e., content that is not directly inferable from the source text.
We propose a novel detection approach that separates factual from non-factual hallucinations of entities.
arXiv Detail & Related papers (2021-08-30T15:40:52Z)
- Improving Faithfulness in Abstractive Summarization with Contrast
Candidate Generation and Selection [54.38512834521367]
We study contrast candidate generation and selection as a model-agnostic post-processing technique.
We learn a discriminative correction model by generating alternative candidate summaries.
This model is then used to select the best candidate as the final output summary.
arXiv Detail & Related papers (2021-04-19T05:39:24Z)
- Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input).
We also introduce a method for learning to detect hallucinations using pretrained language models fine tuned on synthetic data.
arXiv Detail & Related papers (2020-11-05T00:18:53Z)
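As referenced above, the NLI-assisted beam re-ranking of "Improved Beam Search for Hallucination Mitigation in Abstractive Summarization" can be sketched roughly as follows; the `entailment_prob` callable (e.g., an off-the-shelf NLI classifier applied to source/summary pairs) and the `alpha` weight are assumptions, not the paper's exact scoring.

```python
# Hedged sketch of NLI-assisted beam re-ranking: rescore finished beams by
# combining model log-probability with an entailment score for the source.
from typing import Callable, List, Tuple

def rerank_beams(
    source: str,
    beams: List[Tuple[str, float]],                 # (summary, log-prob) pairs from beam search
    entailment_prob: Callable[[str, str], float],   # P(source entails summary); placeholder scorer
    alpha: float = 1.0,                             # assumed weighting between the two terms
) -> List[Tuple[str, float]]:
    """Return beams sorted by log-probability plus a weighted entailment bonus."""
    scored = [
        (summary, logprob + alpha * entailment_prob(source, summary))
        for summary, logprob in beams
    ]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```

In practice the entailment scorer would be a pretrained NLI model, and the combination rule may differ from this simple additive form.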
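Similarly, the uncertainty-triggered PMI objective of "Mutual Information Alleviates Hallucinations in Abstractive Summarization" admits a rough sketch like the one below; the entropy threshold, the `cond_logprobs` / `lm_logprobs` callables, and the exact form of the switch are assumptions rather than the paper's formulation.

```python
# Hedged sketch of uncertainty-triggered PMI decoding: when the next-token
# distribution is high-entropy, prefer tokens informative about the source.
import math
from typing import Callable, Dict, List

def pmi_decode_step(
    cond_logprobs: Callable[[str, List[str]], Dict[str, float]],  # log p(tok | source, prefix)
    lm_logprobs: Callable[[List[str]], Dict[str, float]],         # log p(tok | prefix), source-free
    source: str,
    prefix: List[str],
    entropy_threshold: float = 3.0,   # assumed uncertainty cutoff, in nats
) -> str:
    """Choose the next token, switching to a PMI objective when the model is uncertain."""
    cond = cond_logprobs(source, prefix)
    # Entropy of the conditional next-token distribution serves as the uncertainty signal.
    entropy = -sum(math.exp(lp) * lp for lp in cond.values())
    if entropy > entropy_threshold:
        # Uncertain: maximize log p(tok | source, prefix) - log p(tok | prefix),
        # i.e. the pointwise mutual information of source and target token.
        marg = lm_logprobs(prefix)
        return max(cond, key=lambda tok: cond[tok] - marg.get(tok, -30.0))  # -30.0: low-probability floor
    # Confident: ordinary maximum-likelihood token selection.
    return max(cond, key=cond.get)
```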