Don't Say What You Don't Know: Improving the Consistency of Abstractive
Summarization by Constraining Beam Search
- URL: http://arxiv.org/abs/2203.08436v2
- Date: Fri, 17 Nov 2023 16:46:38 GMT
- Title: Don't Say What You Don't Know: Improving the Consistency of Abstractive
Summarization by Constraining Beam Search
- Authors: Daniel King, Zejiang Shen, Nishant Subramani, Daniel S. Weld, Iz
Beltagy, Doug Downey
- Abstract summary: We analyze the connection between hallucinations and training data, and find evidence that models hallucinate because they train on target summaries that are unsupported by the source.
We present PINOCCHIO, a new decoding method that improves the consistency of a transformer-based abstractive summarizer by constraining beam search to avoid hallucinations.
- Score: 54.286450484332505
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Abstractive summarization systems today produce fluent and relevant output,
but often "hallucinate" statements not supported by the source text. We analyze
the connection between hallucinations and training data, and find evidence that
models hallucinate because they train on target summaries that are unsupported
by the source. Based on our findings, we present PINOCCHIO, a new decoding
method that improves the consistency of a transformer-based abstractive
summarizer by constraining beam search to avoid hallucinations. Given the model
states and outputs at a given step, PINOCCHIO detects likely model
hallucinations based on various measures of attribution to the source text.
PINOCCHIO backtracks to find more consistent output, and can opt to produce no
summary at all when no consistent generation can be found. In experiments, we
find that PINOCCHIO improves the consistency of generation (in terms of F1) by
an average of 67% on two abstractive summarization datasets.
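To make the description above concrete, here is a minimal, hypothetical sketch of constrained decoding with backtracking in the spirit of PINOCCHIO; it is not the authors' algorithm. The callables `next_candidates` and `attribution_score` are placeholders for the summarizer's top-k next-token proposals and a source-attribution check (which the paper derives from model states and outputs), the threshold, backtrack budget, and EOS handling are assumed, and greedy search with backtracking stands in for the paper's constrained beam search.

```python
# Hedged sketch only: constrained decoding with backtracking, loosely in the
# spirit of PINOCCHIO. All names below (next_candidates, attribution_score,
# source_threshold, max_backtracks, EOS) are placeholders, not the paper's API.
from typing import Callable, List, Optional, Sequence, Tuple

EOS = "</s>"  # assumed end-of-summary token

def constrained_decode(
    next_candidates: Callable[[Sequence[str]], List[Tuple[str, float]]],
    attribution_score: Callable[[Sequence[str], str], float],
    source_threshold: float = 0.5,
    max_len: int = 64,
    max_backtracks: int = 10,
) -> Optional[List[str]]:
    """Greedy decoding with backtracking; returns None if no consistent output is found."""
    prefix: List[str] = []
    alternatives: List[List[Tuple[str, float]]] = []  # untried candidates per position
    backtracks = 0
    while len(prefix) < max_len:
        if prefix and prefix[-1] == EOS:
            return prefix[:-1]                         # finished a consistent summary
        candidates = next_candidates(prefix)           # (token, log-prob) pairs
        # Keep only candidates judged attributable to the source text.
        viable = sorted(
            [(t, lp) for t, lp in candidates
             if t == EOS or attribution_score(prefix, t) >= source_threshold],
            key=lambda x: x[1], reverse=True)
        if viable:
            tok, _ = viable[0]
            prefix.append(tok)
            alternatives.append(viable[1:])
            continue
        # Dead end: no candidate is attributable to the source, so back up.
        backtracks += 1
        if backtracks > max_backtracks:
            return None                                # opt out: produce no summary at all
        while alternatives and not alternatives[-1]:
            alternatives.pop()
            prefix.pop()
        if not alternatives:
            return None
        tok, _ = alternatives[-1].pop(0)
        prefix[-1] = tok                               # try the next-best earlier choice
    return prefix
```

The key behaviors mirror the abstract: unattributable candidates are pruned at each step, the search backs up when a step has no consistent continuation, and the decoder can return None, i.e., decline to produce a summary.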
Related papers
- Alleviating Hallucinations of Large Language Models through Induced
Hallucinations [67.35512483340837]
Large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information.
We propose a simple Induce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations.
arXiv Detail & Related papers (2023-12-25T12:32:49Z)
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data [102.56792377624927]
Hallucinations inherent in machine-generated data remain under-explored.
We present a novel hallucination detection and elimination framework, HalluciDoctor, based on the cross-checking paradigm.
Our method mitigates 44.6% of hallucinations (a relative reduction) and maintains competitive performance compared to LLaVA.
arXiv Detail & Related papers (2023-11-22T04:52:58Z)
- Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization [37.55557353462219]
Pruning is a technique that reduces model size by removing redundant weights, enabling more efficient sparse inference.
This paper provides an empirical study across five summarization datasets, two state-of-the-art pruning methods, and five instruction-tuned LLMs.
Surprisingly, we find that hallucinations are less prevalent in summaries from pruned LLMs than in those from the original models.
arXiv Detail & Related papers (2023-11-15T19:49:24Z)
- Hallucination Reduction in Long Input Text Summarization [2.6745438139282283]
Hallucination in text summarization poses significant obstacles to the accuracy and reliability of the generated summaries.
We have incorporated the techniques of data filtering and joint entity and summary generation (JAENS) in the fine-tuning of the Longformer Encoder-Decoder (LED) model.
Our experiments show that the fine-tuned LED model performs well in generating the paper abstract.
arXiv Detail & Related papers (2023-09-28T18:22:16Z)
- Improved Beam Search for Hallucination Mitigation in Abstractive
Summarization [1.2328446298523066]
In this paper, we investigate the use of the Natural Language Inference (NLI) entailment metric to detect and prevent hallucinations in summary generation.
We propose an NLI-assisted beam re-ranking mechanism by computing entailment probability scores between the input context and summarization model-generated beams.
Our proposed algorithm significantly outperforms vanilla beam decoding on the XSum and CNN/DM datasets (a minimal sketch of this re-ranking step appears after the list below).
arXiv Detail & Related papers (2022-12-06T02:33:47Z)
- Mutual Information Alleviates Hallucinations in Abstractive
Summarization [73.48162198041884]
We find a simple criterion under which models are significantly more likely to assign more probability to hallucinated content during generation: high model uncertainty.
This finding offers a potential explanation for hallucinations: when uncertain about a continuation, models default to favoring text with high marginal probability.
We propose a decoding strategy that switches to optimizing for the pointwise mutual information of the source and target token, rather than purely the probability of the target token, when the model exhibits uncertainty (a rough sketch of this switch appears after the list below).
arXiv Detail & Related papers (2022-10-24T13:30:54Z)
- Inspecting the Factuality of Hallucinated Entities in Abstractive
Summarization [36.052622624166894]
State-of-the-art abstractive summarization systems often generate hallucinations, i.e., content that is not directly inferable from the source text.
We propose a novel detection approach that separates factual from non-factual hallucinations of entities.
arXiv Detail & Related papers (2021-08-30T15:40:52Z)
- Improving Faithfulness in Abstractive Summarization with Contrast
Candidate Generation and Selection [54.38512834521367]
We study contrast candidate generation and selection as a model-agnostic post-processing technique.
We learn a discriminative correction model by generating alternative candidate summaries.
This model is then used to select the best candidate as the final output summary.
arXiv Detail & Related papers (2021-04-19T05:39:24Z)
- Detecting Hallucinated Content in Conditional Neural Sequence Generation [165.68948078624499]
We propose a task to predict whether each token in the output sequence is hallucinated (not contained in the input).
We also introduce a method for learning to detect hallucinations using pretrained language models fine tuned on synthetic data.
arXiv Detail & Related papers (2020-11-05T00:18:53Z)
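As referenced above, the NLI-assisted beam re-ranking of "Improved Beam Search for Hallucination Mitigation in Abstractive Summarization" can be sketched roughly as follows; the `entailment_prob` callable (e.g., an off-the-shelf NLI classifier applied to source/summary pairs) and the `alpha` weight are assumptions, not the paper's exact scoring.

```python
# Hedged sketch of NLI-assisted beam re-ranking: rescore finished beams by
# combining model log-probability with an entailment score for the source.
from typing import Callable, List, Tuple

def rerank_beams(
    source: str,
    beams: List[Tuple[str, float]],                 # (summary, log-prob) pairs from beam search
    entailment_prob: Callable[[str, str], float],   # P(source entails summary); placeholder scorer
    alpha: float = 1.0,                             # assumed weighting between the two terms
) -> List[Tuple[str, float]]:
    """Return beams sorted by log-probability plus a weighted entailment bonus."""
    scored = [
        (summary, logprob + alpha * entailment_prob(source, summary))
        for summary, logprob in beams
    ]
    return sorted(scored, key=lambda x: x[1], reverse=True)
```

In practice the entailment scorer would be a pretrained NLI model, and the combination rule may differ from this simple additive form.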
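Similarly, the uncertainty-triggered PMI objective of "Mutual Information Alleviates Hallucinations in Abstractive Summarization" admits a rough sketch like the one below; the entropy threshold, the `cond_logprobs` / `lm_logprobs` callables, and the exact form of the switch are assumptions rather than the paper's formulation.

```python
# Hedged sketch of uncertainty-triggered PMI decoding: when the next-token
# distribution is high-entropy, prefer tokens informative about the source.
import math
from typing import Callable, Dict, List

def pmi_decode_step(
    cond_logprobs: Callable[[str, List[str]], Dict[str, float]],  # log p(tok | source, prefix)
    lm_logprobs: Callable[[List[str]], Dict[str, float]],         # log p(tok | prefix), source-free
    source: str,
    prefix: List[str],
    entropy_threshold: float = 3.0,   # assumed uncertainty cutoff, in nats
) -> str:
    """Choose the next token, switching to a PMI objective when the model is uncertain."""
    cond = cond_logprobs(source, prefix)
    # Entropy of the conditional next-token distribution serves as the uncertainty signal.
    entropy = -sum(math.exp(lp) * lp for lp in cond.values())
    if entropy > entropy_threshold:
        # Uncertain: maximize log p(tok | source, prefix) - log p(tok | prefix),
        # i.e. the pointwise mutual information of source and target token.
        marg = lm_logprobs(prefix)
        return max(cond, key=lambda tok: cond[tok] - marg.get(tok, -30.0))  # -30.0: low-probability floor
    # Confident: ordinary maximum-likelihood token selection.
    return max(cond, key=cond.get)
```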