Extractive is not Faithful: An Investigation of Broad Unfaithfulness
Problems in Extractive Summarization
- URL: http://arxiv.org/abs/2209.03549v2
- Date: Tue, 30 May 2023 02:06:11 GMT
- Title: Extractive is not Faithful: An Investigation of Broad Unfaithfulness
Problems in Extractive Summarization
- Authors: Shiyue Zhang, David Wan, Mohit Bansal
- Abstract summary: In this work, we define a typology with five types of broad unfaithfulness problems that can appear in extractive summaries.
We ask humans to label these problems out of 1600 English summaries produced by 16 diverse extractive systems.
To automatically detect these problems, we find that 5 existing faithfulness evaluation metrics for summarization have poor correlations with human judgment.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problems of unfaithful summaries have been widely discussed under the
context of abstractive summarization. Though extractive summarization is less
prone to the common unfaithfulness issues of abstractive summaries, does that
mean extractive is equal to faithful? Turns out that the answer is no. In this
work, we define a typology with five types of broad unfaithfulness problems
(including and beyond not-entailment) that can appear in extractive summaries,
including incorrect coreference, incomplete coreference, incorrect discourse,
incomplete discourse, as well as other misleading information. We ask humans to
label these problems out of 1600 English summaries produced by 16 diverse
extractive systems. We find that 30% of the summaries have at least one of the
five issues. To automatically detect these problems, we find that 5 existing
faithfulness evaluation metrics for summarization have poor correlations with
human judgment. To remedy this, we propose a new metric, ExtEval, that is
designed for detecting unfaithful extractive summaries and is shown to have the
best performance. We hope our work can increase the awareness of unfaithfulness
problems in extractive summarization and help future work to evaluate and
resolve these issues. Our data and code are publicly available at
https://github.com/ZhangShiyue/extractive_is_not_faithful
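The paper's comparison of automatic metrics against human labels comes down to rank correlation. As a minimal, self-contained sketch (the metric scores and human labels below are invented for illustration; the paper's actual evaluation uses its 1600 annotated summaries), Spearman correlation between a metric's scores and binary human judgments can be computed as:

```python
# Spearman rank correlation between automatic metric scores and
# human faithfulness judgments (toy data, for illustration only).

def ranks(xs):
    """1-based ranks, with ties assigned the average of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied positions, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

# Hypothetical metric scores (higher = judged more faithful) vs. human
# labels (1 = faithful, 0 = at least one of the five problems present).
metric_scores = [0.9, 0.2, 0.7, 0.4, 0.8]
human_labels  = [1,   0,   1,   0,   1]
print(round(spearman(metric_scores, human_labels), 3))  # → 0.866
```

A metric with "poor correlation" in the paper's sense would score near zero on this statistic; the hand-rolled rank computation is only to keep the sketch dependency-free.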
Related papers
- FABLES: Evaluating faithfulness and content selection in book-length summarization [55.50680057160788]
In this paper, we conduct the first large-scale human evaluation of faithfulness and content selection on book-length documents.
We collect FABLES, a dataset of annotations on 3,158 claims made in LLM-generated summaries of 26 books, at a cost of $5.2K USD.
An analysis of the annotations reveals that most unfaithful claims relate to events and character states, and they generally require indirect reasoning over the narrative to invalidate.
arXiv Detail & Related papers (2024-04-01T17:33:38Z)
- On Context Utilization in Summarization with Large Language Models [83.84459732796302]
Large language models (LLMs) excel in abstractive summarization tasks, delivering fluent and pertinent summaries.
Recent advancements have extended their capabilities to handle long-input contexts, exceeding 100k tokens.
We conduct the first comprehensive study on context utilization and position bias in summarization.
arXiv Detail & Related papers (2023-10-16T16:45:12Z)
- Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization [44.835811239393244]
Sentence summarization shortens given texts while maintaining core contents of the texts.
Unsupervised approaches have been studied to summarize texts without human-written summaries.
We devise an abstractive model based on reinforcement learning without ground-truth summaries.
arXiv Detail & Related papers (2022-12-21T08:34:28Z)
- Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences [38.919090721583075]
We show that factual inconsistency can be caused by irrelevant parts of the input text, which act as confounders.
We design a simple multi-task model to control such confounding by leveraging human-annotated relevant sentences when available.
Our approach improves faithfulness scores by 20% over strong baselines on the AnswerSumm dataset (Fabbri et al., 2021).
arXiv Detail & Related papers (2022-12-19T18:51:06Z)
- Salience Allocation as Guidance for Abstractive Summarization [61.31826412150143]
We propose a novel summarization approach with flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON).
SEASON uses the allocation of salience expectation to guide abstractive summarization and adapts well to articles with different levels of abstractiveness.
arXiv Detail & Related papers (2022-10-22T02:13:44Z)
- AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization [73.91543616777064]
Community Question Answering (CQA) fora such as Stack Overflow and Yahoo! Answers contain a rich resource of answers to a wide range of community-based questions.
One goal of answer summarization is to produce a summary that reflects the range of answer perspectives.
This work introduces a novel dataset of 4,631 CQA threads for answer summarization, curated by professional linguists.
arXiv Detail & Related papers (2021-11-11T21:48:02Z)
- FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization [34.2456005415483]
We tackle the problem of evaluating faithfulness of a generated summary given its source document.
We find that current models exhibit a trade-off between abstractiveness and faithfulness.
We propose an automatic question answering (QA) based metric for faithfulness.
arXiv Detail & Related papers (2020-05-07T21:00:08Z)
- At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization [110.54963847339775]
We show that issues of unnecessary content and redundancy arise when extracting full sentences.
We propose extracting sub-sentential units based on the constituency parsing tree.
arXiv Detail & Related papers (2020-04-06T13:35:10Z)
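The QA-based faithfulness idea behind FEQA (listed above) can be caricatured in a few lines: pose questions answerable from the summary, answer them again from the source document, and score how well the answers agree. The QA pairs and the token-overlap matcher below are simplifications invented for illustration; the real framework generates both questions and answers with learned models.

```python
# Toy sketch of a QA-based faithfulness check: compare answers derived
# from the summary against answers derived from the source document.
# Real FEQA uses learned question-generation and QA models; here the
# answers are hand-written and matching is simple token-overlap F1.

def token_f1(pred, gold):
    """Token-overlap F1 between two answer strings."""
    p, g = pred.lower().split(), gold.lower().split()
    common = sum(min(p.count(t), g.count(t)) for t in set(p))
    if common == 0:
        return 0.0
    precision = common / len(p)
    recall = common / len(g)
    return 2 * precision * recall / (precision + recall)

def faithfulness_score(qa_pairs):
    """Average agreement over (answer_from_summary, answer_from_source) pairs."""
    return sum(token_f1(s, d) for s, d in qa_pairs) / len(qa_pairs)

# Hypothetical QA answer pairs for one summary.
pairs = [
    ("the prime minister", "the prime minister"),  # answers agree
    ("in 2019",            "in 2021"),             # answers disagree
]
print(faithfulness_score(pairs))  # → 0.75
```

A low average indicates that the summary supports answers the source does not, which is the signal such metrics use to flag unfaithful content.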
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.