Improving Faithfulness of Abstractive Summarization by Controlling
Confounding Effect of Irrelevant Sentences
- URL: http://arxiv.org/abs/2212.09726v2
- Date: Thu, 18 Jan 2024 19:27:04 GMT
- Title: Improving Faithfulness of Abstractive Summarization by Controlling
Confounding Effect of Irrelevant Sentences
- Authors: Asish Ghoshal, Arash Einolghozati, Ankit Arun, Haoran Li, Lili Yu,
Vera Gor, Yashar Mehdad, Scott Wen-tau Yih, Asli Celikyilmaz
- Abstract summary: We show that factual inconsistency can be caused by irrelevant parts of the input text, which act as confounders.
We design a simple multi-task model to control such confounding by leveraging human-annotated relevant sentences when available.
Our approach improves faithfulness scores by 20% over strong baselines on the AnswerSumm dataset \citep{fabbri2021answersumm}.
- Score: 38.919090721583075
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Lack of factual correctness is an issue that still plagues state-of-the-art
summarization systems despite their impressive progress on generating seemingly
fluent summaries. In this paper, we show that factual inconsistency can be
caused by irrelevant parts of the input text, which act as confounders. To that
end, we leverage information-theoretic measures of causal effects to quantify
the amount of confounding and precisely quantify how it affects
summarization performance. Based on insights derived from our theoretical
results, we design a simple multi-task model to control such confounding by
leveraging human-annotated relevant sentences when available. Crucially, we
give a principled characterization of data distributions where such confounding
can be large, thereby necessitating the use of human-annotated relevant
sentences to generate factual summaries. Our approach improves faithfulness
scores by 20\% over strong baselines on AnswerSumm
\citep{fabbri2021answersumm}, a conversation summarization dataset where lack
of faithfulness is a significant issue due to the subjective nature of the
task. Our best method achieves the highest faithfulness score while also
achieving state-of-the-art results on standard metrics like ROUGE and METEOR.
We corroborate these improvements through human evaluation.
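The abstract does not spell out the multi-task objective, so the following is only a minimal sketch of one plausible form: a summary generation loss combined with an auxiliary loss for predicting which input sentences are relevant, with the auxiliary term skipped when no human annotation exists. The function names, the `weight` parameter, and the use of binary cross-entropy are all assumptions made for illustration, not the authors' implementation.

```python
import math


def binary_cross_entropy(probs, labels):
    """Mean binary cross-entropy between predicted relevance
    probabilities and 0/1 human relevance labels."""
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def multitask_loss(summary_nll, relevance_probs, relevance_labels=None, weight=0.5):
    """Hypothetical combined objective (an assumption, not from the paper):
    generation loss plus a weighted sentence-relevance loss.

    summary_nll      -- negative log-likelihood of the reference summary
    relevance_probs  -- per-sentence predicted probability of relevance
    relevance_labels -- per-sentence 0/1 annotations, or None if unavailable
    weight           -- illustrative trade-off between the two tasks
    """
    if relevance_labels is None:
        # No human annotation for this example: train on generation only.
        return summary_nll
    return summary_nll + weight * binary_cross_entropy(relevance_probs, relevance_labels)


# Example with annotations available for two input sentences.
loss_with_labels = multitask_loss(2.0, [0.9, 0.1], [1, 0])
# Example without annotations: falls back to the generation loss alone.
loss_without_labels = multitask_loss(2.0, [0.9, 0.1])
```

Making the relevance term optional mirrors the abstract's phrase "when available": examples without annotated relevant sentences still contribute to the generation loss.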
Related papers
- FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction [85.26780391682894]
We propose Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction (FENICE).
FENICE leverages an NLI-based alignment between information in the source document and a set of atomic facts, referred to as claims, extracted from the summary.
Our metric sets a new state of the art on AGGREFACT, the de-facto benchmark for factuality evaluation.
(2024-03-04T17:57:18Z)
- AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation [57.8363998797433]
We propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs).
Our approach parses factually consistent summaries into AMR graphs and injects controlled factual inconsistencies to create negative examples, allowing for coherent factually inconsistent summaries to be generated with high error-type coverage.
(2023-11-16T02:56:29Z)
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress on textual entailment models to address this problem for abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
(2023-05-31T21:04:04Z)
- Interpretable Automatic Fine-grained Inconsistency Detection in Text Summarization [56.94741578760294]
We propose the task of fine-grained inconsistency detection, the goal of which is to predict the fine-grained types of factual errors in a summary.
Motivated by how humans inspect factual inconsistency in summaries, we propose an interpretable fine-grained inconsistency detection model, FineGrainFact.
(2023-05-23T22:11:47Z)
- Questioning the Validity of Summarization Datasets and Improving Their Factual Consistency [14.974996886744083]
We release SummFC, a filtered summarization dataset with improved factual consistency.
We argue that our dataset should become a valid benchmark for developing and evaluating summarization systems.
(2022-10-31T15:04:20Z)
- CONFIT: Toward Faithful Dialogue Summarization with Linguistically-Informed Contrastive Fine-tuning [5.389540975316299]
Factual inconsistencies in generated summaries severely limit the practical applications of abstractive dialogue summarization.
We provide a typology of factual errors with annotation data to highlight the types of errors and move away from a binary understanding of factuality.
We propose a training strategy that improves the factual consistency and overall quality of summaries via a novel contrastive fine-tuning, called ConFiT.
(2021-12-16T09:08:40Z)
- Factual Consistency Evaluation for Text Summarization via Counterfactual Estimation [42.63902468258758]
We propose a novel metric to evaluate the factual consistency in text summarization via counterfactual estimation.
We conduct a series of experiments on three public abstractive text summarization datasets.
(2021-08-30T11:48:41Z)
- Improving Factual Consistency of Abstractive Summarization via Question Answering [25.725873545789046]
We present an approach to address factual consistency in summarization.
We first propose an efficient automatic evaluation metric to measure factual consistency.
We then propose a novel learning algorithm that maximizes the proposed metric during model training.
(2021-05-10T19:07:21Z)
- Enhancing Factual Consistency of Abstractive Summarization [57.67609672082137]
We propose a fact-aware summarization model FASum to extract and integrate factual relations into the summary generation process.
We then design a factual corrector model FC to automatically correct factual errors from summaries generated by existing systems.
(2020-03-19T07:36:10Z)