Factual Consistency Evaluation for Text Summarization via Counterfactual
Estimation
- URL: http://arxiv.org/abs/2108.13134v1
- Date: Mon, 30 Aug 2021 11:48:41 GMT
- Title: Factual Consistency Evaluation for Text Summarization via Counterfactual
Estimation
- Authors: Yuexiang Xie, Fei Sun, Yang Deng, Yaliang Li, Bolin Ding
- Abstract summary: We propose a novel metric to evaluate the factual consistency in text summarization via counterfactual estimation.
We conduct a series of experiments on three public abstractive text summarization datasets.
- Score: 42.63902468258758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although significant progress has been achieved in text summarization, factual
inconsistency in generated summaries still severely limits its practical
applications. Among the key factors to ensure factual consistency, a reliable
automatic evaluation metric is the first and the most crucial one. However,
existing metrics either neglect the intrinsic cause of the factual
inconsistency or rely on auxiliary tasks, leading to unsatisfactory correlation
with human judgments or inconvenient usage in practice. In
light of these challenges, we propose a novel metric to evaluate the factual
consistency in text summarization via counterfactual estimation, which
formulates the causal relationship among the source document, the generated
summary, and the language prior. We remove the effect of language prior, which
can cause factual inconsistency, from the total causal effect on the generated
summary, which provides a simple yet effective way to evaluate consistency
without relying on other auxiliary tasks. We conduct a series of experiments on
three public abstractive text summarization datasets, and demonstrate the
advantages of the proposed metric in both improving the correlation with human
judgments and the convenience of usage. The source code is available at
https://github.com/xieyxclack/factual_coco.
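To make the counterfactual-estimation idea concrete, the sketch below scores each summary token by comparing its log-probability when conditioned on the source document with its log-probability under a counterfactual source, so that the language-prior contribution is subtracted from the total effect. This is only a minimal illustration, not the authors' released implementation (see the repository above): the BART checkpoint and the use of an empty string as the counterfactual source are assumptions made here for brevity.

```python
# Minimal sketch of counterfactual estimation for factual consistency:
# compare each summary token's log-probability given the real source with
# its log-probability given a counterfactual (here: empty) source.
# Illustrative only; not the official factual_coco implementation.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

model_name = "facebook/bart-large-cnn"  # assumed summarization checkpoint
tokenizer = BartTokenizer.from_pretrained(model_name)
model = BartForConditionalGeneration.from_pretrained(model_name).eval()

def token_logprobs(source: str, summary: str) -> torch.Tensor:
    """Log-probability of each summary token given a (possibly counterfactual) source."""
    enc = tokenizer(source, return_tensors="pt", truncation=True)
    dec = tokenizer(summary, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(input_ids=enc.input_ids,
                       attention_mask=enc.attention_mask,
                       labels=dec.input_ids).logits
    logprobs = torch.log_softmax(logits, dim=-1)
    # Pick out the log-probability assigned to each actual summary token.
    return logprobs[0].gather(1, dec.input_ids[0].unsqueeze(-1)).squeeze(-1)

def coco_style_score(document: str, summary: str) -> float:
    """Total effect (summary given the document) minus the language-prior
    effect (summary given an empty, counterfactual source), averaged over tokens."""
    with_doc = token_logprobs(document, summary)
    without_doc = token_logprobs("", summary)  # assumed counterfactual: no source
    return (with_doc - without_doc).mean().item()
```

Under this reading, a higher score means the document, rather than the language prior alone, is driving the summary tokens, which is the signal the metric associates with factual consistency.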
Related papers
- Using Similarity to Evaluate Factual Consistency in Summaries [2.7595794227140056]
Abstractive summarisers generate fluent summaries, but the factuality of the generated text is not guaranteed.
We propose a new zero-shot factuality evaluation metric, Sentence-BERTScore (SBERTScore), which compares sentences between the summary and the source document.
Our experiments indicate that each technique has different strengths, with SBERTScore particularly effective in identifying correct summaries.
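As a rough sketch of this kind of sentence-level comparison (an illustration in the spirit of SBERTScore, not the authors' exact formulation), one can embed summary and source sentences with Sentence-BERT and take, for each summary sentence, its best cosine match in the source; the encoder checkpoint and the max-then-average aggregation are assumptions made here.

```python
# Illustrative sentence-level similarity between summary and source using
# Sentence-BERT embeddings; checkpoint and aggregation are assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder checkpoint

def sbert_style_score(source_sentences, summary_sentences):
    """For each summary sentence, take its best cosine match among the
    source sentences, then average over the summary."""
    src_emb = model.encode(source_sentences, convert_to_tensor=True)
    sum_emb = model.encode(summary_sentences, convert_to_tensor=True)
    sims = util.cos_sim(sum_emb, src_emb)        # shape: [num_summary, num_source]
    return sims.max(dim=1).values.mean().item()  # average best-match similarity
```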
arXiv Detail & Related papers (2024-09-23T15:02:38Z)
- FENICE: Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction [85.26780391682894]
We propose Factuality Evaluation of summarization based on Natural language Inference and Claim Extraction (FENICE).
FENICE leverages an NLI-based alignment between information in the source document and a set of atomic facts, referred to as claims, extracted from the summary.
Our metric sets a new state of the art on AGGREFACT, the de facto benchmark for factuality evaluation.
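A hedged sketch of the NLI-based alignment idea follows: each extracted claim is scored against source sentences with an off-the-shelf NLI model, and the best entailment probability is kept. Claim extraction is assumed to have been done already, and the NLI checkpoint and max-over-sentences aggregation are illustrative choices rather than the FENICE implementation.

```python
# Illustrative NLI alignment: score each claim against source sentences and
# keep the best entailment probability. Checkpoint and aggregation are assumed.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

nli_name = "roberta-large-mnli"  # labels: 0 contradiction, 1 neutral, 2 entailment
tok = AutoTokenizer.from_pretrained(nli_name)
nli = AutoModelForSequenceClassification.from_pretrained(nli_name).eval()

def entailment_prob(premise: str, hypothesis: str) -> float:
    """Probability that the premise entails the hypothesis."""
    inputs = tok(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(nli(**inputs).logits, dim=-1)
    return probs[0, 2].item()  # index 2 = entailment for roberta-large-mnli

def claim_level_score(source_sentences, claims):
    """Average over claims of the best entailment score across source sentences."""
    return sum(
        max(entailment_prob(src, claim) for src in source_sentences)
        for claim in claims
    ) / len(claims)
```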
arXiv Detail & Related papers (2024-03-04T17:57:18Z)
- AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation [57.8363998797433]
We propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs).
Our approach parses factually consistent summaries into AMR graphs and injects controlled factual inconsistencies to create negative examples, allowing for coherent factually inconsistent summaries to be generated with high error-type coverage.
arXiv Detail & Related papers (2023-11-16T02:56:29Z)
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress on textual entailment models to address this problem for abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
arXiv Detail & Related papers (2023-05-31T21:04:04Z)
- Counterfactual Debiasing for Generating Factually Consistent Text Summaries [46.88138136539263]
We construct causal graphs for abstractive text summarization and identify the intrinsic causes of the factual inconsistency.
We propose a debiasing framework, named CoFactSum, to alleviate the causal effects of these biases by counterfactual estimation.
Experiments on two widely-used summarization datasets demonstrate the effectiveness of CoFactSum.
arXiv Detail & Related papers (2023-05-18T06:15:45Z)
- Improving Faithfulness of Abstractive Summarization by Controlling Confounding Effect of Irrelevant Sentences [38.919090721583075]
We show that factual inconsistency can be caused by irrelevant parts of the input text, which act as confounders.
We design a simple multi-task model to control such confounding by leveraging human-annotated relevant sentences when available.
Our approach improves faithfulness scores by 20% over strong baselines on the AnswerSumm dataset (Fabbri et al., 2021).
arXiv Detail & Related papers (2022-12-19T18:51:06Z)
- Improving Factual Consistency of Abstractive Summarization via Question Answering [25.725873545789046]
We present an approach to address factual consistency in summarization.
We first propose an efficient automatic evaluation metric to measure factual consistency.
We then propose a novel learning algorithm that maximizes the proposed metric during model training.
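The general QA-based recipe can be sketched as follows: answer the same questions once from the source and once from the summary, then compare the two answers with token-level F1. Question generation is left out here (the questions are assumed to be given), and the QA checkpoint and the F1 comparison are illustrative assumptions rather than the metric proposed in the paper.

```python
# Illustrative QA-based consistency check: answer the same questions from the
# source and from the summary and compare the answers. Checkpoint and answer
# comparison are assumptions; question generation is assumed to be done elsewhere.
from collections import Counter
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

def token_f1(a: str, b: str) -> float:
    """Token-overlap F1 between two answer strings."""
    a_tokens, b_tokens = a.lower().split(), b.lower().split()
    common = sum((Counter(a_tokens) & Counter(b_tokens)).values())
    if common == 0:
        return 0.0
    precision = common / len(a_tokens)
    recall = common / len(b_tokens)
    return 2 * precision * recall / (precision + recall)

def qa_consistency(source: str, summary: str, questions) -> float:
    """Average answer agreement when answering from the summary vs. the source."""
    scores = []
    for q in questions:
        ans_src = qa(question=q, context=source)["answer"]
        ans_sum = qa(question=q, context=summary)["answer"]
        scores.append(token_f1(ans_src, ans_sum))
    return sum(scores) / len(scores)
```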
arXiv Detail & Related papers (2021-05-10T19:07:21Z)
- Enhancing Factual Consistency of Abstractive Summarization [57.67609672082137]
We propose a fact-aware summarization model FASum to extract and integrate factual relations into the summary generation process.
We then design a factual corrector model FC to automatically correct factual errors from summaries generated by existing systems.
arXiv Detail & Related papers (2020-03-19T07:36:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.