Improving Factual Error Correction for Abstractive Summarization via
Data Distillation and Conditional-generation Cloze
- URL: http://arxiv.org/abs/2402.08581v1
- Date: Tue, 13 Feb 2024 16:35:48 GMT
- Title: Improving Factual Error Correction for Abstractive Summarization via
Data Distillation and Conditional-generation Cloze
- Authors: Yiyang Li and Lei Li and Dingxin Hu and Xueyi Hao and Marina Litvak
and Natalia Vanetik and Yanquan Zhou
- Abstract summary: We first propose a novel factual error correction model FactCloze based on a conditional-generation cloze task.
We then propose a data distillation method to generate a more faithful summarization dataset SummDSC.
- Score: 11.589564922148913
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Improving factual consistency in abstractive summarization has been a focus
of current research. One promising approach is the post-editing method.
However, previous works have yet to make sufficient use of the factual factors in
summaries and suffer from the negative effects of their training datasets. In
this paper, we first propose a novel factual error correction model FactCloze
based on a conditional-generation cloze task. FactCloze can construct the
causality among factual factors while being able to determine whether the blank
can be answered or not. Then, we propose a data distillation method to generate
a more faithful summarization dataset SummDSC via multiple-dimensional
evaluation. We experimentally validate the effectiveness of our approach, which
leads to an improvement in multiple factual consistency metrics compared to
baselines.
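The two ideas in the abstract can be illustrated with a toy sketch (plain Python, no model; the function names, masking scheme, and stand-in metrics below are hypothetical illustrations, not the paper's implementation):

```python
# Toy illustration of (1) building a conditional-generation cloze input by
# blanking a factual span in the summary while conditioning on the source
# document, and (2) distilling a dataset by keeping only samples that pass
# several factual-consistency metrics ("multi-dimensional evaluation").

MASK = "<blank>"

def make_cloze(document: str, summary: str, span: str) -> dict:
    """Blank out `span` in the summary; a corrector model would then be
    asked to fill the blank given the document, or to flag it as
    unanswerable (the answerability decision FactCloze is said to make)."""
    if span not in summary:
        raise ValueError("span must occur in the summary")
    return {
        "input": f"document: {document} summary: {summary.replace(span, MASK, 1)}",
        "target": span,
    }

def distill(samples, metrics, thresholds):
    """Keep (document, summary) pairs whose score on every metric meets
    its threshold -- a crude stand-in for multi-dimensional filtering."""
    kept = []
    for doc, summ in samples:
        scores = {name: fn(doc, summ) for name, fn in metrics.items()}
        if all(scores[name] >= thresholds[name] for name in metrics):
            kept.append((doc, summ))
    return kept

if __name__ == "__main__":
    doc = "Apple reported revenue of $90 billion in Q1."
    summ = "Apple earned $90 billion in Q1."
    ex = make_cloze(doc, summ, "$90 billion")
    print(ex["input"])   # ... summary: Apple earned <blank> in Q1.
    print(ex["target"])  # $90 billion

    # Two stand-in "metrics": token overlap with the source, and a
    # trivial length check. Real filtering would use learned metrics.
    metrics = {
        "overlap": lambda d, s: len(set(d.split()) & set(s.split()))
                                / len(set(s.split())),
        "length": lambda d, s: 1.0 if len(s.split()) >= 4 else 0.0,
    }
    kept = distill([(doc, summ)], metrics, {"overlap": 0.5, "length": 1.0})
    print(len(kept))     # 1
```

In practice the cloze input would be fed to a seq2seq model and the metrics would be learned factuality scorers; the sketch only shows the data shapes involved.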
Related papers
- AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation [57.8363998797433]
We propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs)
Our approach parses factually consistent summaries into AMR graphs and injects controlled factual inconsistencies to create negative examples, allowing for coherent factually inconsistent summaries to be generated with high error-type coverage.
arXiv Detail & Related papers (2023-11-16T02:56:29Z) - Questioning the Validity of Summarization Datasets and Improving Their
Factual Consistency [14.974996886744083]
We release SummFC, a filtered summarization dataset with improved factual consistency.
We argue that our dataset should become a valid benchmark for developing and evaluating summarization systems.
arXiv Detail & Related papers (2022-10-31T15:04:20Z) - Correcting Diverse Factual Errors in Abstractive Summarization via
Post-Editing and Language Model Infilling [56.70682379371534]
We show that our approach vastly outperforms prior methods in correcting erroneous summaries.
Our model -- FactEdit -- improves factuality scores by over 11 points on CNN/DM and over 31 points on XSum.
arXiv Detail & Related papers (2022-10-22T07:16:19Z) - FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for
Abstractive Summarization [91.46015013816083]
We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning.
Our analysis suggests FactPEGASUS is more factual than using the original pre-training objective in zero-shot and few-shot settings.
arXiv Detail & Related papers (2022-05-16T17:39:14Z) - Factual Error Correction for Abstractive Summaries Using Entity
Retrieval [57.01193722520597]
We propose RFEC, an efficient factual error correction system based on an entity-retrieval post-editing process.
RFEC retrieves the evidence sentences from the original document by comparing the sentences with the target summary.
Next, RFEC detects the entity-level errors in the summaries by considering the evidence sentences and substitutes the wrong entities with the accurate entities from the evidence sentences.
arXiv Detail & Related papers (2022-04-18T11:35:02Z) - Factual Consistency Evaluation for Text Summarization via Counterfactual
Estimation [42.63902468258758]
We propose a novel metric to evaluate the factual consistency in text summarization via counterfactual estimation.
We conduct a series of experiments on three public abstractive text summarization datasets.
arXiv Detail & Related papers (2021-08-30T11:48:41Z) - Improving Factual Consistency of Abstractive Summarization via Question
Answering [25.725873545789046]
We present an approach to address factual consistency in summarization.
We first propose an efficient automatic evaluation metric to measure factual consistency.
We then propose a novel learning algorithm that maximizes the proposed metric during model training.
arXiv Detail & Related papers (2021-05-10T19:07:21Z) - Decomposed Adversarial Learned Inference [118.27187231452852]
We propose a novel approach, Decomposed Adversarial Learned Inference (DALI)
DALI explicitly matches prior and conditional distributions in both data and code spaces.
We validate the effectiveness of DALI on the MNIST, CIFAR-10, and CelebA datasets.
arXiv Detail & Related papers (2020-04-21T20:00:35Z) - Enhancing Factual Consistency of Abstractive Summarization [57.67609672082137]
We propose a fact-aware summarization model FASum to extract and integrate factual relations into the summary generation process.
We then design a factual corrector model FC to automatically correct factual errors from summaries generated by existing systems.
arXiv Detail & Related papers (2020-03-19T07:36:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.