Adversarial NLI for Factual Correctness in Text Summarisation Models
- URL: http://arxiv.org/abs/2005.11739v1
- Date: Sun, 24 May 2020 13:02:57 GMT
- Title: Adversarial NLI for Factual Correctness in Text Summarisation Models
- Authors: Mario Barrantes and Benedikt Herudek and Richard Wang
- Abstract summary: We apply the Adversarial NLI dataset to train an NLI model.
We show that the model has the potential to enhance factual correctness in abstractive summarization.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We apply the Adversarial NLI dataset to train an NLI model and show that the
model has the potential to enhance factual correctness in abstractive
summarization. We follow the work of Falke et al. (2019), which ranks multiple
generated summaries by the entailment probability between the source document
and each summary, and selects the summary with the highest entailment
probability. That earlier study concluded that current NLI models are not
sufficiently accurate for this ranking task. We show that Transformer models
fine-tuned on the new dataset achieve significantly higher accuracy and have
the potential to select a coherent summary.
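For illustration, the Falke et al. (2019) re-ranking step the abstract describes can be sketched as follows. This is a minimal sketch, not the paper's implementation: the checkpoint name and entailment label index are assumptions (the paper fine-tunes its own Transformer on Adversarial NLI).

```python
# Minimal sketch of entailment-based summary re-ranking (after Falke et al., 2019).
# Assumes an off-the-shelf MNLI-style checkpoint as a stand-in for the paper's
# ANLI-fine-tuned model.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"  # assumed stand-in checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
# Label index varies by checkpoint; fall back to 2 (ENTAILMENT for this model).
ENTAILMENT = model.config.label2id.get("ENTAILMENT", 2)

def entailment_prob(premise: str, hypothesis: str) -> float:
    """Probability that the premise (source document) entails the hypothesis (summary)."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1)[0, ENTAILMENT].item()

def rank_summaries(source: str, candidates: list[str]) -> str:
    """Select the candidate summary with the highest entailment probability."""
    return max(candidates, key=lambda s: entailment_prob(source, s))
```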
Related papers
- A synthetic data approach for domain generalization of NLI models [13.840374911669167]
Natural Language Inference (NLI) remains an important benchmark task for LLMs.
We show that synthetic high-quality datasets can adapt NLI models for zero-shot use in downstream applications.
We show that models trained on this data have the best generalization to completely new downstream test settings.
arXiv Detail & Related papers (2024-02-19T18:55:16Z)
- Calibrating Likelihoods towards Consistency in Summarization Models [22.023863165579602]
We argue that the main reason for such behavior is that summarization models trained with the maximum likelihood objective assign high probability to plausible sequences given the context, without necessarily favoring the factually consistent ones.
In this work, we address this problem by calibrating the likelihood of model-generated sequences to better align with a consistency metric measured by natural language inference (NLI) models.
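One way to picture the calibration idea is a ranking loss that pushes the model's sequence likelihoods to order candidates the same way an NLI consistency score does. The sketch below is an illustration of that objective under assumed inputs, not the paper's exact formulation; the margin value and pairwise form are assumptions.

```python
import torch

def calibration_loss(log_likelihoods: torch.Tensor,
                     nli_scores: torch.Tensor,
                     margin: float = 0.1) -> torch.Tensor:
    """Pairwise margin loss: candidates with higher NLI consistency should
    receive higher model log-likelihood. Both tensors have shape (num_candidates,)."""
    loss = torch.zeros(())
    n = log_likelihoods.size(0)
    for i in range(n):
        for j in range(n):
            if nli_scores[i] > nli_scores[j]:
                # Penalize when the less consistent candidate j is more likely.
                gap = log_likelihoods[i] - log_likelihoods[j]
                loss = loss + torch.clamp(margin - gap, min=0.0)
    return loss
```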
arXiv Detail & Related papers (2023-10-12T23:17:56Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
Under this robustness metric, a model is judged robust only if its performance is consistently accurate across the cliques.
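Read literally, this clique-level criterion can be computed by requiring correct predictions on every knowledge-invariant variant in a clique. A hedged sketch follows; the strict all-or-nothing aggregation is an assumption drawn from the summary, not the benchmark's exact metric.

```python
def clique_robust_accuracy(cliques: list[list[bool]]) -> float:
    """Each clique is a list of booleans: whether the model was correct on each
    knowledge-invariant variant. A clique counts as robust only if the model is
    correct on all of its variants (an assumed, strict reading of the metric)."""
    robust = sum(1 for clique in cliques if all(clique))
    return robust / len(cliques)

# Example: correct on both variants of clique 1, misses one variant of clique 2.
print(clique_robust_accuracy([[True, True], [True, False]]))  # 0.5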
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- mFACE: Multilingual Summarization with Factual Consistency Evaluation [79.60172087719356]
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
Despite promising results, current models still suffer from generating factually inconsistent summaries.
We leverage factual consistency evaluation models to improve multilingual summarization.
arXiv Detail & Related papers (2022-12-20T19:52:41Z)
- Correcting Diverse Factual Errors in Abstractive Summarization via Post-Editing and Language Model Infilling [56.70682379371534]
We show that our approach vastly outperforms prior methods in correcting erroneous summaries.
Our model -- FactEdit -- improves factuality scores by over 11 points on CNN/DM and over 31 points on XSum.
arXiv Detail & Related papers (2022-10-22T07:16:19Z)
- Falsesum: Generating Document-level NLI Examples for Recognizing Factual Inconsistency in Summarization [63.21819285337555]
We show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples.
We introduce Falsesum, a data generation pipeline leveraging a controllable text generation model to perturb human-annotated summaries.
We show that models trained on a Falsesum-augmented NLI dataset improve the state-of-the-art performance across four benchmarks for detecting factual inconsistency in summarization.
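The data format Falsesum produces is standard document-level NLI: the source document is the premise and a (possibly perturbed) summary is the hypothesis. A minimal sketch of assembling such examples is shown below; the `perturb` function is a placeholder for Falsesum's controllable generation model, and the label names are assumptions.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class NLIExample:
    premise: str      # full source document
    hypothesis: str   # gold or perturbed summary
    label: str        # "entailment" or "not_entailment"

def make_examples(document: str, gold_summary: str,
                  perturb: Callable[[str], str]) -> list[NLIExample]:
    """Pair the untouched summary with 'entailment' and a perturbed,
    factually inconsistent variant with 'not_entailment'."""
    return [
        NLIExample(document, gold_summary, "entailment"),
        NLIExample(document, perturb(gold_summary), "not_entailment"),
    ]
```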
arXiv Detail & Related papers (2022-05-12T10:43:42Z)
- MeetSum: Transforming Meeting Transcript Summarization using Transformers! [2.1915057426589746]
We utilize a Transformer-based Pointer Generator Network to generate abstractive summaries for meeting transcripts.
This model uses two LSTMs as the encoder and decoder, a Pointer network that copies words from the input text, and a Generator network that produces out-of-vocabulary words.
We show that training the model on a news summarization dataset and using zero-shot learning to test it on the meeting dataset produces better results than training it on the AMI meeting dataset.
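The copy mechanism described here is usually realized as the standard pointer-generator mixture of See et al. (2017), sketched below; the precomputed attention weights and generation probability `p_gen` are assumed inputs, and this is an illustration rather than MeetSum's exact implementation.

```python
import torch

def pointer_generator_dist(p_vocab: torch.Tensor,
                           attention: torch.Tensor,
                           source_ids: torch.Tensor,
                           p_gen: torch.Tensor) -> torch.Tensor:
    """Final word distribution: p_gen * P_vocab + (1 - p_gen) * copy distribution,
    where copying scatters attention mass onto the source tokens' vocabulary ids.
    Shapes: p_vocab (vocab,), attention (src_len,), source_ids (src_len,), p_gen scalar."""
    copy_dist = torch.zeros_like(p_vocab)
    copy_dist.scatter_add_(0, source_ids, attention)  # accumulate copy probability per id
    return p_gen * p_vocab + (1.0 - p_gen) * copy_dist
```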
arXiv Detail & Related papers (2021-08-13T16:34:09Z)
- Improving Zero and Few-Shot Abstractive Summarization with Intermediate Fine-tuning and Data Augmentation [101.26235068460551]
Models pretrained with self-supervised objectives on large text corpora achieve state-of-the-art performance on English text summarization tasks.
Models are typically fine-tuned on hundreds of thousands of data points, an infeasible requirement when applying summarization to new, niche domains.
We introduce a novel and generalizable method, called WikiTransfer, for fine-tuning pretrained models for summarization in an unsupervised, dataset-specific manner.
arXiv Detail & Related papers (2020-10-24T08:36:49Z)
- Enhancing Factual Consistency of Abstractive Summarization [57.67609672082137]
We propose a fact-aware summarization model, FASum, that extracts factual relations and integrates them into the summary generation process.
We then design a factual corrector model, FC, to automatically correct factual errors in summaries generated by existing systems.
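A rough way to picture this fact-aware pipeline: compare relation triples extracted from the summary against those extracted from the source and hand mismatches to a corrector. The triple sets and the toy extraction below are placeholders for illustration, not FASum's actual components.

```python
def check_summary_facts(source_triples: set, summary_triples: set) -> set:
    """Flag (subject, relation, object) triples asserted by the summary
    but absent from the source; these are candidates for correction."""
    return summary_triples - source_triples

# Toy example; a real system would extract triples with OpenIE-style tools.
src = {("company", "acquired", "startup"), ("deal", "worth", "$1B")}
summ = {("company", "acquired", "startup"), ("deal", "worth", "$2B")}
print(check_summary_facts(src, summ))  # {('deal', 'worth', '$2B')} -> send to corrector
```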
arXiv Detail & Related papers (2020-03-19T07:36:10Z)
- Abstractive Summarization for Low Resource Data using Domain Transfer and Data Synthesis [1.148539813252112]
We explore using domain transfer and data synthesis to improve the performance of recent abstractive summarization methods.
We show that tuning a state-of-the-art model trained on newspaper data can boost performance on student reflection data.
We also propose a template-based model to synthesize new data, which further increases ROUGE scores when incorporated into training.
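The template idea can be pictured with a toy synthesizer that slots domain phrases into summary templates to create extra (document, summary) training pairs. The templates and fillers below are invented for illustration; the paper derives its templates from the reflection corpus.

```python
import itertools

TEMPLATES = ["The student found {topic} difficult.",
             "The lecture on {topic} was unclear."]
TOPICS = ["recursion", "pointers"]  # hypothetical domain keywords

def synthesize_pairs():
    """Yield synthetic (document, summary) pairs from filled templates."""
    for template, topic in itertools.product(TEMPLATES, TOPICS):
        summary = template.format(topic=topic)
        document = f"Several reflections mentioned {topic}. {summary}"
        yield document, summary

for doc, summ in synthesize_pairs():
    print(doc, "->", summ)
```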
arXiv Detail & Related papers (2020-02-09T17:49:08Z)