MQAG: Multiple-choice Question Answering and Generation for Assessing
Information Consistency in Summarization
- URL: http://arxiv.org/abs/2301.12307v2
- Date: Thu, 7 Sep 2023 18:20:40 GMT
- Title: MQAG: Multiple-choice Question Answering and Generation for Assessing
Information Consistency in Summarization
- Authors: Potsawee Manakul, Adian Liusie, Mark J. F. Gales
- Abstract summary: State-of-the-art summarization systems can generate highly fluent summaries.
These summaries, however, may contain factual inconsistencies and/or information not present in the source.
We introduce an alternative scheme based on standard information-theoretic measures in which the information present in the source and summary is directly compared.
- Score: 55.60306377044225
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: State-of-the-art summarization systems can generate highly fluent summaries.
These summaries, however, may contain factual inconsistencies and/or
information not present in the source. Hence, an important component of
assessing the quality of summaries is to determine whether there is information
consistency between the source and the summary. Existing approaches are
typically based on lexical matching or representation-based methods. In this
work, we introduce an alternative scheme based on standard
information-theoretic measures in which the information present in the source
and summary is directly compared. We propose a Multiple-choice Question
Answering and Generation framework, MQAG, which approximates the information
consistency by computing the expected statistical distance between summary and
source answer distributions over automatically generated multiple-choice
questions. This approach exploits multiple-choice answer probabilities, as
predicted answer distributions can be compared. We conduct experiments on four
summary evaluation datasets: QAG-CNNDM/XSum, XSum-Hallucination, Podcast
Assessment, and SummEval. Experiments show that MQAG, using models trained on
SQuAD or RACE, outperforms existing evaluation methods on the majority of
tasks.
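To make the abstract's description concrete: MQAG approximates information consistency as the expected statistical distance, over automatically generated multiple-choice questions, between the answer distribution conditioned on the summary and the one conditioned on the source. Below is a minimal Python sketch of that computation, not the authors' implementation; the helpers generate_mc_questions and answer_distribution are hypothetical placeholders for a question generator and a multiple-choice answerer (the paper uses models trained on SQuAD or RACE), and total variation is used as one example of a statistical distance.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class MCQuestion:
    question: str
    options: List[str]  # e.g. four answer options

def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two answer distributions."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def mqag_score(
    source: str,
    summary: str,
    # Hypothetical helper: generates multiple-choice questions from a text.
    generate_mc_questions: Callable[[str], List[MCQuestion]],
    # Hypothetical helper: returns a probability over the options given a context.
    answer_distribution: Callable[[str, MCQuestion], List[float]],
    num_questions: int = 20,
) -> float:
    """Average statistical distance between summary- and source-conditioned
    answer distributions over generated multiple-choice questions."""
    questions = generate_mc_questions(summary)[:num_questions]
    if not questions:
        return 0.0
    distances = []
    for q in questions:
        p_summary = answer_distribution(summary, q)  # P(answer | question, summary)
        p_source = answer_distribution(source, q)    # P(answer | question, source)
        distances.append(total_variation(p_summary, p_source))
    # A lower average distance indicates higher information consistency
    # between the source and the summary.
    return sum(distances) / len(distances)
```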
Related papers
- Aspect-oriented Consumer Health Answer Summarization [2.298110639419913]
Community Question-Answering (CQA) forums have revolutionized how people seek information, especially information related to their healthcare needs.
There can be several answers in response to a single query, which makes it hard to grasp the key information related to the specific health concern.
Our research focuses on aspect-based summarization of health answers to address this limitation.
arXiv Detail & Related papers (2024-05-10T07:52:43Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- BRIO: Bringing Order to Abstractive Summarization [107.97378285293507]
We propose a novel training paradigm which assumes a non-deterministic distribution, so that candidate summaries are assigned probability mass according to their quality.
Our method achieves a new state-of-the-art result on the CNN/DailyMail (47.78 ROUGE-1) and XSum (49.07 ROUGE-1) datasets.
arXiv Detail & Related papers (2022-03-31T05:19:38Z)
- Abstractive Query Focused Summarization with Query-Free Resources [60.468323530248945]
In this work, we consider the problem of leveraging only generic summarization resources to build an abstractive QFS system.
We propose Marge, a Masked ROUGE Regression framework composed of a novel unified representation for summaries and queries.
Despite learning from minimal supervision, our system achieves state-of-the-art results in the distantly supervised setting.
arXiv Detail & Related papers (2020-12-29T14:39:35Z)
- Multi-hop Inference for Question-driven Summarization [39.08269647808958]
We propose a novel question-driven abstractive summarization method, Multi-hop Selective Generator (MSG).
MSG incorporates multi-hop reasoning into question-driven summarization and, at the same time, provides justifications for the generated summaries.
Experimental results show that the proposed method consistently outperforms state-of-the-art methods on two non-factoid QA datasets.
arXiv Detail & Related papers (2020-10-08T02:36:39Z) - FEQA: A Question Answering Evaluation Framework for Faithfulness
Assessment in Abstractive Summarization [34.2456005415483]
We tackle the problem of evaluating faithfulness of a generated summary given its source document.
We find that current models exhibit a trade-off between abstractiveness and faithfulness.
We propose an automatic question answering (QA) based metric for faithfulness.
arXiv Detail & Related papers (2020-05-07T21:00:08Z) - Asking and Answering Questions to Evaluate the Factual Consistency of
Summaries [80.65186293015135]
We propose an automatic evaluation protocol called QAGS (pronounced "kags") to identify factual inconsistencies in a generated summary.
QAGS is based on the intuition that if we ask questions about a summary and its source, we will receive similar answers if the summary is factually consistent with the source.
We believe QAGS is a promising tool in automatically generating usable and factually consistent text.
arXiv Detail & Related papers (2020-04-08T20:01:09Z) - Query Focused Multi-Document Summarization with Distant Supervision [88.39032981994535]
Existing work relies heavily on retrieval-style methods for estimating the relevance between queries and text segments.
We propose a coarse-to-fine modeling framework which introduces separate modules for estimating whether segments are relevant to the query.
We demonstrate that our framework outperforms strong comparison systems on standard QFS benchmarks.
arXiv Detail & Related papers (2020-04-06T22:35:19Z)