An Empirical Comparison of LM-based Question and Answer Generation Methods
- URL: http://arxiv.org/abs/2305.17002v1
- Date: Fri, 26 May 2023 14:59:53 GMT
- Title: An Empirical Comparison of LM-based Question and Answer Generation Methods
- Authors: Asahi Ushio and Fernando Alva-Manchego and Jose Camacho-Collados
- Abstract summary: Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Question and answer generation (QAG) consists of generating a set of
question-answer pairs given a context (e.g. a paragraph). This task has a
variety of applications, such as data augmentation for question answering (QA)
models, information retrieval and education. In this paper, we establish
baselines with three different QAG methodologies that leverage
sequence-to-sequence language model (LM) fine-tuning. Experiments show that an
end-to-end QAG model, which is computationally light at both training and
inference times, is generally robust and outperforms other more convoluted
approaches. However, there are differences depending on the underlying
generative LM. Finally, our analysis shows that QA models fine-tuned solely on
generated question-answer pairs can be competitive when compared to supervised
QA models trained on human-labeled data.
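As a concrete illustration of the end-to-end setup, here is a minimal inference sketch with Hugging Face transformers. The checkpoint name (lmqg/t5-base-squad-qag), the task prefix, and the flat "question: ..., answer: ..." output format joined by " | " are assumptions to verify against the authors' released code, not details stated in the abstract.
```python
# Minimal sketch of end-to-end QAG inference with a fine-tuned seq2seq LM.
# Checkpoint name, task prefix, and output format are assumptions; verify
# them against the authors' release before relying on them.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "lmqg/t5-base-squad-qag"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

context = ("William Turner was an English painter who specialised "
           "in watercolour landscapes.")
inputs = tokenizer(f"generate question and answer: {context}",
                   return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
generation = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# A single forward pass emits every pair; split them back apart.
for pair in generation.split(" | "):
    print(pair.strip())
```
One reason the end-to-end variant is computationally light is visible here: a single generation call produces all question-answer pairs for a context, rather than one call per answer candidate.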
Related papers
- QOG: Question and Options Generation based on Language Model (arXiv, 2024-06-18)
Question-Options Generation (QOG) is the task of generating a set of question-options pairs given a context.
We develop QOG models using three different methods based on fine-tuning sequence-to-sequence language models (LMs).
- A Lightweight Method to Generate Unanswerable Questions in English (arXiv, 2023-10-30)
We examine a simpler data augmentation method for unanswerable question generation in English.
We perform antonym and entity swaps on answerable questions.
Compared to the prior state of the art, data generated with our training-free and lightweight strategy results in better models (a toy sketch of the swap idea follows this entry).
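As a toy illustration of the swap idea, the sketch below corrupts answerable questions with hand-written antonym and entity tables; the paper performs antonym and entity swaps with proper lexical resources, so the tables here are illustrative stand-ins only.
```python
# Toy sketch of antonym/entity swaps that turn answerable questions
# into likely-unanswerable ones. The swap tables are illustrative
# stand-ins for the lexical resources used in the paper.
ANTONYMS = {"largest": "smallest", "first": "last", "highest": "lowest"}
ENTITY_SWAPS = {"France": "Brazil", "Einstein": "Bohr"}

def make_unanswerable(question: str):
    """Return a corrupted question, or None if no swap applies."""
    for table in (ANTONYMS, ENTITY_SWAPS):
        for old, new in table.items():
            if old in question:
                # A single swap is enough to break the link between
                # the question and its supporting context.
                return question.replace(old, new, 1)
    return None

print(make_unanswerable("What is the largest city in France?"))
# -> "What is the smallest city in France?"
```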
- Learning Answer Generation using Supervision from Automatic Question Answering Evaluators (arXiv, 2023-05-24)
We propose a novel training paradigm for answer generation (GenQA) using supervision from automatic QA evaluation models (GAVA).
We evaluate our proposed methods on two academic and one industrial dataset, obtaining a significant improvement in answering accuracy over the previous state of the art.
- RoMQA: A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering (arXiv, 2022-10-25)
We introduce RoMQA, the first benchmark for robust, multi-evidence, multi-answer question answering (QA).
We evaluate state-of-the-art large language models in zero-shot, few-shot, and fine-tuning settings, and find that RoMQA is challenging in all of them.
The benchmark thus provides a quantifiable test for building more robust QA methods.
- Improving Unsupervised Question Answering via Summarization-Informed Question Generation (arXiv, 2021-09-16)
Question Generation (QG) is the task of generating a plausible question for a given ⟨passage, answer⟩ pair.
We make use of freely available news summary data, transforming declarative sentences into appropriate questions using dependency parsing, named entity recognition and semantic role labeling.
The resulting questions are then combined with the original news articles to train an end-to-end neural QG model (a minimal sketch of the sentence-to-question step follows this entry).
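The sketch below rewrites a declarative sentence into a question using NER alone (via spaCy); the paper's pipeline additionally uses dependency parsing and semantic role labeling, which this toy version omits.
```python
# Toy sketch of rewriting a declarative sentence into a QA pair using
# NER alone; the paper also uses dependency parsing and semantic role
# labeling, which are omitted here.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this spaCy model is installed

WH_WORDS = {"PERSON": "Who", "GPE": "Where", "ORG": "What organisation"}

def sentence_to_qa(sentence: str):
    """Return a (question, answer) pair, or None if no rule applies."""
    doc = nlp(sentence)
    for ent in doc.ents:
        wh = WH_WORDS.get(ent.label_)
        if wh and sentence.startswith(ent.text):
            # Replace a sentence-initial entity with a wh-word; the
            # removed entity becomes the gold answer.
            rest = sentence[len(ent.text):].rstrip(".")
            return f"{wh}{rest}?", ent.text
    return None

print(sentence_to_qa("Marie Curie won the Nobel Prize in Physics in 1903."))
# -> ("Who won the Nobel Prize in Physics in 1903?", "Marie Curie")
```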
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs (arXiv, 2020-05-28)
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
- Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering (arXiv, 2020-04-24)
We propose an unsupervised approach to training QA models with generated pseudo-training data.
We show that generating questions for QA training by applying a simple template to a related, retrieved sentence, rather than to the original context sentence, improves downstream QA performance (a toy template example follows this entry).
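As a toy illustration of the template idea, the cloze-style rewrite below masks the answer span in a retrieved sentence; the template and retrieved sentence shown are illustrative assumptions, not the paper's exact procedure.
```python
# Toy sketch of template-based question generation: mask the answer
# span in a retrieved sentence with a wh-word. The template is an
# illustrative stand-in for the paper's templates.
def template_question(sentence: str, answer: str) -> str:
    """Apply a simple cloze template by masking the answer span."""
    assert answer in sentence, "retrieved sentence must contain the answer"
    return sentence.replace(answer, "what", 1).rstrip(".") + "?"

retrieved = "The Eiffel Tower was completed in 1889."
print(template_question(retrieved, "1889"))
# -> "The Eiffel Tower was completed in what?"
```
One plausible reason the retrieved-sentence variant helps downstream QA is that it reduces trivial lexical overlap between the generated question and the original context.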