Asking Questions the Human Way: Scalable Question-Answer Generation from
Text Corpus
- URL: http://arxiv.org/abs/2002.00748v2
- Date: Thu, 5 Mar 2020 01:06:21 GMT
- Title: Asking Questions the Human Way: Scalable Question-Answer Generation from
Text Corpus
- Authors: Bang Liu, Haojie Wei, Di Niu, Haolan Chen, Yancheng He
- Abstract summary: We propose Answer-Clue-Style-aware Question Generation (ACS-QG).
It aims at automatically generating high-quality and diverse question-answer pairs from unlabeled text corpus at scale.
We can generate 2.8 million quality-assured question-answer pairs from a million sentences found in Wikipedia.
- Score: 23.676748207014903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to ask questions is important in both human and machine
intelligence. Learning to ask questions helps knowledge acquisition, improves
question-answering and machine reading comprehension tasks, and helps a chatbot
to keep the conversation flowing with a human. Existing question generation
models are ineffective at generating a large amount of high-quality
question-answer pairs from unstructured text, since given an answer and an
input passage, question generation is inherently a one-to-many mapping. In this
paper, we propose Answer-Clue-Style-aware Question Generation (ACS-QG), which
aims at automatically generating high-quality and diverse question-answer pairs
from unlabeled text corpus at scale by imitating the way a human asks
questions. Our system consists of: i) an information extractor, which samples
from the text multiple types of assistive information to guide question
generation; ii) neural question generators, which generate diverse and
controllable questions, leveraging the extracted assistive information; and
iii) a neural quality controller, which removes low-quality generated data
based on text entailment. We compare our question generation models with
existing approaches and resort to voluntary human evaluation to assess the
quality of the generated question-answer pairs. The evaluation results suggest
that our system dramatically outperforms state-of-the-art neural question
generation models in terms of the generation quality, while being scalable in
the meantime. With models trained on a relatively smaller amount of data, we
can generate 2.8 million quality-assured question-answer pairs from a million
sentences found in Wikipedia.
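The three-stage design described in the abstract (extract assistive information, generate questions, filter low-quality pairs) can be sketched as a small pipeline. The Python below is only an illustrative sketch, not the authors' implementation: the names Guidance, QAPair, extract_guidance, generate_question, keep_pair, and generate_qa_pairs are hypothetical, the answer/clue/style fields are inferred from the method's name, and simple rules (a four-digit-year regex, a fixed question template, string checks) stand in for the neural question generators and the entailment-based quality controller.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Guidance:
    """Assistive information sampled from one sentence (hypothetical schema)."""
    answer: str  # span the question should be answered by
    clue: str    # context the question should be built around
    style: str   # question style, e.g. "when"


@dataclass(frozen=True)
class QAPair:
    question: str
    answer: str


def extract_guidance(sentence: str) -> List[Guidance]:
    """Toy extractor: every four-digit number becomes a 'when' answer candidate."""
    guidance = []
    for match in re.finditer(r"\b\d{4}\b", sentence):
        # Use the text before the year as the clue, dropping a trailing "in".
        clue = re.sub(r"\s+in$", "", sentence[:match.start()].strip())
        if clue:  # skip candidates with no usable context
            guidance.append(Guidance(match.group(), clue, "when"))
    return guidance


def generate_question(guidance: Guidance) -> str:
    """Toy generator: a fixed template stands in for a neural model."""
    if guidance.style == "when":
        return f"In which year is this true: {guidance.clue}?"
    return f"What can be said about this: {guidance.clue}?"


def keep_pair(sentence: str, pair: QAPair) -> bool:
    """Toy filter standing in for an entailment check: the answer must come
    from the sentence and must not be leaked by the question itself."""
    return (
        pair.answer in sentence
        and pair.answer not in pair.question
        and pair.question.endswith("?")
    )


def generate_qa_pairs(sentences: Iterable[str]) -> List[QAPair]:
    """Run extract -> generate -> filter over each sentence."""
    pairs = []
    for sentence in sentences:
        for guidance in extract_guidance(sentence):
            pair = QAPair(generate_question(guidance), guidance.answer)
            if keep_pair(sentence, pair):
                pairs.append(pair)
    return pairs


if __name__ == "__main__":
    corpus = ["Marie Curie won the Nobel Prize in 1903.", "No dates here."]
    for pair in generate_qa_pairs(corpus):
        print(pair.question, "->", pair.answer)
```

Because the extractor can return several Guidance items per sentence, one sentence can yield several question-answer pairs, which mirrors the one-to-many nature of question generation noted in the abstract.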
Related papers
- Weakly Supervised Visual Question Answer Generation [2.7605547688813172]
We present a weakly supervised method that synthetically generates question-answer pairs procedurally from visual information and captions.
We perform an exhaustive experimental analysis on VQA dataset and see that our model significantly outperforms SOTA methods on BLEU scores.
arXiv Detail & Related papers (2023-06-11T08:46:42Z)
- Educational Question Generation of Children Storybooks via Question Type Distribution Learning and Event-Centric Summarization [67.1483219601714]
We propose a novel question generation method that first learns the question type distribution of an input story paragraph.
We finetune a pre-trained transformer-based sequence-to-sequence model using silver samples composed by educational question-answer pairs.
Our work indicates the necessity of decomposing question type distribution learning and event-centric summary generation for educational question generation.
arXiv Detail & Related papers (2022-03-27T02:21:19Z)
- MixQG: Neural Question Generation with Mixed Answer Types [54.23205265351248]
We propose a neural question generator, MixQG, to bridge this gap.
We combine 9 question answering datasets with diverse answer types, including yes/no, multiple-choice, extractive, and abstractive answers.
Our model outperforms existing work in both seen and unseen domains.
arXiv Detail & Related papers (2021-10-15T16:03:40Z)
- Controllable Open-ended Question Generation with A New Question Type Ontology [6.017006996402699]
We investigate the less-explored task of generating open-ended questions that are typically answered by multiple sentences.
We first define a new question type ontology which differentiates the nuanced nature of questions better than widely used question words.
We then propose a novel question type-aware question generation framework, augmented by a semantic graph representation.
arXiv Detail & Related papers (2021-07-01T00:02:03Z)
- Enhancing Question Generation with Commonsense Knowledge [33.289599417096206]
We propose a multi-task learning framework to introduce commonsense knowledge into the question generation process.
Experimental results on SQuAD show that our proposed methods are able to noticeably improve the QG performance on both automatic and human evaluation metrics.
arXiv Detail & Related papers (2021-06-19T08:58:13Z)
- Understanding Unnatural Questions Improves Reasoning over Text [54.235828149899625]
Complex question answering (CQA) over raw text is a challenging task.
Learning an effective CQA model requires large amounts of human-annotated data.
We address the challenge of learning a high-quality programmer (parser) by projecting natural human-generated questions into unnatural machine-generated questions.
arXiv Detail & Related papers (2020-10-19T10:22:16Z)
- Decoding Methods for Neural Narrative Generation [74.37264021226308]
Narrative generation is an open-ended NLP task in which a model generates a story given a prompt.
We apply and evaluate advances in decoding methods for neural response generation to neural narrative generation.
arXiv Detail & Related papers (2020-10-14T19:32:56Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)
- Reinforced Multi-task Approach for Multi-hop Question Generation [47.15108724294234]
We take up multi-hop question generation, which aims at generating relevant questions based on supporting facts in the context.
We employ multitask learning with the auxiliary task of answer-aware supporting fact prediction to guide the question generator.
We demonstrate the effectiveness of our approach through experiments on the multi-hop question answering dataset, HotPotQA.
arXiv Detail & Related papers (2020-04-05T10:16:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.