Multi-Type Conversational Question-Answer Generation with Closed-ended
and Unanswerable Questions
- URL: http://arxiv.org/abs/2210.12979v1
- Date: Mon, 24 Oct 2022 07:01:51 GMT
- Title: Multi-Type Conversational Question-Answer Generation with Closed-ended
and Unanswerable Questions
- Authors: Seonjeong Hwang, Yunsu Kim, Gary Geunbae Lee
- Abstract summary: Conversational question answering (CQA) facilitates an incremental and interactive understanding of a given context.
We introduce a novel method to synthesize data for CQA with various question types, including open-ended, closed-ended, and unanswerable questions.
Across four domains, CQA systems trained on our synthetic data perform comparably to systems trained on human-annotated data.
- Score: 3.6825890616838066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational question answering (CQA) facilitates an incremental and
interactive understanding of a given context, but building a CQA system is
difficult for many domains due to the problem of data scarcity. In this paper,
we introduce a novel method to synthesize data for CQA with various question
types, including open-ended, closed-ended, and unanswerable questions. We
design a different generation flow for each question type and effectively
combine them in a single, shared framework. Moreover, we devise a hierarchical
answerability classification (hierarchical AC) module that improves the quality
of the synthetic data while acquiring unanswerable questions. Manual inspection
shows that synthetic data generated with our framework have characteristics very
similar to those of human-generated conversations. Across four domains, CQA
systems trained on our synthetic data perform comparably to systems trained on
human-annotated data.
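The abstract describes synthetic conversations that mix open-ended, closed-ended, and unanswerable questions. A minimal sketch of what such a record might look like is below; the field names and example content are illustrative assumptions, not the paper's actual data schema.

```python
# Hypothetical sketch of a synthetic CQA record covering the three
# question types named in the abstract. The schema is illustrative,
# not the paper's actual data format.

def make_cqa_record(context, turns):
    """Bundle a context passage with a list of conversational QA turns."""
    return {"context": context, "turns": turns}

record = make_cqa_record(
    context="The Amazon is the largest rainforest on Earth. "
            "It spans nine countries in South America.",
    turns=[
        # Open-ended: answered by a span or free-form text.
        {"type": "open-ended",
         "question": "Where is the Amazon located?",
         "answer": "South America"},
        # Closed-ended: yes/no answer grounded in the context.
        {"type": "closed-ended",
         "question": "Does it span nine countries?",
         "answer": "yes"},
        # Unanswerable: plausible in context but not supported by it.
        {"type": "unanswerable",
         "question": "How many species live there?",
         "answer": None},
    ],
)

print(sorted({t["type"] for t in record["turns"]}))
# → ['closed-ended', 'open-ended', 'unanswerable']
```

A generation framework of this kind would run a separate flow per question type and merge the resulting turns into one conversation, with the answerability module deciding which candidate questions are genuinely unanswerable.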
Related papers
- Unsupervised multiple choices question answering via universal corpus [27.78825771434918]
We propose a novel framework designed to generate synthetic multiple choice question answering (MCQA) data.
We leverage both named entities (NE) and knowledge graphs to discover plausible distractors to form complete synthetic samples.
arXiv Detail & Related papers (2024-02-27T09:10:28Z)
- Diversity Enhanced Narrative Question Generation for Storybooks [4.043005183192124]
We introduce a multi-question generation model (mQG) capable of generating multiple, diverse, and answerable questions.
To validate the answerability of the generated questions, we employ a SQuAD2.0 fine-tuned question answering model.
mQG shows promising results across various evaluation metrics, among strong baselines.
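The answerability check described above, validating generated questions with a SQuAD2.0-style QA model, can be sketched as a simple filter. The `toy_qa_fn` below is a self-contained keyword-matching stand-in for the fine-tuned model, included only so the example runs without downloading a model; in practice the QA model itself would fill this role.

```python
# Sketch of answerability filtering: keep a generated question only if
# the QA model finds support for it in the context. `qa_fn` stands in
# for a SQuAD2.0 fine-tuned model; here a toy keyword-matching stub
# is used so the example is self-contained.

def filter_answerable(context, questions, qa_fn):
    """Return only the questions the QA function judges answerable."""
    return [q for q in questions if qa_fn(context, q)]

def toy_qa_fn(context, question):
    # Hypothetical stand-in: a question counts as answerable if any
    # of its longer content words appears in the context.
    words = {w.strip("?").lower() for w in question.split()}
    return any(len(w) > 3 and w in context.lower() for w in words)

context = "The storybook follows a fox who learns to share."
questions = [
    "Why does the fox learn to share?",   # grounded in the context
    "Where was the author born?",         # not answerable here
]
print(filter_answerable(context, questions, toy_qa_fn))
# → ['Why does the fox learn to share?']
```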
arXiv Detail & Related papers (2023-10-25T08:10:04Z)
- QADYNAMICS: Training Dynamics-Driven Synthetic QA Diagnostic for Zero-Shot Commonsense Question Answering [48.25449258017601]
State-of-the-art approaches fine-tune language models on QA pairs constructed from CommonSense Knowledge Bases.
We propose QADYNAMICS, a training dynamics-driven framework for QA diagnostics and refinement.
arXiv Detail & Related papers (2023-10-17T14:27:34Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- Improving Question Answering with Generation of NQ-like Questions [12.276281998447079]
Question Answering (QA) systems require a large amount of annotated data which is costly and time-consuming to gather.
We propose an algorithm to automatically generate shorter questions resembling day-to-day human communication in the Natural Questions (NQ) dataset from longer trivia questions in Quizbowl (QB) dataset.
arXiv Detail & Related papers (2022-10-12T21:36:20Z)
- Improving Unsupervised Question Answering via Summarization-Informed Question Generation [47.96911338198302]
Question Generation (QG) is the task of generating a plausible question for a <passage, answer> pair.
We make use of freely available news summary data, transforming declarative sentences into appropriate questions using dependency parsing, named entity recognition and semantic role labeling.
The resulting questions are then combined with the original news articles to train an end-to-end neural QG model.
arXiv Detail & Related papers (2021-09-16T13:08:43Z)
- GTM: A Generative Triple-Wise Model for Conversational Question Generation [36.33685095934868]
We propose a generative triple-wise model with hierarchical variations for open-domain conversational question generation (CQG).
Our method significantly improves the quality of questions in terms of fluency, coherence and diversity over competitive baselines.
arXiv Detail & Related papers (2021-06-07T14:07:07Z)
- QAConv: Question Answering on Informative Conversations [85.2923607672282]
We focus on informative conversations including business emails, panel discussions, and work channels.
In total, we collect 34,204 QA pairs, including span-based, free-form, and unanswerable questions.
arXiv Detail & Related papers (2021-05-14T15:53:05Z)
- Towards Data Distillation for End-to-end Spoken Conversational Question Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims at enabling QA systems to model complex dialogue flows given the speech utterances and text corpora.
Our main objective is to build a QA system to deal with conversational questions both in spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts.
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.