TopiOCQA: Open-domain Conversational Question Answering with Topic
Switching
- URL: http://arxiv.org/abs/2110.00768v1
- Date: Sat, 2 Oct 2021 09:53:48 GMT
- Title: TopiOCQA: Open-domain Conversational Question Answering with Topic
Switching
- Authors: Vaibhav Adlakha, Shehzaad Dhuliawala, Kaheer Suleman, Harm de Vries,
Siva Reddy
- Abstract summary: We introduce TopiOCQA, an open-domain conversational dataset with topic switches on Wikipedia.
TopiOCQA contains 3,920 conversations with information-seeking questions and free-form answers.
We evaluate several baselines, by combining state-of-the-art document retrieval methods with neural reader models.
- Score: 11.717296856448566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In a conversational question answering scenario, a questioner seeks to
extract information about a topic through a series of interdependent questions
and answers. As the conversation progresses, they may switch to related topics,
a phenomenon commonly observed in information-seeking search sessions. However,
current datasets for conversational question answering are limiting in two
ways: 1) they do not contain topic switches; and 2) they assume the reference
text for the conversation is given, i.e., the setting is not open-domain. We
introduce TopiOCQA (pronounced Tapioca), an open-domain conversational dataset
with topic switches on Wikipedia. TopiOCQA contains 3,920 conversations with
information-seeking questions and free-form answers. TopiOCQA poses a
challenging test-bed for models, where efficient retrieval is required on
multiple turns of the same conversation, in conjunction with constructing valid
responses using conversational history. We evaluate several baselines, by
combining state-of-the-art document retrieval methods with neural reader
models. Our best model achieves an F1 of 51.9 and a BLEU score of 42.1, which fall
short of human performance by 18.3 points and 17.6 points respectively,
indicating the difficulty of our dataset. Our dataset and code will be
available at https://mcgill-nlp.github.io/topiocqa
Related papers
- PCoQA: Persian Conversational Question Answering Dataset [12.07607688189035]
The PCoQA dataset is a resource comprising information-seeking dialogs encompassing a total of 9,026 contextually-driven questions.
PCoQA is designed to present novel challenges compared to previous question answering datasets.
This paper not only presents the comprehensive PCoQA dataset but also reports the performance of various benchmark models.
arXiv Detail & Related papers (2023-12-07T15:29:34Z)
- Conversational QA Dataset Generation with Answer Revision [2.5838973036257458]
We introduce a novel framework that extracts question-worthy phrases from a passage and then generates corresponding questions considering previous conversations.
Our framework revises the extracted answers after generating questions so that answers exactly match paired questions.
arXiv Detail & Related papers (2022-09-23T04:05:38Z)
- An Answer Verbalization Dataset for Conversational Question Answerings
over Knowledge Graphs [9.979689965471428]
This paper contributes to the state-of-the-art by extending an existing ConvQA dataset with verbalized answers.
We perform experiments with five sequence-to-sequence models on generating answer responses while maintaining grammatical correctness.
arXiv Detail & Related papers (2022-08-13T21:21:28Z)
- Multifaceted Improvements for Conversational Open-Domain Question
Answering [54.913313912927045]
We propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA).
Firstly, the proposed KL-divergence based regularization is able to lead to a better question understanding for retrieval and answer reading.
Second, the added post-ranker module pushes more relevant passages to the top positions, so that they are selected for the reader under a two-aspect constraint.
Third, the well-designed curriculum learning strategy effectively narrows the gap between the golden passage settings of training and inference, and encourages the reader to find the true answer without the assistance of the golden passage.
arXiv Detail & Related papers (2022-04-01T07:54:27Z)
- ConditionalQA: A Complex Reading Comprehension Dataset with Conditional
Answers [93.55268936974971]
We describe a Question Answering dataset that contains complex questions with conditional answers.
We call this dataset ConditionalQA.
We show that ConditionalQA is challenging for many of the existing QA models, especially in selecting answer conditions.
arXiv Detail & Related papers (2021-10-13T17:16:46Z)
- QAConv: Question Answering on Informative Conversations [85.2923607672282]
We focus on informative conversations including business emails, panel discussions, and work channels.
In total, we collect 34,204 QA pairs, including span-based, free-form, and unanswerable questions.
arXiv Detail & Related papers (2021-05-14T15:53:05Z)
- ParaQA: A Question Answering Dataset with Paraphrase Responses for
Single-Turn Conversation [5.087932295628364]
ParaQA is a dataset with multiple paraphrased responses for single-turn conversation over knowledge graphs (KG).
The dataset was created using a semi-automated framework that generates diverse paraphrases of the answers using techniques such as back-translation.
arXiv Detail & Related papers (2021-03-13T18:53:07Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
- Towards Data Distillation for End-to-end Spoken Conversational Question
Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims at enabling QA systems to model complex dialogue flows given the speech utterances and text corpora.
Our main objective is to build a QA system to deal with conversational questions both in spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
- Inquisitive Question Generation for High Level Text Comprehension [60.21497846332531]
We introduce INQUISITIVE, a dataset of 19K questions that are elicited while a person is reading through a document.
We show that readers engage in a series of pragmatic strategies to seek information.
We evaluate question generation models based on GPT-2 and show that our model is able to generate reasonable questions.
arXiv Detail & Related papers (2020-10-04T19:03:39Z)