DoQA -- Accessing Domain-Specific FAQs via Conversational QA
- URL: http://arxiv.org/abs/2005.01328v2
- Date: Mon, 18 May 2020 07:54:27 GMT
- Title: DoQA -- Accessing Domain-Specific FAQs via Conversational QA
- Authors: Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark
Cieliebak, Eneko Agirre
- Abstract summary: We present DoQA, a dataset with 2,437 dialogues and 10,917 QA pairs.
The dialogues are collected from three Stack Exchange sites using the Wizard of Oz method with crowdsourcing.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of this work is to build conversational Question Answering (QA)
interfaces for the large body of domain-specific information available in FAQ
sites. We present DoQA, a dataset with 2,437 dialogues and 10,917 QA pairs. The
dialogues are collected from three Stack Exchange sites using the Wizard of Oz
method with crowdsourcing. Compared to previous work, DoQA comprises
well-defined information needs, leading to more coherent and natural
conversations with fewer factoid questions, and is multi-domain. In addition,
we introduce a more realistic information retrieval (IR) scenario where the system
needs to find the answer in any of the FAQ documents. The results of an
existing, strong, system show that, thanks to transfer learning from a
Wikipedia QA dataset and fine-tuning on a single FAQ domain, it is possible to
build high quality conversational QA systems for FAQs without in-domain
training data. The good results carry over into the more challenging IR
scenario. In both cases, there is still ample room for improvement, as
indicated by the higher human upper bound.
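The IR scenario described above can be illustrated with a toy first stage. The sketch below is not the paper's system (which fine-tunes a neural conversational QA model); it is a minimal bag-of-words retriever over hypothetical FAQ documents, showing only the "find the answer document among all FAQs" step that the IR setting adds before a reader extracts the answer.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercase alphanumeric tokens; a stand-in for a real text analyzer.
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve_faq(question, faq_docs):
    # Rank every FAQ document against the question; the best match is the
    # candidate document a conversational QA reader would then answer from.
    q = Counter(tokenize(question))
    return max(faq_docs, key=lambda d: cosine(q, Counter(tokenize(d))))

# Hypothetical FAQ documents, for illustration only.
faq_docs = [
    "To reset your password, open account settings and click reset.",
    "Refunds are issued within 14 days of cancelling the order.",
]
best = retrieve_faq("How do I get a refund after cancelling?", faq_docs)
```

In the full setting a dense retriever or BM25 would replace the cosine ranking, and a fine-tuned QA model would extract a span from `best` rather than returning the whole document.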
Related papers
- Conversational Tree Search: A New Hybrid Dialog Task [21.697256733634124]
We introduce Conversational Tree Search (CTS) as a new task that bridges the gap between FAQ-style information retrieval and task-oriented dialog.
Our results show that the new architecture combines the positive aspects of both the FAQ and dialog system used in the baseline and achieves higher goal completion.
arXiv Detail & Related papers (2023-03-17T19:50:51Z)
- MFBE: Leveraging Multi-Field Information of FAQs for Efficient Dense Retrieval [1.7403133838762446]
We propose a bi-encoder-based query-FAQ matching model that leverages multiple combinations of FAQ fields.
Our model achieves around 27% and 20% better top-1 accuracy for the FAQ retrieval task on internal and open datasets.
arXiv Detail & Related papers (2023-02-23T12:02:49Z)
- RealTime QA: What's the Answer Right Now? [137.04039209995932]
We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis.
We build strong baseline models upon large pretrained language models, including GPT-3 and T5.
GPT-3 tends to return outdated answers when retrieved documents do not provide sufficient information to find an answer.
arXiv Detail & Related papers (2022-07-27T07:26:01Z)
- Multifaceted Improvements for Conversational Open-Domain Question Answering [54.913313912927045]
We propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA).
First, the proposed KL-divergence-based regularization leads to better question understanding for retrieval and answer reading.
Second, the added post-ranker module pushes more relevant passages to the top placements, to be selected by the reader under a two-aspect constraint.
Third, the well-designed curriculum learning strategy effectively narrows the gap between the golden-passage settings of training and inference, and encourages the reader to find the true answer without golden-passage assistance.
arXiv Detail & Related papers (2022-04-01T07:54:27Z)
- ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers [93.55268936974971]
We describe a Question Answering dataset that contains complex questions with conditional answers.
We call this dataset ConditionalQA.
We show that ConditionalQA is challenging for many of the existing QA models, especially in selecting answer conditions.
arXiv Detail & Related papers (2021-10-13T17:16:46Z)
- Relation-Guided Pre-Training for Open-Domain Question Answering [67.86958978322188]
We propose a Relation-Guided Pre-Training (RGPT-QA) framework to solve complex open-domain questions.
We show that RGPT-QA achieves 2.2%, 2.4%, and 6.3% absolute improvements in Exact Match accuracy on Natural Questions, TriviaQA, and WebQuestions, respectively.
arXiv Detail & Related papers (2021-09-21T17:59:31Z)
- QAConv: Question Answering on Informative Conversations [85.2923607672282]
We focus on informative conversations including business emails, panel discussions, and work channels.
In total, we collect 34,204 QA pairs, including span-based, free-form, and unanswerable questions.
arXiv Detail & Related papers (2021-05-14T15:53:05Z)
- Effective FAQ Retrieval and Question Matching With Unsupervised Knowledge Injection [10.82418428209551]
We propose a contextual language model for retrieving appropriate answers to frequently asked questions.
We also explore capitalizing on domain-specific, topically-relevant relations between words in an unsupervised manner.
We evaluate variants of our approach on a publicly-available Chinese FAQ dataset, and further apply and contextualize it to a large-scale question-matching task.
arXiv Detail & Related papers (2020-10-27T05:03:34Z)
- Towards Data Distillation for End-to-end Spoken Conversational Question Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims at enabling QA systems to model complex dialogue flow given the speech utterances and text corpora.
Our main objective is to build a QA system to deal with conversational questions both in spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.