MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models
- URL: http://arxiv.org/abs/2005.02507v1
- Date: Tue, 5 May 2020 21:30:16 GMT
- Title: MultiReQA: A Cross-Domain Evaluation for Retrieval Question Answering Models
- Authors: Mandy Guo, Yinfei Yang, Daniel Cer, Qinlan Shen, Noah Constant
- Abstract summary: Retrieval question answering (ReQA) is the task of retrieving a sentence-level answer to a question from an open corpus.
This paper presents MultiReQA, a new multi-domain ReQA evaluation suite composed of eight retrieval QA tasks drawn from publicly available QA datasets.
- Score: 25.398047573530985
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval question answering (ReQA) is the task of retrieving a
sentence-level answer to a question from an open corpus (Ahmad et al., 2019).
This paper presents MultiReQA, a new multi-domain ReQA evaluation suite
composed of eight retrieval QA tasks drawn from publicly available QA
datasets. We provide the first systematic retrieval-based evaluation over
these datasets using two supervised neural models, based on fine-tuning BERT
and USE-QA models respectively, as well as a surprisingly strong information
retrieval baseline, BM25. Five of these tasks contain both training and test
data, while three contain test data only. Performance on the five tasks with
training data shows that while a general model covering all domains is
achievable, the best performance is often obtained by training exclusively on
in-domain data.
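
To make the BM25 baseline concrete, the sketch below retrieves a sentence-level answer from a toy candidate pool. It assumes the third-party rank_bm25 package and purely illustrative data; it is not the paper's evaluation code, where the candidate pool would be every answer sentence in a dataset's corpus.

```python
# Minimal sketch of a BM25 ReQA baseline: score every candidate answer
# sentence against the question and return the top-ranked one.
# Requires the rank_bm25 package (pip install rank-bm25).
from rank_bm25 import BM25Okapi

# Toy candidate pool; in MultiReQA this would be all answer sentences
# from one of the eight QA datasets.
candidate_sentences = [
    "BM25 is a bag-of-words ranking function used in information retrieval.",
    "The Eiffel Tower was completed in 1889.",
    "USE-QA is a dual-encoder model for question answering retrieval.",
]

# BM25 works over tokenized text; whitespace tokenization keeps it simple.
tokenized_corpus = [s.lower().split() for s in candidate_sentences]
bm25 = BM25Okapi(tokenized_corpus)

question = "When was the Eiffel Tower finished?"
scores = bm25.get_scores(question.lower().split())

best = max(range(len(candidate_sentences)), key=lambda i: scores[i])
print(candidate_sentences[best])  # -> the Eiffel Tower sentence
```
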
Related papers
- QUADRo: Dataset and Models for QUestion-Answer Database Retrieval [97.84448420852854]
Given a database (DB) of question/answer (q/a) pairs, it is possible to answer a target question by scanning the DB for similar questions.
We build a large-scale DB of 6.3M q/a pairs, using public questions, and design a new system based on neural IR and a q/a pair reranker (the retrieval step is sketched below).
We show that our DB-based approach is competitive with Web-based methods, i.e., a QA system built on top of the BING search engine.
arXiv Detail & Related papers (2023-03-30T00:42:07Z)
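
As a rough illustration of the QUADRo entry above, the sketch below embeds questions with a dual encoder and reuses the answer paired with the most similar stored question. It assumes the sentence-transformers package and the public all-MiniLM-L6-v2 checkpoint; the tiny q/a DB is hypothetical, and the paper's system adds a dedicated q/a pair reranker on top of this kind of first-stage retrieval.

```python
# Question-to-question retrieval over a tiny, hypothetical q/a DB.
# Requires the sentence-transformers package.
from sentence_transformers import SentenceTransformer, util

qa_db = [
    ("What is the capital of France?", "Paris is the capital of France."),
    ("Who wrote Hamlet?", "Hamlet was written by William Shakespeare."),
]

model = SentenceTransformer("all-MiniLM-L6-v2")
db_embeddings = model.encode([q for q, _ in qa_db], convert_to_tensor=True)

target = "Which city is France's capital?"
target_embedding = model.encode(target, convert_to_tensor=True)

# Cosine similarity between the target and every stored question;
# answer the target by reusing the best match's stored answer.
similarities = util.cos_sim(target_embedding, db_embeddings)[0]
best = int(similarities.argmax())
print(qa_db[best][1])
```
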
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training [21.07506671340319]
In this paper, we propose a novel question-answering dataset based on the Common Crawl project.
We extract around 130 million multilingual question-answer pairs, including about 60 million English data points.
With this previously unseen number of natural QA pairs, we pre-train popular language models to show the potential of large-scale in-domain pre-training for the task of question-answering.
arXiv Detail & Related papers (2021-10-14T21:23:01Z)
- Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation [39.22166065525888]
We propose a novel and easy-to-apply data augmentation strategy, namely Bilateral Generation (BiG).
With the augmented dataset, we design a contrastive training objective for learning to rank question answer pairs (a generic version is sketched below).
Experimental results on three benchmark datasets, namely TREC-QA, WikiQA, and ANTIQUE, show that our method significantly improves the performance of ranking models.
arXiv Detail & Related papers (2021-06-21T13:29:43Z)
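
The contrastive ranking objective in the entry above can be made concrete with a generic in-batch formulation, sketched in PyTorch below. This is the standard softmax-over-similarities loss used for retrieval training, not necessarily the exact objective from the BiG paper.

```python
# Generic in-batch contrastive loss for ranking question-answer pairs:
# each question's paired answer is the positive, and the other answers
# in the batch serve as negatives.
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(q_emb, a_emb, temperature=0.05):
    q = F.normalize(q_emb, dim=-1)
    a = F.normalize(a_emb, dim=-1)
    logits = q @ a.t() / temperature   # (batch, batch) similarity matrix
    targets = torch.arange(q.size(0))  # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage: random vectors stand in for encoder outputs.
q = torch.randn(8, 128, requires_grad=True)
a = torch.randn(8, 128, requires_grad=True)
loss = in_batch_contrastive_loss(q, a)
loss.backward()
```

Larger batches supply more negatives per question, which generally sharpens the ranking signal.
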
- Abstractive Query Focused Summarization with Query-Free Resources [60.468323530248945]
In this work, we consider the problem of leveraging only generic summarization resources to build an abstractive QFS system.
We propose Marge, a Masked ROUGE Regression framework built on a novel unified representation for summaries and queries.
Despite learning from minimal supervision, our system achieves state-of-the-art results in the distantly supervised setting.
arXiv Detail & Related papers (2020-12-29T14:39:35Z)
- Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering [54.541952928070344]
We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question.
We show that very lightweight QA models can do well at this task, but retrieval-based models are faster still.
arXiv Detail & Related papers (2020-09-18T23:39:15Z)
- Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs [62.71505254770827]
We propose a hierarchical conditional variational autoencoder (HCVAE) for generating QA pairs given unstructured texts as contexts (a bare-bones conditional VAE is sketched below).
Our model obtains impressive performance gains over all baselines on both tasks, using only a fraction of data for training.
arXiv Detail & Related papers (2020-05-28T08:26:06Z)
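
To unpack the model family named in the HCVAE entry above, here is a bare-bones conditional VAE in PyTorch. The actual HCVAE is hierarchical and operates on text through pretrained encoders; the dense layers and toy dimensions below are placeholder assumptions only.

```python
# Minimal conditional VAE: encode x conditioned on c into a latent z,
# then reconstruct x; the loss is the negative ELBO.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyCVAE(nn.Module):
    def __init__(self, x_dim=32, c_dim=16, z_dim=8):
        super().__init__()
        self.enc = nn.Linear(x_dim + c_dim, 2 * z_dim)  # -> mu, logvar
        self.dec = nn.Linear(z_dim + c_dim, x_dim)

    def forward(self, x, c):
        mu, logvar = self.enc(torch.cat([x, c], dim=-1)).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        recon = self.dec(torch.cat([z, c], dim=-1))
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
        return F.mse_loss(recon, x) + kl  # Gaussian likelihood + KL term

# Toy usage: random vectors stand in for text features (x) and
# the encoded context (c).
model = ToyCVAE()
loss = model(torch.randn(4, 32), torch.randn(4, 16))
loss.backward()
```
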
- Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, iteratively refining the data in RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)
- Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering [98.48363619128108]
We propose an unsupervised approach to training QA models with generated pseudo-training data.
We show that generating questions for QA training by applying a simple template to a related, retrieved sentence, rather than the original context sentence, improves downstream QA performance (a toy template is sketched below).
arXiv Detail & Related papers (2020-04-24T17:57:45Z)
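
A toy version of the template idea from the entry above: mask a heuristic answer span in a retrieved sentence and wrap the result in a fixed question template. Real systems pick spans with NER or chunking; the capitalized-phrase heuristic and the single template here are purely illustrative.

```python
# Turn a retrieved sentence into a (question, answer) pseudo-training
# pair by masking a heuristic answer span with "what".
import re

def template_question(sentence):
    # Heuristic answer span: the last multi-word capitalized phrase.
    matches = re.findall(r"(?:[A-Z][a-z]+\s?)+", sentence)
    if not matches:
        return None
    answer = matches[-1].strip()
    cloze = sentence.replace(answer, "what", 1)
    return cloze.rstrip(".") + "?", answer

print(template_question("The telephone was invented by Alexander Graham Bell."))
# ('The telephone was invented by what?', 'Alexander Graham Bell')
```
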
- What do Models Learn from Question Answering Datasets? [2.28438857884398]
We investigate whether models are learning reading comprehension from question answering datasets.
We evaluate models on their generalizability to out-of-domain examples, responses to missing or incorrect data, and ability to handle question variations.
We make recommendations for building future QA datasets that better evaluate the task of question answering through reading comprehension.
arXiv Detail & Related papers (2020-04-07T15:41:55Z)