Relevance-guided Supervision for OpenQA with ColBERT
- URL: http://arxiv.org/abs/2007.00814v2
- Date: Mon, 2 Aug 2021 17:14:01 GMT
- Title: Relevance-guided Supervision for OpenQA with ColBERT
- Authors: Omar Khattab, Christopher Potts, Matei Zaharia
- Abstract summary: ColBERT-QA adapts the scalable neural retrieval model ColBERT to OpenQA.
ColBERT creates fine-grained interactions between questions and passages.
An iterative weak supervision strategy, in which ColBERT creates its own training data, greatly improves OpenQA retrieval on Natural Questions, SQuAD, and TriviaQA.
- Score: 27.599190047511033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Systems for Open-Domain Question Answering (OpenQA) generally depend on a
retriever for finding candidate passages in a large corpus and a reader for
extracting answers from those passages. In much recent work, the retriever is a
learned component that uses coarse-grained vector representations of questions
and passages. We argue that this modeling choice is insufficiently expressive
for dealing with the complexity of natural language questions. To address this,
we define ColBERT-QA, which adapts the scalable neural retrieval model ColBERT
to OpenQA. ColBERT creates fine-grained interactions between questions and
passages. We propose an efficient weak supervision strategy that iteratively
uses ColBERT to create its own training data. This greatly improves OpenQA
retrieval on Natural Questions, SQuAD, and TriviaQA, and the resulting system
attains state-of-the-art extractive OpenQA performance on all three datasets.
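The two ideas in the abstract, fine-grained question-passage interaction and retriever-generated training data, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The scoring function follows ColBERT's late-interaction (MaxSim) form: each question token embedding is matched to its most similar passage token embedding, and the maxima are summed. The labeling function is an assumed simplification of the weak supervision step: retrieved passages that contain the answer string are treated as positives, the rest as negatives. The function names and the toy 2-dimensional embeddings are hypothetical, and passages are assumed to have at least one token vector.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def late_interaction_score(question_vecs, passage_vecs):
    """Sum, over question tokens, of the best cosine match among passage tokens."""
    q = [normalize(v) for v in question_vecs]
    p = [normalize(v) for v in passage_vecs]
    return sum(
        max(sum(a * b for a, b in zip(qv, pv)) for pv in p)
        for qv in q
    )


def weak_labels(retrieved_passages, answer):
    """Assumed heuristic: a retrieved passage is a positive if it contains the answer string."""
    needle = answer.lower()
    positives = [t for t in retrieved_passages if needle in t.lower()]
    negatives = [t for t in retrieved_passages if needle not in t.lower()]
    return positives, negatives


# Toy token embeddings (hypothetical values, one vector per token).
question = [[1.0, 0.0], [0.0, 1.0]]
passage_a = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]  # has a close match for each question token
passage_b = [[1.0, 1.0]]                          # a single vector that matches both only partly

score_a = late_interaction_score(question, passage_a)
score_b = late_interaction_score(question, passage_b)

positives, negatives = weak_labels(
    ["Paris is the capital of France.", "Berlin is in Germany."], "paris"
)
```

In this toy case passage_a scores higher than passage_b because every question token finds its own close match, which a single coarse vector cannot provide. In an iterative setup of the kind the abstract describes, the positives and negatives would be used to train the next retriever, which then produces the next round of labels.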
Related papers
- GSQA: An End-to-End Model for Generative Spoken Question Answering [54.418723701886115]
We introduce the first end-to-end Generative Spoken Question Answering (GSQA) model that empowers the system to engage in abstractive reasoning.
Our model surpasses the previous extractive model by 3% on extractive QA datasets.
Our GSQA model shows the potential to generalize to a broad spectrum of questions, thus further expanding the spoken question answering capabilities of abstractive QA.
arXiv Detail & Related papers (2023-12-15T13:33:18Z)
- Open-Set Knowledge-Based Visual Question Answering with Inference Paths [79.55742631375063]
The purpose of Knowledge-Based Visual Question Answering (KB-VQA) is to provide a correct answer to the question with the aid of external knowledge bases.
We propose a new retriever-ranker paradigm for KB-VQA, Graph pATH rankER (GATHER for brevity).
Specifically, it contains graph constructing, pruning, and path-level ranking, which not only retrieves accurate answers but also provides inference paths that explain the reasoning process.
arXiv Detail & Related papers (2023-10-12T09:12:50Z)
- Multifaceted Improvements for Conversational Open-Domain Question Answering [54.913313912927045]
We propose a framework with Multifaceted Improvements for Conversational open-domain Question Answering (MICQA).
First, the proposed KL-divergence-based regularization leads to better question understanding for retrieval and answer reading.
Second, the added post-ranker module pushes more relevant passages to the top positions, so that they are selected for the reader under a two-aspect constraint.
Third, the well-designed curriculum learning strategy effectively narrows the gap between the golden-passage settings of training and inference, and encourages the reader to find the true answer without golden-passage assistance.
arXiv Detail & Related papers (2022-04-01T07:54:27Z)
- Relation-Guided Pre-Training for Open-Domain Question Answering [67.86958978322188]
We propose a Relation-Guided Pre-Training (RGPT-QA) framework to solve complex open-domain questions.
We show that RGPT-QA achieves 2.2%, 2.4%, and 6.3% absolute improvement in Exact Match accuracy on Natural Questions, TriviaQA, and WebQuestions.
arXiv Detail & Related papers (2021-09-21T17:59:31Z)
- When Retriever-Reader Meets Scenario-Based Multiple-Choice Questions [15.528174963480614]
We propose a joint retriever-reader model called QAVES where the retriever is implicitly supervised only using relevance labels via a novel word weighting mechanism.
QAVES significantly outperforms a variety of strong baselines on multiple-choice questions in three SQA datasets.
arXiv Detail & Related papers (2021-08-31T14:32:04Z)
- UNIQORN: Unified Question Answering over RDF Knowledge Graphs and Natural Language Text [20.1784368017206]
Question answering over RDF data like knowledge graphs has been greatly advanced.
IR and NLP communities have addressed QA over text, but such systems barely utilize semantic data and knowledge.
This paper presents a method for complex questions that can seamlessly operate over a mixture of RDF datasets and text corpora.
arXiv Detail & Related papers (2021-08-19T10:50:52Z)
- ComQA: Compositional Question Answering via Hierarchical Graph Neural Networks [47.12013005600986]
We present a large-scale compositional question answering dataset containing more than 120k human-labeled questions.
To tackle the ComQA problem, we propose a hierarchical graph neural network, which represents the document from the low-level word to the high-level sentence.
Our proposed model achieves a significant improvement over previous machine reading comprehension methods and pre-training methods.
arXiv Detail & Related papers (2021-01-16T08:23:27Z)
- Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
- Fluent Response Generation for Conversational Question Answering [15.826109118064716]
We propose a method for situating responses within a SEQ2SEQ NLG approach to generate fluent grammatical answer responses.
We use data augmentation to generate training data for an end-to-end system.
arXiv Detail & Related papers (2020-05-21T04:57:01Z)
- Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, which iteratively refines data over RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.