Open Domain Question Answering over Tables via Dense Retrieval
- URL: http://arxiv.org/abs/2103.12011v1
- Date: Mon, 22 Mar 2021 17:01:04 GMT
- Title: Open Domain Question Answering over Tables via Dense Retrieval
- Authors: Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
- Abstract summary: We present an effective pre-training procedure for our retriever and improve retrieval quality with mined hard negatives.
We find that our retriever improves retrieval results from 72.0 to 81.1 recall@10 and end-to-end QA results from 33.8 to 37.7 exact match, over a BERT-based retriever.
- Score: 8.963951462217421
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in open-domain QA have led to strong models based on dense
retrieval, but only focused on retrieving textual passages. In this work, we
tackle open-domain QA over tables for the first time, and show that retrieval
can be improved by a retriever designed to handle tabular context. We present
an effective pre-training procedure for our retriever and improve retrieval
quality with mined hard negatives. As relevant datasets are missing, we extract
a subset of Natural Questions (Kwiatkowski et al., 2019) into a Table QA
dataset. We find that our retriever improves retrieval results from 72.0 to
81.1 recall@10 and end-to-end QA results from 33.8 to 37.7 exact match, over a
BERT based retriever.
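As a rough illustration of the evaluation described in the abstract, the sketch below scores questions against linearized tables with a dual-encoder-style inner product and computes recall@10; the helper names and the naive linearization are illustrative assumptions, not the paper's implementation.

import numpy as np

def linearize_table(table):
    # Flatten {"header": [...], "rows": [[...], ...]} into one string --
    # a crude stand-in for the table-aware encoding the paper's retriever uses.
    cells = [" | ".join(table["header"])]
    cells += [" | ".join(map(str, row)) for row in table["rows"]]
    return " ; ".join(cells)

def recall_at_k(question_embs, table_embs, gold_table_ids, k=10):
    # question_embs: (n_questions, d); table_embs: (n_tables, d);
    # gold_table_ids[i] is the index of the table that answers question i.
    scores = question_embs @ table_embs.T        # inner-product similarity
    top_k = np.argsort(-scores, axis=1)[:, :k]   # k highest-scoring tables per question
    hits = [gold in ranked for gold, ranked in zip(gold_table_ids, top_k)]
    return sum(hits) / len(hits)

Embedding the questions and linearized tables with a table-aware retriever and with a plain BERT baseline, then comparing recall_at_k(..., k=10) for the two, reproduces the kind of 72.0 vs. 81.1 comparison reported above.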
Related papers
- BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval [54.54576644403115]
Many complex real-world queries require in-depth reasoning to identify relevant documents.
We introduce BRIGHT, the first text retrieval benchmark that requires intensive reasoning to retrieve relevant documents.
Our dataset consists of 1,384 real-world queries spanning diverse domains, such as economics, psychology, mathematics, and coding.
arXiv Detail & Related papers (2024-07-16T17:58:27Z) - KET-QA: A Dataset for Knowledge Enhanced Table Question Answering [63.56707527868466]
We propose to use a knowledge base (KB) as the external knowledge source for TableQA.
Every question requires the integration of information from both the table and the sub-graph to be answered.
We design a retriever-reasoner structured pipeline model to extract pertinent information from the vast knowledge sub-graph.
arXiv Detail & Related papers (2024-05-13T18:26:32Z) - Denoising Table-Text Retrieval for Open-Domain Question Answering [6.711626456283439]
In table-text open-domain question answering, a retriever system retrieves relevant evidence from tables and text to answer questions.
Previous studies face common challenges; among them, their retrievers can be affected by false-positive labels in training datasets.
We propose Denoised Table-Text Retriever (DoTTeR) to overcome these issues.
arXiv Detail & Related papers (2024-03-26T11:44:49Z) - Ask Optimal Questions: Aligning Large Language Models with Retriever's
Preference in Conversational Search [25.16282868262589]
RetPO is designed to optimize a language model (LM) for reformulating search queries in line with the preferences of the target retrieval systems.
We construct a large-scale dataset, Retrievers' Feedback, covering over 410K query rewrites across 12K conversations.
The resulting model achieves state-of-the-art performance on two recent conversational search benchmarks.
arXiv Detail & Related papers (2024-02-19T04:41:31Z) - MURRE: Multi-Hop Table Retrieval with Removal for Open-Domain Text-to-SQL [51.48239006107272]
Multi-hop table retrieval with removal (MURRE) removes previously retrieved information from the question to guide towards unretrieved relevant tables.
Experiments on two open-domain text-to-SQL datasets demonstrate an average improvement of 5.7% over the previous state-of-the-art results.
arXiv Detail & Related papers (2024-02-16T13:14:35Z) - CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement
Learning [16.470428531658232]
We develop a query rewriting model CONQRR that rewrites a conversational question in context into a standalone question.
We show that CONQRR achieves state-of-the-art results on a recent open-domain CQA dataset.
arXiv Detail & Related papers (2021-12-16T01:40:30Z) - Learning to Retrieve Passages without Supervision [58.31911597824848]
Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive performance by training on large datasets of question-passage pairs.
We investigate whether dense retrievers can be learned in a self-supervised fashion, and applied effectively without any annotations.
arXiv Detail & Related papers (2021-12-14T19:18:08Z) - Relation-Guided Pre-Training for Open-Domain Question Answering [67.86958978322188]
We propose a Relation-Guided Pre-Training (RGPT-QA) framework to solve complex open-domain questions.
We show that RGPT-QA achieves 2.2%, 2.4%, and 6.3% absolute improvements in Exact Match accuracy on Natural Questions, TriviaQA, and WebQuestions, respectively.
arXiv Detail & Related papers (2021-09-21T17:59:31Z) - Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z) - Relevance-guided Supervision for OpenQA with ColBERT [27.599190047511033]
ColBERT-QA adapts the scalable neural retrieval model ColBERT to OpenQA.
ColBERT creates fine-grained interactions between questions and passages.
This greatly improves OpenQA retrieval on Natural Questions, SQuAD, and TriviaQA.
arXiv Detail & Related papers (2020-07-01T23:50:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.