Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
- URL: http://arxiv.org/abs/2007.01282v2
- Date: Wed, 3 Feb 2021 09:18:34 GMT
- Title: Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
- Authors: Gautier Izacard and Edouard Grave
- Abstract summary: Generative models for open domain question answering have proven to be competitive, without resorting to external knowledge.
We investigate how much these models can benefit from retrieving text passages, potentially containing evidence.
We observe that the performance of this method significantly improves when increasing the number of retrieved passages.
- Score: 61.394478670089065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative models for open domain question answering have proven to be
competitive, without resorting to external knowledge. While promising, this
approach requires the use of models with billions of parameters, which are
expensive to train and query. In this paper, we investigate how much these
models can benefit from retrieving text passages, potentially containing
evidence. We obtain state-of-the-art results on the Natural Questions and
TriviaQA open benchmarks. Interestingly, we observe that the performance of
this method significantly improves when increasing the number of retrieved
passages. This is evidence that generative models are good at aggregating and
combining evidence from multiple passages.
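The aggregation works by fusing evidence in the decoder: each retrieved passage is encoded independently together with the question, and the decoder attends over all encoder outputs jointly. A minimal sketch of this recipe, assuming a stock Hugging Face T5 checkpoint ("t5-small", the input formatting, and the generation settings are illustrative stand-ins, not the authors' released code):

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration
from transformers.modeling_outputs import BaseModelOutput

def fid_answer(question, passages, model_name="t5-small"):
    # passages: list of {"title": ..., "text": ...} dicts from a retriever.
    tok = AutoTokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    states = []
    for p in passages:
        ids = tok(
            f"question: {question} title: {p['title']} context: {p['text']}",
            return_tensors="pt", truncation=True,
        ).input_ids
        # Each (question, passage) pair is encoded independently, so the
        # encoder cost grows only linearly with the number of passages.
        states.append(model.encoder(input_ids=ids).last_hidden_state)
    # Concatenating the encoder states is where aggregation happens: the
    # decoder's cross-attention ranges over all passages at once.
    fused = BaseModelOutput(last_hidden_state=torch.cat(states, dim=1))
    out = model.generate(encoder_outputs=fused, max_new_tokens=32)
    return tok.decode(out[0], skip_special_tokens=True)
```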
Related papers
- Exploring Hint Generation Approaches in Open-Domain Question Answering [16.434748534272014]
We introduce a novel context preparation approach called HINTQA.
Unlike traditional methods, HINTQA prompts LLMs to produce hints about potential answers for the question.
We demonstrate that hints improve answer accuracy more than either retrieved or generated contexts do.
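A minimal sketch of the hint-prompting idea; `complete` is a hypothetical LLM text-completion callable, and the prompt wording is an assumption rather than the paper's:

```python
def build_hint_context(question: str, complete, n_hints: int = 5) -> str:
    # `complete` is a hypothetical LLM completion function: str -> str.
    prompt = (
        f"Question: {question}\n"
        f"Write {n_hints} short hints that point toward the answer, "
        "one per line, without stating the answer outright."
    )
    hints = [h.strip() for h in complete(prompt).splitlines() if h.strip()]
    # The hints then play the role that retrieved or generated passages
    # would otherwise play as context for the answering model.
    return "\n".join(hints)
```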
arXiv Detail & Related papers (2024-09-24T13:50:32Z)
- It's All Relative! -- A Synthetic Query Generation Approach for Improving Zero-Shot Relevance Prediction [19.881193965130173]
Large language models (LLMs) have shown promise in their ability to generate synthetic query-document pairs by prompting with as few as 8 demonstrations.
We propose to reduce this prompting burden by generating queries for different relevance labels simultaneously.
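A minimal sketch of generating queries for several relevance labels in one prompt; `complete` is a hypothetical LLM callable, and the label set and output format are assumptions:

```python
def generate_labeled_queries(document: str, complete):
    # `complete` is a hypothetical LLM completion function: str -> str.
    labels = ["highly relevant", "somewhat relevant", "not relevant"]
    prompt = (
        f"Document: {document}\n"
        "For each label below, write one search query with that degree of "
        "relevance to the document, formatted as 'label: query' lines.\n"
        + "\n".join(f"- {label}" for label in labels)
    )
    pairs = {}
    for line in complete(prompt).splitlines():
        if ":" in line:
            label, query = line.split(":", 1)
            pairs[label.strip("- ").strip()] = query.strip()
    # One LLM call yields (query, document, label) triples for every label,
    # instead of one call (with demonstrations) per label.
    return pairs
```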
arXiv Detail & Related papers (2023-11-14T06:16:49Z)
- An Empirical Comparison of LM-based Question and Answer Generation Methods [79.31199020420827]
Question and answer generation (QAG) consists of generating a set of question-answer pairs given a context.
In this paper, we establish baselines with three different QAG methodologies that leverage sequence-to-sequence language model (LM) fine-tuning.
Experiments show that an end-to-end QAG model, which is computationally light at both training and inference times, is generally robust and outperforms other more convoluted approaches.
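A minimal sketch of how training data for such an end-to-end QAG model might be formatted, mapping a context to all of its question-answer pairs in a single seq2seq example (the separator and field names are assumptions):

```python
def to_qag_example(context: str, qa_pairs):
    # One seq2seq training example: context in, all QA pairs out as a
    # single flat target string.
    target = " | ".join(f"question: {q}, answer: {a}" for q, a in qa_pairs)
    return {"source": context, "target": target}

example = to_qag_example(
    "The Eiffel Tower was completed in 1889 in Paris.",
    [("When was the Eiffel Tower completed?", "1889"),
     ("Where is the Eiffel Tower?", "Paris")],
)
# A T5-style model fine-tuned on such examples emits all pairs in one pass,
# which keeps both training and inference computationally light.
```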
arXiv Detail & Related papers (2023-05-26T14:59:53Z)
- AugTriever: Unsupervised Dense Retrieval and Domain Adaptation by Scalable Data Augmentation [44.93777271276723]
We propose two approaches that enable annotation-free and scalable training by creating pseudo query-document pairs.
The query extraction method involves selecting salient spans from the original document to generate pseudo queries.
The transferred query generation method utilizes generation models trained for other NLP tasks, such as summarization, to produce pseudo queries.
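A minimal sketch of the query-extraction idea, using sentence centrality as an assumed stand-in for the paper's salient-span selection criteria:

```python
def extract_pseudo_query(document: str) -> str:
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    def centrality(sentence: str) -> int:
        # Score a sentence by word overlap with the rest of the document;
        # a central sentence is treated as a salient pseudo query.
        words = set(sentence.lower().split())
        return sum(len(words & set(s.lower().split()))
                   for s in sentences if s != sentence)
    best = max(sentences, key=centrality)
    # (best, document) becomes a pseudo query-document training pair.
    return best
```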
arXiv Detail & Related papers (2022-12-17T10:43:25Z)
- Improving Passage Retrieval with Zero-Shot Question Generation [109.11542468380331]
We propose a simple and effective re-ranking method for improving passage retrieval in open question answering.
The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage.
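A minimal sketch of this re-scoring, using a stock seq2seq LM to compute the average log-likelihood of the question given each passage ("t5-small" and the prompt wording are stand-ins for the paper's choices):

```python
import torch
from transformers import AutoTokenizer, T5ForConditionalGeneration

def rerank(question, passages, model_name="t5-small"):
    tok = AutoTokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    q_ids = tok(question, return_tensors="pt").input_ids
    scores = []
    for p in passages:
        ctx = tok(
            f"Passage: {p} Please write a question based on this passage.",
            return_tensors="pt", truncation=True,
        ).input_ids
        with torch.no_grad():
            # With labels set to the question, the model's loss is the mean
            # cross-entropy of the question tokens, i.e. the negative
            # average log-likelihood log P(question | passage).
            loss = model(input_ids=ctx, labels=q_ids).loss
        scores.append(-loss.item())
    # Higher question likelihood => passage ranked earlier.
    ranked = sorted(zip(scores, passages), key=lambda t: t[0], reverse=True)
    return [p for _, p in ranked]
```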
arXiv Detail & Related papers (2022-04-15T14:51:41Z)
- You Only Need One Model for Open-domain Question Answering [26.582284346491686]
Recent work on open-domain question answering consults an external knowledge base through a retriever model.
We propose casting the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture.
We evaluate our model on the Natural Questions and TriviaQA open datasets, where it outperforms the previous state of the art by 1.0 and 0.7 exact-match points, respectively.
arXiv Detail & Related papers (2021-12-14T13:21:11Z)
- Attention-guided Generative Models for Extractive Question Answering [17.476450946279037]
Recently, pretrained generative sequence-to-sequence (seq2seq) models have achieved great success in question answering.
We propose a simple strategy to obtain an extractive answer span from the generative model by leveraging the decoder cross-attention patterns.
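A minimal sketch of turning decoder cross-attention into an extractive span, assuming the `cross_attentions` structure returned by Hugging Face `generate(..., output_attentions=True, return_dict_in_generate=True)`; the averaging scheme and span-length cap are assumptions:

```python
import torch

def extract_span(cross_attentions, input_tokens, max_len=10):
    # `cross_attentions`: one tuple per generated token, each holding
    # per-layer tensors of shape (batch, heads, 1, input_len).
    per_step = [torch.stack(step).mean(dim=(0, 1, 2, 3))
                for step in cross_attentions]
    token_scores = torch.stack(per_step).mean(dim=0)  # (input_len,)
    # Pick the contiguous input span with the highest mean attention.
    best, best_span = float("-inf"), (0, 1)
    for i in range(len(input_tokens)):
        for j in range(i + 1, min(i + max_len, len(input_tokens)) + 1):
            score = token_scores[i:j].mean().item()
            if score > best:
                best, best_span = score, (i, j)
    return input_tokens[best_span[0]:best_span[1]]
```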
arXiv Detail & Related papers (2021-10-12T23:02:35Z)
- UnitedQA: A Hybrid Approach for Open Domain Question Answering [70.54286377610953]
We apply novel techniques to enhance both extractive and generative readers built upon recent pretrained neural language models.
Our approach outperforms previous state-of-the-art models by 3.3 and 2.7 points in exact match on NaturalQuestions and TriviaQA respectively.
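A minimal sketch of one way to hybridize an extractive and a generative reader by linearly combining their answer scores; the weights and normalization are assumptions, not the paper's exact ensembling scheme:

```python
from collections import defaultdict

def combine_answers(extractive, generative, w_ext=0.5, w_gen=0.5):
    # extractive / generative: dicts mapping answer string -> reader score.
    combined = defaultdict(float)
    for answer, score in extractive.items():
        combined[answer.strip().lower()] += w_ext * score
    for answer, score in generative.items():
        combined[answer.strip().lower()] += w_gen * score
    # Scores accumulate across readers, so answers supported by both the
    # extractive and the generative reader rise to the top.
    return max(combined, key=combined.get)
```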
arXiv Detail & Related papers (2021-01-01T06:36:16Z)
- Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering [54.541952928070344]
We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question.
We show that very lightweight QA models can do well at this task, but retrieval-based models are faster still.
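A minimal sketch of the retrieval-based flavor, scoring sentences by cheap lexical overlap with the question (an assumed stand-in for the learned scorers the paper compares):

```python
def select_sentences(question: str, passage: str, k: int = 3):
    q_words = set(question.lower().split())
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    # Score each sentence by overlap with the question; no full QA model is
    # run per sentence, which is what makes this family fast.
    ranked = sorted(
        sentences,
        key=lambda s: len(q_words & set(s.lower().split())),
        reverse=True,
    )
    return ranked[:k]
```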
arXiv Detail & Related papers (2020-09-18T23:39:15Z)
- Generation-Augmented Retrieval for Open-domain Question Answering [134.27768711201202]
We propose Generation-Augmented Retrieval (GAR) for answering open-domain questions.
We show that generating diverse contexts for a query is beneficial as fusing their results consistently yields better retrieval accuracy.
GAR achieves state-of-the-art performance on Natural Questions and TriviaQA datasets under the extractive QA setup when equipped with an extractive reader.
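A minimal sketch of the GAR idea; `generate_context` and `retrieve` are hypothetical components standing in for the paper's generator and retriever, and reciprocal-rank fusion is an assumed fusion rule:

```python
def gar_retrieve(question, generate_context, retrieve, k=100):
    # `generate_context(question, target=...)` and `retrieve(query, k)` are
    # hypothetical stand-ins for the paper's generator and retriever.
    expansions = [
        generate_context(question, target="answer"),
        generate_context(question, target="sentence"),
        generate_context(question, target="title"),
    ]
    fused = {}
    for expansion in expansions:
        for rank, passage in enumerate(retrieve(f"{question} {expansion}", k)):
            # Reciprocal-rank fusion: passages found by several expanded
            # queries accumulate score and float upward.
            fused[passage] = fused.get(passage, 0.0) + 1.0 / (rank + 1)
    return sorted(fused, key=fused.get, reverse=True)[:k]
```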
arXiv Detail & Related papers (2020-09-17T23:08:01Z)