End-to-End Training of Neural Retrievers for Open-Domain Question
Answering
- URL: http://arxiv.org/abs/2101.00408v1
- Date: Sat, 2 Jan 2021 09:05:34 GMT
- Title: End-to-End Training of Neural Retrievers for Open-Domain Question
Answering
- Authors: Devendra Singh Sachan and Mostofa Patwary and Mohammad Shoeybi and
Neel Kant and Wei Ping and William L Hamilton and Bryan Catanzaro
- Abstract summary: It remains unclear how unsupervised and supervised methods can be used most effectively for neural retrievers.
We propose an approach of unsupervised pre-training with the Inverse Cloze Task and masked salient spans.
We also explore two approaches for end-to-end supervised training of the reader and retriever components in OpenQA models.
- Score: 32.747113232867825
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on training neural retrievers for open-domain question answering
(OpenQA) has employed both supervised and unsupervised approaches. However, it
remains unclear how unsupervised and supervised methods can be used most
effectively for neural retrievers. In this work, we systematically study
retriever pre-training. We first propose an approach of unsupervised
pre-training with the Inverse Cloze Task and masked salient spans, followed by
supervised finetuning using question-context pairs. This approach leads to
absolute gains of 2+ points over the previous best result in the top-20
retrieval accuracy on Natural Questions and TriviaQA datasets.
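For concreteness, the supervised finetuning stage described here can be read as DPR-style dual-encoder training on question-context pairs. Below is a minimal sketch assuming dot-product similarity and in-batch negatives; the tensors stand in for BERT question/context encodings, and all names and dimensions are illustrative rather than the authors' exact setup.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(question_emb, context_emb):
    # question_emb, context_emb: [batch, dim]; row i of each is a gold question-context pair
    scores = question_emb @ context_emb.T        # [batch, batch] dot-product similarities
    targets = torch.arange(scores.size(0))       # the gold context sits on the diagonal
    return F.cross_entropy(scores, targets)      # other contexts in the batch act as negatives

# Illustrative usage: random tensors standing in for encoder outputs
q = torch.randn(8, 768)
c = torch.randn(8, 768)
loss = in_batch_contrastive_loss(q, c)
```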
We also explore two approaches for end-to-end supervised training of the
reader and retriever components in OpenQA models. In the first approach, the
reader considers each retrieved document separately while in the second
approach, the reader considers all the retrieved documents together. Our
experiments demonstrate the effectiveness of these approaches as we obtain new
state-of-the-art results. On the Natural Questions dataset, we obtain a top-20
retrieval accuracy of 84, an improvement of 5 points over the recent DPR model.
In addition, we achieve good results on answer extraction, outperforming recent
models like REALM and RAG by 3+ points. We further scale up end-to-end training
to large models and show consistent gains in performance over smaller models.
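As a rough illustration of the first end-to-end approach (the reader scoring each retrieved document separately), one common formulation marginalizes the answer likelihood over the top-k documents using the retriever's softmax-normalized scores, so gradients flow back to the retriever through the document distribution. The sketch below assumes that formulation; the shapes and names are illustrative, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def marginal_log_likelihood(retriever_scores, reader_log_probs):
    # retriever_scores: [batch, k] raw similarity scores for the top-k documents
    # reader_log_probs: [batch, k] log P(answer | question, document_i) from the reader
    doc_log_probs = F.log_softmax(retriever_scores, dim=-1)   # log P(document_i | question)
    # log sum_i P(document_i | question) * P(answer | question, document_i)
    return torch.logsumexp(doc_log_probs + reader_log_probs, dim=-1)

# Illustrative usage: 4 questions, 5 retrieved documents each
scores = torch.randn(4, 5, requires_grad=True)
reader = torch.randn(4, 5).log_softmax(dim=-1)
loss = -marginal_log_likelihood(scores, reader).mean()
loss.backward()   # gradients reach the retriever scores, enabling end-to-end training
```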
Related papers
- Noisy Self-Training with Synthetic Queries for Dense Retrieval [49.49928764695172]
We introduce a novel noisy self-training framework combined with synthetic queries.
Experimental results show that our method improves consistently over existing methods.
Our method is data efficient and outperforms competitive baselines.
arXiv Detail & Related papers (2023-11-27T06:19:50Z) - Retrieval as Attention: End-to-end Learning of Retrieval and Reading
within a Single Transformer [80.50327229467993]
We show that a single model trained end-to-end can achieve both competitive retrieval and QA performance.
We show that end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings.
arXiv Detail & Related papers (2022-12-05T04:51:21Z) - Neural Retriever and Go Beyond: A Thesis Proposal [1.082365064737981]
Information Retrieval (IR) aims to find the documents relevant to a given query at large scale.
Recent neural-based algorithms (termed neural retrievers) have gained attention because they can mitigate the limitations of traditional methods.
arXiv Detail & Related papers (2022-05-31T17:59:30Z) - Improving Passage Retrieval with Zero-Shot Question Generation [109.11542468380331]
We propose a simple and effective re-ranking method for improving passage retrieval in open question answering.
The re-ranker re-scores retrieved passages with a zero-shot question generation model, which uses a pre-trained language model to compute the probability of the input question conditioned on a retrieved passage (a minimal sketch of this scoring appears after the list).
arXiv Detail & Related papers (2022-04-15T14:51:41Z) - Learning to Retrieve Passages without Supervision [58.31911597824848]
Dense retrievers for open-domain question answering (ODQA) have been shown to achieve impressive performance by training on large datasets of question-passage pairs.
We investigate whether dense retrievers can be learned in a self-supervised fashion, and applied effectively without any annotations.
arXiv Detail & Related papers (2021-12-14T19:18:08Z) - End-to-End Training of Multi-Document Reader and Retriever for
Open-Domain Question Answering [36.80395759543162]
We present an end-to-end differentiable training method for retrieval-augmented open-domain question answering systems.
We model retrieval decisions as latent variables over sets of relevant documents.
Our proposed method outperforms all existing approaches of comparable size by 2-3% exact match points.
arXiv Detail & Related papers (2021-06-09T19:25:37Z) - RewardsOfSum: Exploring Reinforcement Learning Rewards for Summarisation [7.0471949371778795]
We propose two reward functions for the task of abstractive summarisation.
The first function, referred to as RwB-Hinge, dynamically selects the samples for the gradient update.
The second function, nicknamed RISK, leverages a small pool of strong candidates to inform the reward.
arXiv Detail & Related papers (2021-06-08T03:30:50Z) - UnitedQA: A Hybrid Approach for Open Domain Question Answering [70.54286377610953]
We apply novel techniques to enhance both extractive and generative readers built upon recent pretrained neural language models.
Our approach outperforms previous state-of-the-art models by 3.3 and 2.7 points in exact match on NaturalQuestions and TriviaQA respectively.
arXiv Detail & Related papers (2021-01-01T06:36:16Z) - Harvesting and Refining Question-Answer Pairs for Unsupervised QA [95.9105154311491]
We introduce two approaches to improve unsupervised Question Answering (QA).
First, we harvest lexically and syntactically divergent questions from Wikipedia to automatically construct a corpus of question-answer pairs (named RefQA).
Second, we take advantage of the QA model to extract more appropriate answers, iteratively refining the data in RefQA.
arXiv Detail & Related papers (2020-05-06T15:56:06Z)
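To make the zero-shot question-generation re-ranker above (Improving Passage Retrieval with Zero-Shot Question Generation) concrete, here is a minimal sketch that scores each retrieved passage by the average log-likelihood of the question conditioned on the passage under a pretrained seq2seq language model. The model choice, prompt, and helper names are assumptions for illustration, not the paper's exact setup.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Illustrative model choice; the paper's exact model and prompt may differ.
tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-base").eval()

def rerank_score(question: str, passage: str) -> float:
    """Score a passage by the average log-likelihood of the question given the passage."""
    inputs = tokenizer("passage: " + passage, return_tensors="pt", truncation=True)
    labels = tokenizer(question, return_tensors="pt", truncation=True).input_ids
    with torch.no_grad():
        out = model(**inputs, labels=labels)
    # out.loss is the mean token-level cross-entropy of the question tokens,
    # so its negation approximates the average log P(question | passage).
    return -out.loss.item()

# Re-rank retrieved passages by descending question likelihood
passages = ["Paris is the capital of France.", "The Nile is a river in Africa."]
ranked = sorted(passages,
                key=lambda p: rerank_score("What is the capital of France?", p),
                reverse=True)
```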