End-to-End Training of Multi-Document Reader and Retriever for
Open-Domain Question Answering
- URL: http://arxiv.org/abs/2106.05346v1
- Date: Wed, 9 Jun 2021 19:25:37 GMT
- Title: End-to-End Training of Multi-Document Reader and Retriever for
Open-Domain Question Answering
- Authors: Devendra Singh Sachan and Siva Reddy and William Hamilton and Chris
Dyer and Dani Yogatama
- Abstract summary: We present an end-to-end differentiable training method for retrieval-augmented open-domain question answering systems.
We model retrieval decisions as latent variables over sets of relevant documents.
Our proposed method outperforms all existing approaches of comparable size by 2-3% exact match points.
- Score: 36.80395759543162
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an end-to-end differentiable training method for
retrieval-augmented open-domain question answering systems that combine
information from multiple retrieved documents when generating answers. We model
retrieval decisions as latent variables over sets of relevant documents. Since
marginalizing over sets of retrieved documents is computationally hard, we
approximate this using an expectation-maximization algorithm. We iteratively
estimate the value of our latent variable (the set of relevant documents for a
given question) and then use this estimate to update the retriever and reader
parameters. We hypothesize that such end-to-end training allows training
signals to flow to the reader and then to the retriever better than stage-wise
training. This results in a retriever that is able to select more relevant
documents for a question and a reader that is trained on more accurate
documents to generate an answer. Experiments on three benchmark datasets
demonstrate that our proposed method outperforms all existing approaches of
comparable size by 2-3% absolute exact match points, achieving new
state-of-the-art results. Our results also demonstrate the feasibility of
learning to retrieve to improve answer generation without explicit supervision
of retrieval decisions.
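A minimal sketch of the training loop described above, assuming a bi-encoder retriever, a reader that scores answers per document, and a top-k approximation in place of the full marginalization over document sets. The module names, the per-document factorization of the latent set, and all hyperparameters are illustrative assumptions, not the authors' implementation.
```python
# Hedged sketch of the EM-style end-to-end training described in the abstract.
import torch
import torch.nn.functional as F

class DummyRetriever(torch.nn.Module):
    """Bi-encoder stand-in: embeds questions and documents into a shared space."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.q_emb = torch.nn.EmbeddingBag(vocab, dim)
        self.d_emb = torch.nn.EmbeddingBag(vocab, dim)

    def scores(self, q_ids, doc_ids):
        q = self.q_emb(q_ids)                 # (1, dim)
        d = self.d_emb(doc_ids)               # (n_docs, dim)
        return (q @ d.t()).squeeze(0)         # (n_docs,) relevance scores

class DummyReader(torch.nn.Module):
    """Stand-in reader: scores an answer given the question and one document."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(vocab, dim)
        self.out = torch.nn.Linear(dim, 1)

    def answer_logprob(self, q_ids, doc_ids, ans_ids):
        ctx = self.emb(torch.cat([q_ids, doc_ids, ans_ids], dim=1))
        return F.logsigmoid(self.out(ctx)).squeeze()  # log p(answer | question, doc)

retriever, reader = DummyRetriever(), DummyReader()
opt = torch.optim.Adam(list(retriever.parameters()) + list(reader.parameters()), lr=1e-3)

def train_step(q_ids, corpus_ids, ans_ids, k=4):
    # E-step (approximate): estimate the latent relevant-document set with the
    # retriever's current top-k; no retrieval labels are used.
    scores = retriever.scores(q_ids, corpus_ids)
    topk = torch.topk(scores, k)
    log_prior = F.log_softmax(topk.values, dim=0)   # p(document | question) over top-k

    # M-step: update reader and retriever by marginalizing the answer likelihood
    # over the estimated document set.
    log_lik = torch.stack([
        reader.answer_logprob(q_ids, corpus_ids[i].unsqueeze(0), ans_ids)
        for i in topk.indices
    ])
    loss = -torch.logsumexp(log_prior + log_lik, dim=0)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Toy usage with random token ids standing in for a question, a 32-document
# corpus, and an answer.
q = torch.randint(0, 1000, (1, 8))
corpus = torch.randint(0, 1000, (32, 20))
ans = torch.randint(0, 1000, (1, 4))
print(train_step(q, corpus, ans))
```
Because the marginal log-likelihood depends on the retrieval scores through the softmax prior, the gradient reaches the retriever without any explicit supervision of which documents are relevant, matching the claim at the end of the abstract.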
Related papers
- Improve Dense Passage Retrieval with Entailment Tuning [22.39221206192245]
Key to a retrieval system is calculating relevance scores for query-passage pairs.
We observed that a major class of relevance aligns with the concept of entailment in NLI tasks.
We design a method called entailment tuning to improve the embedding of dense retrievers.
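The summary does not spell out the training objective; as a rough illustration under assumed details, one could fine-tune a bi-encoder so that a passage which entails the query scores higher than in-batch negatives. The contrastive loss, temperature, and use of in-batch negatives below are assumptions, not the paper's recipe.
```python
# Illustrative only: contrastive fine-tuning of a bi-encoder where the
# "positive" passage is one that entails the query.
import torch
import torch.nn.functional as F

def entailment_style_loss(q_vecs, p_vecs, temperature=0.05):
    """q_vecs[i] is a query embedding; p_vecs[i] is a passage that entails it.
    Other passages in the batch act as negatives (InfoNCE)."""
    q = F.normalize(q_vecs, dim=-1)
    p = F.normalize(p_vecs, dim=-1)
    logits = q @ p.t() / temperature        # (B, B) relevance scores
    targets = torch.arange(q.size(0))       # the positive pair is the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with random embeddings standing in for encoder outputs.
loss = entailment_style_loss(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```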
arXiv Detail & Related papers (2024-10-21T09:18:30Z)
- Learning to Retrieve Iteratively for In-Context Learning [56.40100968649039]
Iterative retrieval is a novel framework that empowers retrievers to make iterative decisions through policy optimization.
We instantiate an iterative retriever for composing in-context learning exemplars and apply it to various semantic parsing tasks.
By adding only 4M additional parameters for state encoding, we convert an off-the-shelf dense retriever into a stateful iterative retriever.
arXiv Detail & Related papers (2024-06-20T21:07:55Z)
- Non-Parametric Memory Guidance for Multi-Document Summarization [0.0]
We propose a retriever-guided model combined with non-parametric memory for summary generation.
This model retrieves relevant candidates from a database and then generates the summary from the source documents and the retrieved candidates using a copy mechanism.
Our method is evaluated on the MultiXScience dataset which includes scientific articles.
arXiv Detail & Related papers (2023-11-14T07:41:48Z)
- Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer [80.50327229467993]
We show that a single model trained end-to-end can achieve both competitive retrieval and QA performance.
We show that end-to-end adaptation significantly boosts its performance on out-of-domain datasets in both supervised and unsupervised settings.
arXiv Detail & Related papers (2022-12-05T04:51:21Z)
- Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking [56.80065604034095]
We introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant.
To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario.
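A hedged sketch of the re-ranking idea, assuming precomputed embeddings for the query, the candidate documents, and the user-marked relevant (feedback) documents; the equal-weight linear combination is an illustrative assumption, not the paper's exact integration strategy.
```python
# Illustrative relevance-feedback re-ranking: combine similarity to the query
# with similarity to the documents the user marked as relevant.
import numpy as np

def rerank(query_vec, cand_vecs, feedback_vecs, alpha=0.5):
    def cos(a, b):                           # cosine similarity matrix
        a = a / np.linalg.norm(a, axis=-1, keepdims=True)
        b = b / np.linalg.norm(b, axis=-1, keepdims=True)
        return a @ b.T
    sim_query = cos(cand_vecs, query_vec[None, :]).ravel()   # (n_cands,)
    sim_feedback = cos(cand_vecs, feedback_vecs).mean(axis=1)  # (n_cands,)
    scores = alpha * sim_query + (1 - alpha) * sim_feedback
    return np.argsort(-scores)               # candidate indices, best first

# Toy usage: 100 candidates, 3 feedback documents, 64-dim embeddings.
order = rerank(np.random.randn(64), np.random.randn(100, 64), np.random.randn(3, 64))
print(order[:5])
```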
arXiv Detail & Related papers (2022-10-19T16:19:37Z)
- Generate rather than Retrieve: Large Language Models are Strong Context Generators [74.87021992611672]
We present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators.
We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer.
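A minimal sketch of the generate-then-read pattern described above; `llm` is a hypothetical prompt-to-text callable standing in for whatever large language model is used, and the prompt wording and number of generated documents are assumptions.
```python
# Sketch of generate-then-read: generate context documents, then read them.
def generate_then_read(question, llm, n_docs=3):
    # Step 1: prompt the model to generate contextual documents for the question.
    docs = [
        llm(f"Generate a background document to help answer: {question}")
        for _ in range(n_docs)
    ]
    # Step 2: read the generated documents to produce the final answer.
    context = "\n\n".join(docs)
    return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

# Toy usage with a stub model that just echoes part of its prompt.
print(generate_then_read("Who wrote Hamlet?", llm=lambda p: p[:40] + "..."))
```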
arXiv Detail & Related papers (2022-09-21T01:30:59Z)
- Questions Are All You Need to Train a Dense Passage Retriever [123.13872383489172]
ART is a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data.
It uses a new document-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence documents, and (2) the documents are then used to compute the probability of reconstructing the original question.
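The summary describes the two steps but not the exact objective; one hedged instantiation is to train the retriever so that its distribution over the retrieved documents matches the distribution implied by how well each document reconstructs the question. The KL objective and the detached reconstruction scores below are assumptions about one plausible form, not necessarily ART's implementation.
```python
# Illustrative question-reconstruction objective for label-free retriever training.
import torch
import torch.nn.functional as F

def art_style_loss(retriever_scores, question_logliks):
    """retriever_scores: (k,) scores of the top-k retrieved documents.
    question_logliks: (k,) log p(question | document) from a language model."""
    p_retriever = F.log_softmax(retriever_scores, dim=0)
    p_reconstruct = F.softmax(question_logliks.detach(), dim=0)  # fixed target
    return F.kl_div(p_retriever, p_reconstruct, reduction="sum")

# Toy usage with random scores standing in for real model outputs.
loss = art_style_loss(torch.randn(8, requires_grad=True), torch.randn(8))
loss.backward()
print(loss.item())
```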
arXiv Detail & Related papers (2022-06-21T18:16:31Z)
- CODER: An efficient framework for improving retrieval through COntextualized Document Embedding Reranking [11.635294568328625]
We present a framework for improving the performance of a wide class of retrieval models at minimal computational cost.
It utilizes precomputed document representations extracted by a base dense retrieval method.
It incurs a negligible computational overhead on top of any first-stage method at run time, allowing it to be easily combined with any state-of-the-art dense retrieval method.
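A rough sketch of re-scoring first-stage candidates using only their precomputed embeddings, so the added run-time cost is one small forward pass; the tiny bilinear scorer is an illustrative assumption, not CODER's actual architecture.
```python
# Illustrative lightweight re-ranker over precomputed document embeddings.
import torch

class LightweightReranker(torch.nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.bilinear = torch.nn.Bilinear(dim, dim, 1)

    def forward(self, query_vec, doc_vecs):
        q = query_vec.expand(doc_vecs.size(0), -1)     # broadcast query over docs
        return self.bilinear(q, doc_vecs).squeeze(-1)  # refined relevance scores

reranker = LightweightReranker()
scores = reranker(torch.randn(1, 128), torch.randn(50, 128))  # 50 precomputed docs
print(scores.topk(5).indices)                                  # re-ranked top-5
```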
arXiv Detail & Related papers (2021-12-16T10:25:26Z)
- Weakly Supervised Pre-Training for Multi-Hop Retriever [23.79574380039197]
We propose a new method for weakly supervised multi-hop retriever pre-training without human effort.
Our method includes 1) a pre-training task for generating vector representations of complex questions, 2) a scalable data generation method that produces the nested structure of question and sub-question as weak supervision for pre-training, and 3) a pre-training model structure based on dense encoders.
arXiv Detail & Related papers (2021-06-18T08:06:02Z)
- Distilling Knowledge from Reader to Retriever for Question Answering [16.942581590186343]
We propose a technique to learn retriever models for downstream tasks, inspired by knowledge distillation.
We evaluate our method on question answering, obtaining state-of-the-art results.
arXiv Detail & Related papers (2020-12-08T17:36:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.