Is Retriever Merely an Approximator of Reader?
- URL: http://arxiv.org/abs/2010.10999v1
- Date: Wed, 21 Oct 2020 13:40:15 GMT
- Title: Is Retriever Merely an Approximator of Reader?
- Authors: Sohee Yang, Minjoon Seo
- Abstract summary: We show that the reader and the retriever are complementary to each other even in terms of accuracy only.
We propose to distill the reader into the retriever so that the retriever absorbs the strength of the reader while keeping its own benefit.
- Score: 27.306407064073177
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The state of the art in open-domain question answering (QA) relies on an
efficient retriever that drastically reduces the search space for the expensive
reader. A rather overlooked question in the community is the relationship
between the retriever and the reader, and in particular, if the whole purpose
of the retriever is just a fast approximation for the reader. Our empirical
evidence indicates that the answer is no, and that the reader and the retriever
are complementary to each other even in terms of accuracy only. We make a
careful conjecture that the architectural constraint of the retriever, which
has been originally intended for enabling approximate search, seems to also
make the model more robust in large-scale search. We then propose to distill
the reader into the retriever so that the retriever absorbs the strength of the
reader while keeping its own benefit. Experimental results show that our method
can enhance the document recall rate as well as the end-to-end QA accuracy of
off-the-shelf retrievers in open-domain QA tasks.
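The distillation idea in the abstract can be sketched as a simple objective: treat the reader's relevance scores over the retrieved passages as a teacher distribution and train the retriever's scores to match it. The sketch below is illustrative only (the function names and the use of KL divergence over per-passage scores are assumptions; the paper's actual objective and score definitions may differ):

```python
import math

def softmax(scores):
    """Convert raw relevance scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(retriever_scores, reader_scores):
    """KL(reader || retriever) over the retrieved passages.

    Minimizing this trains the retriever (student) to mimic the
    reader's (teacher's) relevance distribution, so the retriever
    absorbs the reader's ranking preferences while keeping its own
    fast dual-encoder form.
    """
    teacher = softmax(reader_scores)
    student = softmax(retriever_scores)
    return sum(p * math.log(p / q)
               for p, q in zip(teacher, student) if p > 0)
```

When the retriever already scores passages the way the reader does, the loss is zero; a retriever that inverts the reader's ranking pays a large penalty, which is what pushes the student toward the teacher during training.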
Related papers
- Beyond Relevance: Evaluate and Improve Retrievers on Perspective Awareness [56.42192735214931]
Retrievers are expected not only to rely on the semantic relevance between the documents and the queries but also to recognize the nuanced intents or perspectives behind a user query.
In this work, we study whether retrievers can recognize and respond to different perspectives of the queries.
We show that current retrievers have limited awareness of subtly different perspectives in queries and can also be biased toward certain perspectives.
arXiv Detail & Related papers (2024-05-04T17:10:00Z)
- Bidirectional End-to-End Learning of Retriever-Reader Paradigm for Entity Linking [57.44361768117688]
We propose BEER$2$, a Bidirectional End-to-End training framework for Retriever and Reader.
Through our designed bidirectional end-to-end training, BEER$2$ guides the retriever and the reader to learn from each other, make progress together, and ultimately improve EL performance.
arXiv Detail & Related papers (2023-06-21T13:04:30Z)
- Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) serve as powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, which replaces the previous retrieve-then-read paradigm for retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
- ReFIT: Relevance Feedback from a Reranker during Inference [109.33278799999582]
Retrieve-and-rerank is a prevalent framework in neural information retrieval.
We propose to leverage the reranker to improve recall by making it provide relevance feedback to the retriever at inference time.
arXiv Detail & Related papers (2023-05-19T15:30:33Z)
- End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering [36.80395759543162]
We present an end-to-end differentiable training method for retrieval-augmented open-domain question answering systems.
We model retrieval decisions as latent variables over sets of relevant documents.
Our proposed method outperforms all existing approaches of comparable size by 2-3 exact match points.
arXiv Detail & Related papers (2021-06-09T19:25:37Z)
- Reader-Guided Passage Reranking for Open-Domain Question Answering [103.18340682345533]
We propose a simple and effective passage reranking method, Reader-guIDEd Reranker (Rider).
Rider achieves 10 to 20 point absolute gains in top-1 retrieval accuracy and 1 to 4 point Exact Match (EM) score gains without refining the retriever or reader.
Rider achieves 48.3 EM on the Natural Questions dataset and 66.4 on the TriviaQA dataset when only 1,024 tokens (7.8 passages on average) are used as the reader input.
arXiv Detail & Related papers (2021-01-01T18:54:19Z)
- Distilling Knowledge from Reader to Retriever for Question Answering [16.942581590186343]
We propose a technique to learn retriever models for downstream tasks, inspired by knowledge distillation.
We evaluate our method on question answering, obtaining state-of-the-art results.
arXiv Detail & Related papers (2020-12-08T17:36:34Z)
- Open-Domain Question Answering with Pre-Constructed Question Spaces [70.13619499853756]
Open-domain question answering aims at solving the task of locating the answers to user-generated questions in massive collections of documents.
There are two families of solutions available: retriever-readers, and knowledge-graph-based approaches.
We propose a novel algorithm with a reader-retriever structure that differs from both families.
arXiv Detail & Related papers (2020-06-02T04:31:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.