Adversarial Retriever-Ranker for dense text retrieval
- URL: http://arxiv.org/abs/2110.03611v2
- Date: Fri, 8 Oct 2021 07:29:14 GMT
- Title: Adversarial Retriever-Ranker for dense text retrieval
- Authors: Hang Zhang, Yeyun Gong, Yelong Shen, Jiancheng Lv, Nan Duan, Weizhu
Chen
- Abstract summary: We present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder retriever plus a cross-encoder ranker.
AR2 consistently and significantly outperforms existing dense retriever methods.
This includes improvements on Natural Questions R@5 to 77.9% (+2.1%), TriviaQA R@5 to 78.2% (+1.4%), and MS-MARCO MRR@10 to 39.5% (+1.3%).
- Score: 51.87158529880056
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current dense text retrieval models face two typical challenges. First, they adopt a siamese dual-encoder architecture that encodes query and document independently for fast indexing and searching, but thereby neglects the finer-grained term-wise interactions; this results in sub-optimal recall performance. Second, they rely heavily on a negative sampling technique to build up the negative documents for their contrastive loss. To address these challenges,
we present Adversarial Retriever-Ranker (AR2), which consists of a dual-encoder
retriever plus a cross-encoder ranker. The two models are jointly optimized
according to a minimax adversarial objective: the retriever learns to retrieve
negative documents to cheat the ranker, while the ranker learns to rank a
collection of candidates including both the ground-truth and the retrieved
ones, and to provide progressive direct feedback to the dual-encoder retriever. Through this adversarial game, the retriever gradually produces harder negative documents to train a better ranker, while the cross-encoder ranker provides progressive feedback to improve the retriever. We evaluate AR2 on
three benchmarks. Experimental results show that AR2 consistently and
significantly outperforms existing dense retriever methods and achieves new
state-of-the-art results on all of them. This includes improvements on Natural Questions R@5 to 77.9% (+2.1%), TriviaQA R@5 to 78.2% (+1.4%), and MS-MARCO MRR@10 to 39.5% (+1.3%). We will make our code, models, and data
publicly available.
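To make the training recipe above concrete, here is a minimal, self-contained sketch of an AR2-style step in PyTorch. Everything in it is an illustrative assumption rather than the paper's released code: the bag-of-embeddings `Encoder` and `CrossRanker` stand in for the BERT-scale dual encoder and cross-encoder, the candidate count and optimizers are arbitrary toy choices, and the retriever update is written as distillation of the ranker's candidate distribution, one common reading of the minimax feedback.
```python
# Sketch of an AR2-style adversarial retriever-ranker step (assumptions
# throughout; not the authors' released implementation).
import torch
import torch.nn.functional as F

VOCAB, EMB, N_CAND = 1000, 64, 8  # toy vocabulary/embedding/candidate sizes

class Encoder(torch.nn.Module):
    """Stand-in for a BERT-style text encoder (mean of token embeddings)."""
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(VOCAB, EMB)

    def forward(self, tokens):           # tokens: (n, seq_len) int64
        return self.emb(tokens)          # -> (n, EMB)

class CrossRanker(torch.nn.Module):
    """Stand-in for a cross-encoder: scores each (query, doc) pair jointly."""
    def __init__(self):
        super().__init__()
        self.emb = torch.nn.EmbeddingBag(VOCAB, EMB)
        self.head = torch.nn.Linear(EMB, 1)

    def forward(self, q_tokens, d_tokens):
        joint = torch.cat([q_tokens.expand(d_tokens.size(0), -1), d_tokens], dim=1)
        return self.head(self.emb(joint)).squeeze(-1)   # -> (n_docs,)

q_enc, d_enc, ranker = Encoder(), Encoder(), CrossRanker()
opt_ret = torch.optim.Adam(list(q_enc.parameters()) + list(d_enc.parameters()), lr=1e-3)
opt_rank = torch.optim.Adam(ranker.parameters(), lr=1e-3)

def train_step(q, pos_doc, doc_pool):
    # 1) Retrieval: the dual encoder scores docs independently of the query's
    #    token-level interactions, so doc vectors could be pre-indexed; the
    #    top-scoring pool docs serve as hard negatives.
    with torch.no_grad():
        pool_scores = d_enc(doc_pool) @ q_enc(q).squeeze(0)
        negs = doc_pool[pool_scores.topk(N_CAND - 1).indices]
    cands = torch.cat([pos_doc, negs])   # ground truth sits at index 0

    # 2) Ranker step: rank the ground truth above the retrieved negatives
    #    (a contrastive, cross-entropy objective over the candidate list).
    rank_loss = F.cross_entropy(ranker(q, cands).unsqueeze(0),
                                torch.zeros(1, dtype=torch.long))
    opt_rank.zero_grad(); rank_loss.backward(); opt_rank.step()

    # 3) Retriever step: pull the retriever's candidate distribution toward
    #    the ranker's -- the "progressive feedback". Chasing docs the ranker
    #    still scores highly is what makes the mined negatives adversarial.
    with torch.no_grad():
        soft = F.softmax(ranker(q, cands), dim=-1)
    ret_logits = d_enc(cands) @ q_enc(q).squeeze(0)
    ret_loss = F.kl_div(F.log_softmax(ret_logits, dim=-1), soft, reduction="sum")
    opt_ret.zero_grad(); ret_loss.backward(); opt_ret.step()
    return rank_loss.item(), ret_loss.item()

# Toy usage: random token ids stand in for tokenized text.
q = torch.randint(0, VOCAB, (1, 12))
pos = torch.randint(0, VOCAB, (1, 30))
pool = torch.randint(0, VOCAB, (100, 30))
print(train_step(q, pos, pool))
```
Alternating steps 2 and 3 is the adversarial game: each retriever update re-ranks the pool, so the negatives mined in step 1 become progressively harder.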
Related papers
- Towards Competitive Search Relevance For Inference-Free Learned Sparse Retrievers [6.773411876899064]
Inference-free sparse models lag far behind both sparse and dense siamese models in terms of search relevance.
We propose two different approaches for performance improvement. First, we introduce the IDF-aware FLOPS loss, which incorporates Inverted Document Frequency (IDF) into the sparsification of representations.
We find that it mitigates the negative impact of the FLOPS regularization on search relevance, allowing the model to achieve a better balance between accuracy and efficiency.
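The summary names the mechanism but not its form; below is a hedged sketch of what an IDF-aware FLOPS regularizer could look like. The inverse-IDF weighting is an assumption about the paper's scheme, not a quotation of it.
```python
import torch

def idf_aware_flops(token_weights, idf, eps=1e-6):
    """Sketch of an IDF-aware FLOPS regularizer (assumed form).

    The standard FLOPS loss penalizes sum_j (mean_i w_ij)^2 over the
    vocabulary; dividing each token's penalty by its IDF pushes the
    sparsifier to drop frequent, low-IDF tokens first while sparing
    rare, informative ones.

    token_weights: (batch, vocab) non-negative sparse activations
    idf: (vocab,) inverse document frequencies
    """
    mean_act = token_weights.mean(dim=0)        # average activation per token
    return ((mean_act ** 2) / (idf + eps)).sum()
```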
arXiv Detail & Related papers (2024-11-07T03:46:43Z)
- Boot and Switch: Alternating Distillation for Zero-Shot Dense Retrieval [50.47192086219752]
ABEL is a simple but effective unsupervised method to enhance passage retrieval in zero-shot settings.
By either fine-tuning ABEL on labelled data or integrating it with existing supervised dense retrievers, we achieve state-of-the-art results.
arXiv Detail & Related papers (2023-11-27T06:22:57Z)
- ReFIT: Relevance Feedback from a Reranker during Inference [109.33278799999582]
Retrieve-and-rerank is a prevalent framework in neural information retrieval.
We propose to leverage the reranker to improve recall by having it provide relevance feedback to the retriever at inference time.
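As a rough illustration of that idea, the sketch below nudges a query embedding toward the reranker's judgment at inference time; `refit_query`, the KL objective, and the step count are all assumptions standing in for the paper's exact update.
```python
import torch
import torch.nn.functional as F

def refit_query(q_vec, cand_vecs, reranker_scores, lr=0.1, steps=3):
    """Inference-time relevance feedback in the spirit of ReFIT (a sketch;
    the paper's exact update may differ). The query vector is nudged so the
    retriever's dot-product distribution over the top-k candidates matches
    the reranker's, then reused for a second, higher-recall retrieval pass.

    q_vec: (dim,) query embedding; cand_vecs: (k, dim); reranker_scores: (k,)
    """
    q = q_vec.clone().requires_grad_(True)
    target = F.softmax(reranker_scores, dim=-1)   # reranker's view of the top-k
    opt = torch.optim.SGD([q], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logits = cand_vecs @ q                    # retriever scores for candidates
        F.kl_div(F.log_softmax(logits, dim=-1), target, reduction="sum").backward()
        opt.step()
    return q.detach()
```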
arXiv Detail & Related papers (2023-05-19T15:30:33Z)
- LED: Lexicon-Enlightened Dense Retriever for Large-Scale Retrieval [68.85686621130111]
We propose to align a dense retriever with a well-performing lexicon-aware representation model.
Evaluation on three public benchmarks shows that, with a comparable lexicon-aware retriever as the teacher, our proposed dense model brings consistent and significant improvements.
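A hedged sketch of what such an alignment objective might look like: the dense student's scores over a shared candidate list are distilled toward the lexicon-aware teacher's via a softened KL term (the paper's precise loss may differ).
```python
import torch.nn.functional as F

def lexicon_alignment_loss(dense_scores, lexicon_scores, tau=1.0):
    """Teacher-student alignment for a LED-style setup (assumed form).

    dense_scores, lexicon_scores: (batch, n_candidates) scores over the
    same candidate lists from the dense student and lexicon-aware teacher.
    """
    teacher = F.softmax(lexicon_scores / tau, dim=-1)
    student = F.log_softmax(dense_scores / tau, dim=-1)
    # Softened KL divergence, rescaled by tau^2 as is conventional in
    # temperature-based distillation.
    return F.kl_div(student, teacher, reduction="batchmean") * tau * tau
```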
arXiv Detail & Related papers (2022-08-29T15:09:28Z)
- Towards Robust Ranker for Text Retrieval [83.15191578888188]
A ranker plays an indispensable role in the de facto 'retrieval & rerank' pipeline.
arXiv Detail & Related papers (2022-06-16T10:27:46Z)
- LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval [55.097573036580066]
Experimental results show that LaPraDoR achieves state-of-the-art performance compared with supervised dense retrieval models.
Compared to re-ranking, our lexicon-enhanced approach can be run in milliseconds (22.5x faster) while achieving superior performance.
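One way to see where the milliseconds-scale latency could come from: if lexicon enhancement amounts to combining the dense similarity with a BM25 score elementwise (an assumed form), the extra cost per query is a single vector product rather than a neural re-ranking pass.
```python
import torch

def lexicon_enhanced_scores(dense_scores, bm25_scores):
    """Sketch of LaPraDoR-style lexicon enhancement (assumed form).

    Instead of running a cross-encoder over the candidates, the dense
    similarity is fused multiplicatively with a precomputed BM25 score,
    so re-scoring a candidate list is one elementwise product.

    dense_scores, bm25_scores: (n_docs,) per-document scores for one query
    """
    return dense_scores * bm25_scores
```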
arXiv Detail & Related papers (2022-03-11T18:53:12Z)