RocketQAv2: A Joint Training Method for Dense Passage Retrieval and
Passage Re-ranking
- URL: http://arxiv.org/abs/2110.07367v2
- Date: Sun, 23 Apr 2023 16:56:52 GMT
- Authors: Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua
Wu, Haifeng Wang and Ji-Rong Wen
- Abstract summary: We propose a novel joint training approach for dense passage retrieval and passage re-ranking.
A major contribution is that we introduce the dynamic listwise distillation, where we design a unified listwise training approach for both the retriever and the re-ranker.
During the dynamic distillation, the retriever and the re-ranker can be adaptively improved according to each other's relevance information.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In various natural language processing tasks, passage retrieval and passage
re-ranking are two key procedures in finding and ranking relevant information.
Since both procedures contribute to the final performance, it is important to
optimize them jointly to achieve mutual improvement. In
this paper, we propose a novel joint training approach for dense passage
retrieval and passage re-ranking. A major contribution is that we introduce the
dynamic listwise distillation, where we design a unified listwise training
approach for both the retriever and the re-ranker. During the dynamic
distillation, the retriever and the re-ranker can be adaptively improved
according to each other's relevance information. We also propose a hybrid data
augmentation strategy to construct diverse training instances for the listwise
training approach. Extensive experiments show the effectiveness of our approach
on both MSMARCO and Natural Questions datasets. Our code is available at
https://github.com/PaddlePaddle/RocketQA.
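The core of the dynamic listwise distillation is a KL-divergence loss between the retriever's and the re-ranker's relevance distributions over the same candidate list. Below is a minimal plain-Python sketch of such a listwise loss; the function names and the `temperature` parameter are illustrative assumptions, not taken from the RocketQA codebase.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw relevance scores over one candidate list into a distribution."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_distillation_loss(retriever_scores, reranker_scores, temperature=1.0):
    """KL(reranker || retriever) over a shared candidate passage list.

    The re-ranker's distribution acts as a soft teacher: minimizing this
    loss pulls the retriever's ranking toward the re-ranker's, while the
    re-ranker can in turn be trained on the retriever's candidates.
    """
    teacher = softmax(reranker_scores, temperature)
    student = softmax(retriever_scores, temperature)
    return sum(p * math.log(p / q) for p, q in zip(teacher, student))
```

When both models score the candidates identically the loss is zero; any disagreement between the induced distributions yields a positive penalty.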
Related papers
- Unsupervised Dense Retrieval with Relevance-Aware Contrastive
Pre-Training
We propose relevance-aware contrastive learning.
We consistently improve the SOTA unsupervised Contriever model on the BEIR and open-domain QA retrieval benchmarks.
Our method not only beats BM25 after further pre-training on the target corpus but also serves as a good few-shot learner.
arXiv Detail & Related papers (2023-06-05T18:20:27Z)
- Hybrid and Collaborative Passage Reranking
We propose a Hybrid and Collaborative Passage Reranking (HybRank) method.
It incorporates the lexical and semantic properties of sparse and dense retrievers for reranking.
Built on off-the-shelf retriever features, HybRank is a plug-in reranker capable of enhancing arbitrary passage lists.
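HybRank itself builds a learned reranker on top of off-the-shelf retriever features; as a rough intuition for why mixing lexical and semantic signals helps, here is a far simpler score-fusion baseline (the min-max normalization and the `alpha` interpolation weight are assumptions for illustration, not HybRank's actual method).

```python
def hybrid_rerank(passages, sparse_scores, dense_scores, alpha=0.5):
    """Re-order passages by a convex combination of normalized sparse
    (lexical, e.g. BM25) and dense (semantic) retriever scores."""
    def minmax(xs):
        lo, hi = min(xs), max(xs)
        if hi == lo:
            return [0.0] * len(xs)
        return [(x - lo) / (hi - lo) for x in xs]

    sparse_n = minmax(sparse_scores)
    dense_n = minmax(dense_scores)
    fused = [alpha * s + (1 - alpha) * d for s, d in zip(sparse_n, dense_n)]
    order = sorted(range(len(passages)), key=fused.__getitem__, reverse=True)
    return [passages[i] for i in order]
```

Setting `alpha` to 1.0 or 0.0 recovers a pure sparse or pure dense ranking, respectively.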
arXiv Detail & Related papers (2023-05-16T09:38:52Z)
- A Memory-Related Multi-Task Method Based on Task-Agnostic Exploration
In contrast to imitation learning, there is no expert data, only the data collected through environmental exploration.
Since the action sequence to solve the new task may be the combination of trajectory segments of multiple training tasks, the test task and the solving strategy do not exist directly in the training data.
We propose a Memory-related Multi-task Method (M3) to address this problem.
arXiv Detail & Related papers (2022-09-09T03:02:49Z)
- Cooperative Retriever and Ranker in Deep Recommenders
Deep recommender systems (DRS) are intensively applied in modern web services.
DRS employs a two-stage workflow, retrieval and ranking, to generate recommendation results.
Effective collaboration between the retriever and the ranker remains to be explored.
arXiv Detail & Related papers (2022-06-28T03:41:50Z)
- Generic resources are what you need: Style transfer tasks without
task-specific parallel training data
Style transfer aims to rewrite a source text in a different target style while preserving its content.
We propose a novel approach to this task that leverages generic resources.
We adopt a multi-step procedure which builds on a generic pre-trained sequence-to-sequence model.
arXiv Detail & Related papers (2021-09-09T20:15:02Z)
- PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense
Passage Retrieval
We propose a novel approach that leverages query-centric and PAssage-centric sImilarity Relations (called PAIR) for dense passage retrieval.
To implement our approach, we make three major technical contributions by introducing formal formulations of the two kinds of similarity relations.
Our approach significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions datasets.
arXiv Detail & Related papers (2021-08-13T02:07:43Z)
- RocketQA: An Optimized Training Approach to Dense Passage Retrieval for
Open-Domain Question Answering
In open-domain question answering, dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers.
We propose an optimized training approach, called RocketQA, to improve dense passage retrieval.
We make three major technical contributions in RocketQA, namely cross-batch negatives, denoised hard negatives and data augmentation.
arXiv Detail & Related papers (2020-10-16T06:54:05Z)
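Of RocketQA's three contributions, cross-batch negatives are the easiest to sketch: each query is trained to rank its gold passage above a pool of negatives, and cross-batch training simply enlarges that pool with passages gathered from other batches (or GPUs). The toy loss below illustrates the underlying softmax-over-similarities objective; the dot-product similarity and function names are illustrative assumptions, not RocketQA's actual code.

```python
import math

def contrastive_loss(query_vec, pos_vec, negative_vecs):
    """Cross-entropy over dot-product similarities: the query should score
    its gold passage higher than every negative in the pool. A larger
    `negative_vecs` list (e.g. passages gathered across GPU batches)
    makes the task harder and the learned embeddings sharper."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    logits = [dot(query_vec, pos_vec)] + [dot(query_vec, n) for n in negative_vecs]
    normalizer = sum(math.exp(l) for l in logits)
    return -math.log(math.exp(logits[0]) / normalizer)
```

The loss shrinks as the query moves toward its gold passage and away from the negatives.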
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.