RocketQAv2: A Joint Training Method for Dense Passage Retrieval and
Passage Re-ranking
- URL: http://arxiv.org/abs/2110.07367v2
- Date: Sun, 23 Apr 2023 16:56:52 GMT
- Authors: Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua
Wu, Haifeng Wang and Ji-Rong Wen
- Abstract summary: We propose a novel joint training approach for dense passage retrieval and passage re-ranking.
A major contribution is that we introduce the dynamic listwise distillation, where we design a unified listwise training approach for both the retriever and the re-ranker.
During the dynamic distillation, the retriever and the re-ranker can be adaptively improved according to each other's relevance information.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In various natural language processing tasks, passage retrieval and passage
re-ranking are two key procedures in finding and ranking relevant information.
Since both procedures contribute to the final performance, it is important to
optimize them jointly to achieve mutual improvement. In
this paper, we propose a novel joint training approach for dense passage
retrieval and passage re-ranking. A major contribution is that we introduce the
dynamic listwise distillation, where we design a unified listwise training
approach for both the retriever and the re-ranker. During the dynamic
distillation, the retriever and the re-ranker can be adaptively improved
according to each other's relevance information. We also propose a hybrid data
augmentation strategy to construct diverse training instances for the listwise
training approach. Extensive experiments show the effectiveness of our approach
on both MSMARCO and Natural Questions datasets. Our code is available at
https://github.com/PaddlePaddle/RocketQA.
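The core of the dynamic listwise distillation is a KL-divergence loss between the retriever's and the re-ranker's relevance distributions over the same candidate list. Below is a minimal plain-Python sketch of such a listwise loss; the function names and the `temperature` parameter are illustrative assumptions, not taken from the RocketQA codebase.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw relevance scores over one candidate list into a distribution."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listwise_distillation_loss(retriever_scores, reranker_scores, temperature=1.0):
    """KL(reranker || retriever) over a shared candidate passage list.

    The re-ranker's distribution acts as a soft teacher: minimizing this
    loss pulls the retriever's ranking toward the re-ranker's, while the
    re-ranker can in turn be trained on the retriever's candidates.
    """
    teacher = softmax(reranker_scores, temperature)
    student = softmax(retriever_scores, temperature)
    return sum(p * math.log(p / q) for p, q in zip(teacher, student))
```

When both models score the candidates identically the loss is zero; any disagreement between the induced distributions yields a positive penalty.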
Related papers
- Unsupervised Dense Retrieval with Relevance-Aware Contrastive
Pre-Training
We propose relevance-aware contrastive learning.
We consistently improve the SOTA unsupervised Contriever model on the BEIR and open-domain QA retrieval benchmarks.
Our method not only beats BM25 after further pre-training on the target corpus but also serves as a good few-shot learner.
arXiv Detail & Related papers (2023-06-05T18:20:27Z)
- Hybrid and Collaborative Passage Reranking
We propose a Hybrid and Collaborative Passage Reranking (HybRank) method.
It incorporates the lexical and semantic properties of sparse and dense retrievers for reranking.
Built on off-the-shelf retriever features, HybRank is a plug-in reranker capable of enhancing arbitrary passage lists.
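HybRank itself builds a learned reranker on top of off-the-shelf retriever features; as a rough intuition for why mixing lexical and semantic signals helps, here is a far simpler score-fusion baseline (the min-max normalization and the `alpha` interpolation weight are assumptions for illustration, not HybRank's actual method).

```python
def hybrid_rerank(passages, sparse_scores, dense_scores, alpha=0.5):
    """Re-order passages by a convex combination of normalized sparse
    (lexical, e.g. BM25) and dense (semantic) retriever scores."""
    def minmax(xs):
        lo, hi = min(xs), max(xs)
        if hi == lo:
            return [0.0] * len(xs)
        return [(x - lo) / (hi - lo) for x in xs]

    sparse_n = minmax(sparse_scores)
    dense_n = minmax(dense_scores)
    fused = [alpha * s + (1 - alpha) * d for s, d in zip(sparse_n, dense_n)]
    order = sorted(range(len(passages)), key=fused.__getitem__, reverse=True)
    return [passages[i] for i in order]
```

Setting `alpha` to 1.0 or 0.0 recovers a pure sparse or pure dense ranking, respectively.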
arXiv Detail & Related papers (2023-05-16T09:38:52Z)
- A Memory-Related Multi-Task Method Based on Task-Agnostic Exploration
In contrast to imitation learning, there is no expert data, only the data collected through environmental exploration.
Since the action sequence to solve the new task may be the combination of trajectory segments of multiple training tasks, the test task and the solving strategy do not exist directly in the training data.
We propose a Memory-related Multi-task Method (M3) to address this problem.
arXiv Detail & Related papers (2022-09-09T03:02:49Z)
- Cooperative Retriever and Ranker in Deep Recommenders
Deep recommender systems (DRS) are intensively applied in modern web services.
DRS employs a two-stage workflow, retrieval and ranking, to generate recommendation results.
Effective collaboration between the retriever and the ranker remains to be explored.
arXiv Detail & Related papers (2022-06-28T03:41:50Z)
- Generic resources are what you need: Style transfer tasks without
task-specific parallel training data
Style transfer aims to rewrite a source text in a different target style while preserving its content.
We propose a novel approach to this task that leverages generic resources.
We adopt a multi-step procedure which builds on a generic pre-trained sequence-to-sequence model.
arXiv Detail & Related papers (2021-09-09T20:15:02Z)
- PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense
Passage Retrieval
We propose a novel approach that leverages query-centric and PAssage-centric sImilarity Relations (called PAIR) for dense passage retrieval.
To implement our approach, we make three major technical contributions by introducing formal formulations of the two kinds of similarity relations.
Our approach significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions datasets.
arXiv Detail & Related papers (2021-08-13T02:07:43Z)
- RocketQA: An Optimized Training Approach to Dense Passage Retrieval for
Open-Domain Question Answering
In open-domain question answering, dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers.
We propose an optimized training approach, called RocketQA, to improve dense passage retrieval.
We make three major technical contributions in RocketQA, namely cross-batch negatives, denoised hard negatives and data augmentation.
arXiv Detail & Related papers (2020-10-16T06:54:05Z)
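Of RocketQA's three contributions, cross-batch negatives are the easiest to sketch: each query is trained to rank its gold passage above a pool of negatives, and cross-batch training simply enlarges that pool with passages gathered from other batches (or GPUs). The toy loss below illustrates the underlying softmax-over-similarities objective; the dot-product similarity and function names are illustrative assumptions, not RocketQA's actual code.

```python
import math

def contrastive_loss(query_vec, pos_vec, negative_vecs):
    """Cross-entropy over dot-product similarities: the query should score
    its gold passage higher than every negative in the pool. A larger
    `negative_vecs` list (e.g. passages gathered across GPU batches)
    makes the task harder and the learned embeddings sharper."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    logits = [dot(query_vec, pos_vec)] + [dot(query_vec, n) for n in negative_vecs]
    normalizer = sum(math.exp(l) for l in logits)
    return -math.log(math.exp(logits[0]) / normalizer)
```

The loss shrinks as the query moves toward its gold passage and away from the negatives.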
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.