RocketQAv2: A Joint Training Method for Dense Passage Retrieval and
Passage Re-ranking
- URL: http://arxiv.org/abs/2110.07367v2
- Date: Sun, 23 Apr 2023 16:56:52 GMT
- Authors: Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua
Wu, Haifeng Wang and Ji-Rong Wen
- Abstract summary: We propose a novel joint training approach for dense passage retrieval and passage re-ranking.
A major contribution is that we introduce the dynamic listwise distillation, where we design a unified listwise training approach for both the retriever and the re-ranker.
During the dynamic distillation, the retriever and the re-ranker can be adaptively improved according to each other's relevance information.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In various natural language processing tasks, passage retrieval and passage
re-ranking are two key procedures in finding and ranking relevant information.
Since both procedures contribute to the final performance, it is important to
jointly optimize them to achieve mutual improvement. In
this paper, we propose a novel joint training approach for dense passage
retrieval and passage re-ranking. A major contribution is that we introduce the
dynamic listwise distillation, where we design a unified listwise training
approach for both the retriever and the re-ranker. During the dynamic
distillation, the retriever and the re-ranker can be adaptively improved
according to each other's relevance information. We also propose a hybrid data
augmentation strategy to construct diverse training instances for the listwise
training approach. Extensive experiments show the effectiveness of our approach
on both MSMARCO and Natural Questions datasets. Our code is available at
https://github.com/PaddlePaddle/RocketQA.
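To make the core idea concrete, here is a minimal PyTorch-style sketch of the dynamic listwise distillation objective. This is an illustration based on the abstract, not the authors' PaddlePaddle implementation; the function name, tensor shapes, and exact loss composition are assumptions.

```python
import torch.nn.functional as F

def listwise_distillation_loss(retriever_scores, reranker_scores):
    """Illustrative sketch: align the retriever's listwise relevance
    distribution with the re-ranker's over the same candidate passages.

    Both inputs are raw scores of shape [batch, num_candidates]
    (e.g., query-passage dot products for the retriever, cross-encoder
    logits for the re-ranker).
    """
    # Softmax over each candidate list turns raw scores into a
    # listwise relevance distribution.
    retriever_log_dist = F.log_softmax(retriever_scores, dim=-1)
    reranker_dist = F.softmax(reranker_scores, dim=-1)
    # KL(re-ranker || retriever): minimizing this lets the two modules
    # adapt to each other's relevance information during joint training.
    return F.kl_div(retriever_log_dist, reranker_dist, reduction="batchmean")
```

In joint training, a distillation term like this is typically combined with a supervised loss; the hybrid data augmentation strategy mentioned above supplies the diverse candidate lists that the listwise objective operates on.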
Related papers
- Birdie: Advancing State Space Models with Reward-Driven Objectives and Curricula
State space models (SSMs) offer advantages over Transformers but struggle with tasks requiring long-range in-context retrieval, such as text copying, associative recall, and question answering over long contexts.
We propose a novel training procedure, Birdie, that significantly enhances the in-context retrieval capabilities of SSMs without altering their architecture.
arXiv Detail & Related papers (2024-11-01T21:01:13Z)
- Improve Dense Passage Retrieval with Entailment Tuning
Key to a retrieval system is calculating relevance scores for query-passage pairs.
We observe that a major class of relevance judgments aligns with the concept of entailment in NLI tasks.
We design a method called entailment tuning to improve the embeddings of dense retrievers.
arXiv Detail & Related papers (2024-10-21T09:18:30Z)
- Unsupervised Dense Retrieval with Relevance-Aware Contrastive Pre-Training
We propose relevance-aware contrastive learning.
We consistently improve the SOTA unsupervised Contriever model on the BEIR and open-domain QA retrieval benchmarks.
Our method not only beats BM25 after further pre-training on the target corpus but also serves as a good few-shot learner.
arXiv Detail & Related papers (2023-06-05T18:20:27Z)
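The entry above does not spell out the objective; as a generic reference point, unsupervised dense-retrieval pre-training is commonly built on an InfoNCE loss like the sketch below. The paper's relevance-aware variant modifies this baseline; the names and temperature value here are illustrative.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(query_emb, passage_emb, temperature=0.05):
    """Standard InfoNCE with in-batch negatives, the usual starting point
    for contrastive pre-training of dense retrievers.

    query_emb, passage_emb: [batch, dim]; row i of passage_emb is the
    positive for row i of query_emb, and all other rows act as negatives.
    Assumes L2-normalized embeddings.
    """
    logits = query_emb @ passage_emb.T / temperature  # [batch, batch]
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)
```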
- Hybrid and Collaborative Passage Reranking
We propose a Hybrid and Collaborative Passage Reranking (HybRank) method.
It incorporates the lexical and semantic properties of sparse and dense retrievers for reranking.
Built on off-the-shelf retriever features, HybRank is a plug-in reranker capable of enhancing arbitrary passage lists.
arXiv Detail & Related papers (2023-05-16T09:38:52Z)
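HybRank itself is a learned reranker built on off-the-shelf retriever features; the underlying intuition of mixing lexical and semantic evidence can be illustrated with a simple score interpolation, a common baseline and not the paper's actual model:

```python
def hybrid_rerank(candidates, sparse_scores, dense_scores, alpha=0.5):
    """Toy linear interpolation of sparse (e.g., BM25) and dense scores.
    Lexical match and semantic similarity tend to make complementary
    errors, which is why combining the two signals helps reranking.
    """
    def min_max(scores):
        lo, hi = min(scores), max(scores)
        return [(s - lo) / (hi - lo + 1e-9) for s in scores]

    s, d = min_max(sparse_scores), min_max(dense_scores)
    mixed = [alpha * a + (1 - alpha) * b for a, b in zip(s, d)]
    ranked = sorted(zip(candidates, mixed), key=lambda t: t[1], reverse=True)
    return [c for c, _ in ranked]
```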
- Cooperative Retriever and Ranker in Deep Recommenders
Deep recommender systems (DRS) are widely deployed in modern web services.
DRS employs a two-stage workflow, retrieval followed by ranking, to generate recommendations.
Effective collaboration between the retriever and the ranker remains underexplored.
arXiv Detail & Related papers (2022-06-28T03:41:50Z)
- Generic resources are what you need: Style transfer tasks without task-specific parallel training data
Style transfer aims to rewrite a source text in a different target style while preserving its content.
We propose a novel approach to this task that leverages generic resources.
We adopt a multi-step procedure which builds on a generic pre-trained sequence-to-sequence model.
arXiv Detail & Related papers (2021-09-09T20:15:02Z)
- PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval
We propose a novel approach that leverages query-centric and PAssage-centric sImilarity Relations (called PAIR) for dense passage retrieval.
To implement our approach, we make three major technical contributions by introducing formal formulations of the two kinds of similarity relations.
Our approach significantly outperforms previous state-of-the-art models on both MSMARCO and Natural Questions datasets.
arXiv Detail & Related papers (2021-08-13T02:07:43Z)
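A hedged sketch of what combining query-centric and passage-centric similarity relations can look like; the exact formulation, weighting, and the logsigmoid form are assumptions for illustration, not necessarily PAIR's loss:

```python
import torch.nn.functional as F

def pair_style_loss(q, p_pos, p_neg, alpha=0.1):
    """q, p_pos, p_neg: [batch, dim] embeddings of the query, a positive
    passage, and a hard negative passage."""
    s_q_pos = (q * p_pos).sum(-1)        # sim(query, positive)
    s_q_neg = (q * p_neg).sum(-1)        # sim(query, negative)
    s_pos_neg = (p_pos * p_neg).sum(-1)  # sim(positive, negative)

    # Query-centric relation: the positive should outscore the negative
    # from the query's point of view.
    loss_qc = -F.logsigmoid(s_q_pos - s_q_neg).mean()
    # Passage-centric relation: the positive passage should be closer to
    # the query than to the negative passage.
    loss_pc = -F.logsigmoid(s_q_pos - s_pos_neg).mean()
    return loss_qc + alpha * loss_pc
```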
- RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering
In open-domain question answering, dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers.
We propose an optimized training approach, called RocketQA, to improve dense passage retrieval.
We make three major technical contributions in RocketQA, namely cross-batch negatives, denoised hard negatives and data augmentation.
arXiv Detail & Related papers (2020-10-16T06:54:05Z)
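Of the three contributions listed above, cross-batch negatives is the most mechanical; a simplified PyTorch-style sketch follows. The official code is in PaddlePaddle, and details such as full gradient flow across devices are elided here.

```python
import torch
import torch.distributed as dist
import torch.nn.functional as F

def cross_batch_negatives_loss(query_emb, passage_emb):
    """Sketch of RocketQA-style cross-batch negatives.

    query_emb, passage_emb: [local_batch, dim]; each query's positive is
    its aligned local passage. Passages are gathered from every device so
    each query is contrasted against negatives from all batches, not just
    its own, greatly enlarging the negative pool.
    """
    if dist.is_initialized():
        gathered = [torch.zeros_like(passage_emb)
                    for _ in range(dist.get_world_size())]
        dist.all_gather(gathered, passage_emb)
        rank = dist.get_rank()
        gathered[rank] = passage_emb  # keep local gradients flowing
        all_passages = torch.cat(gathered, dim=0)
        offset = rank * passage_emb.size(0)
    else:
        all_passages, offset = passage_emb, 0  # falls back to in-batch

    logits = query_emb @ all_passages.T
    labels = torch.arange(query_emb.size(0), device=logits.device) + offset
    return F.cross_entropy(logits, labels)
```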