P^3 Ranker: Mitigating the Gaps between Pre-training and Ranking
Fine-tuning with Prompt-based Learning and Pre-finetuning
- URL: http://arxiv.org/abs/2205.01886v2
- Date: Thu, 5 May 2022 01:02:04 GMT
- Title: P^3 Ranker: Mitigating the Gaps between Pre-training and Ranking
Fine-tuning with Prompt-based Learning and Pre-finetuning
- Authors: Xiaomeng Hu, Shi Yu, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu, Ge Yu
- Abstract summary: We identify and study the two mismatches between pre-training and ranking fine-tuning.
To mitigate these gaps, we propose the Pre-trained, Prompt-learned and Pre-finetuned Neural Ranker (P^3 Ranker).
Experiments on MS MARCO and Robust04 show the superior performance of P^3 Ranker in few-shot ranking.
- Score: 38.60274348013499
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Compared to other language tasks, applying pre-trained language models (PLMs)
for search ranking often requires more nuances and training signals. In this
paper, we identify and study the two mismatches between pre-training and
ranking fine-tuning: the training schema gap regarding the differences in
training objectives and model architectures, and the task knowledge gap
considering the discrepancy between the knowledge needed in ranking and that
learned during pre-training. To mitigate these gaps, we propose Pre-trained,
Prompt-learned and Pre-finetuned Neural Ranker (P^3 Ranker). P^3 Ranker
leverages prompt-based learning to convert the ranking task into a pre-training
like schema and uses pre-finetuning to initialize the model on intermediate
supervised tasks. Experiments on MS MARCO and Robust04 show the superior
performance of P^3 Ranker in few-shot ranking. Analyses reveal that P^3 Ranker
adapts better to the ranking task through prompt-based learning and retrieves
the necessary ranking-oriented knowledge acquired in pre-finetuning, resulting
in data-efficient PLM adaptation. Our code is available at
https://github.com/NEUIR/P3Ranker.
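The core idea, casting relevance judgment as a cloze-style prediction so that fine-tuning matches the masked-language-model pre-training schema, can be sketched as follows. The template wording and the toy scoring function are illustrative assumptions, not the paper's exact prompts (those are in the linked repository):

```python
import math

def build_prompt(query, doc):
    """Cast a (query, document) pair into a cloze-style prompt so that
    relevance becomes a masked-token prediction, mirroring the MLM
    pre-training schema. Template wording is illustrative."""
    return f"Query: {query} Document: {doc} Relevant: [MASK]"

def rank(query, docs, fill_mask):
    """Order documents by the model's probability of filling [MASK] with
    a positive token (e.g. 'yes'). `fill_mask` is a stand-in for a real
    PLM head returning P(positive token | prompt)."""
    scored = [(fill_mask(build_prompt(query, d)), d) for d in docs]
    return [d for _, d in sorted(scored, reverse=True)]

def toy_fill_mask(prompt):
    """Toy stand-in for a PLM: scores by term overlap, squashed to (0, 1)."""
    body = prompt.split("Query:")[1]
    query, rest = body.split("Document:")
    doc = rest.split("Relevant:")[0]
    overlap = len(set(query.lower().split()) & set(doc.lower().split()))
    return 1.0 / (1.0 + math.exp(-overlap))

print(rank("neural ranking", ["cooking pasta", "neural ranking models"],
           toy_fill_mask))  # → ['neural ranking models', 'cooking pasta']
```

In a real setup, `toy_fill_mask` would be replaced by a pre-trained masked LM scoring the probability of a verbalizer token such as "yes" at the mask position.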
Related papers
- RankOOD - Class Ranking-based Out-of-Distribution Detection [5.447909365133452]
We propose a rank-based Out-of-Distribution (OOD) detection approach based on training a model with the Plackett-Luce loss.
Our approach builds on the insight that a deep learning model trained with the cross-entropy loss induces a ranking pattern over in-distribution (ID) class predictions.
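The Plackett-Luce likelihood referenced above can be sketched as follows (a generic formulation of the loss, not necessarily the paper's exact training setup):

```python
import math

def plackett_luce_nll(scores, ranking):
    """Negative log-likelihood of `ranking` (item indices, best first)
    under the Plackett-Luce model: at each step, the next item is drawn
    with probability softmax over the scores of the remaining items."""
    nll = 0.0
    remaining = list(ranking)
    for idx in ranking:
        denom = sum(math.exp(scores[j]) for j in remaining)
        nll -= math.log(math.exp(scores[idx]) / denom)
        remaining.remove(idx)
    return nll

loss = plackett_luce_nll([2.0, 1.0], [0, 1])
print(round(loss, 5))  # -log(e^2 / (e^2 + e^1)) ≈ 0.31326
```

Minimizing this loss pushes the model's scores toward reproducing the target ranking, which is the training signal RankOOD builds its detection criterion on.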
arXiv Detail & Related papers (2025-11-25T07:02:56Z)
- RLP: Reinforcement as a Pretraining Objective [103.45068938532923]
We present an information-driven reinforcement pretraining objective that brings the core spirit of reinforcement learning -- exploration -- to the last phase of pretraining.
This training objective encourages the model to think for itself before predicting what comes next, teaching independent thinking behavior earlier in pretraining.
Specifically, RLP reframes reinforcement learning for reasoning as a pretraining objective on ordinary text, bridging the gap between next-token prediction and the emergence of useful chain-of-thought reasoning.
arXiv Detail & Related papers (2025-09-26T17:53:54Z)
- ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability [83.16850534680505]
We propose an automated reasoning-intensive training data synthesis framework.
A self-consistency data filtering mechanism is designed to ensure data quality.
Our trained reasoning-intensive reranker, ReasonRank, achieves state-of-the-art (SOTA) performance of 40.6 on the BRIGHT leaderboard.
arXiv Detail & Related papers (2025-08-09T17:26:18Z)
- IRanker: Towards Ranking Foundation Model [26.71771958251611]
We propose to unify ranking tasks using a single ranking foundation model (FM).
IRanker is a ranking framework with reinforcement learning (RL) and iterative decoding.
We show that a single IRanker-3B achieves state-of-the-art results on several datasets.
arXiv Detail & Related papers (2025-06-25T17:56:06Z)
- Full Stage Learning to Rank: A Unified Framework for Multi-Stage Systems [40.199257203898846]
We propose an improved ranking principle for multi-stage systems, namely the Generalized Probability Ranking Principle (GPRP)
GPRP emphasizes both the selection bias in each stage of the system pipeline as well as the underlying interest of users.
Our core idea is to first estimate the selection bias in the subsequent stages and then learn a ranking model that best complies with the downstream modules' selection bias.
arXiv Detail & Related papers (2024-05-08T06:35:04Z)
- InRank: Incremental Low-Rank Learning [85.6380047359139]
Gradient-based training implicitly regularizes neural networks towards low-rank solutions through a gradual increase of the rank during training.
Existing training algorithms do not exploit the low-rank property to improve computational efficiency.
We design a new training algorithm Incremental Low-Rank Learning (InRank), which explicitly expresses cumulative weight updates as low-rank matrices.
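The core representation, expressing a cumulative weight update as a product of two thin factors, can be sketched as follows (a minimal illustration; InRank's incremental rank-growing schedule is more involved):

```python
def matmul(A, B):
    """Plain nested-list matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def apply_low_rank_update(W, U, V):
    """Effective weight = W + U @ V, with U of shape d x r and V of shape
    r x d, r << d. The cumulative update then costs 2*d*r parameters
    instead of d*d, which is the saving low-rank learning exploits."""
    delta = matmul(U, V)
    return [[w + dw for w, dw in zip(wr, dr)] for wr, dr in zip(W, delta)]

# Rank-1 update of a 3x3 weight: 6 parameters instead of 9.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
U = [[1.0], [2.0], [0.0]]   # d x r, r = 1
V = [[0.5, 0.0, 0.5]]       # r x d
print(apply_low_rank_update(W, U, V))
# → [[1.5, 0.0, 0.5], [1.0, 1.0, 1.0], [0.0, 0.0, 1.0]]
```

InRank additionally grows the rank r during training as the implicit rank of the accumulated updates increases, rather than fixing it up front.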
arXiv Detail & Related papers (2023-06-20T03:03:04Z)
- QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search [15.026682829320261]
We propose QUERT, a Continual Pre-trained Language Model for QUERy Understanding in Travel Domain Search.
QUERT is jointly trained on four pre-training tasks tailored to the characteristics of queries in travel domain search.
To gauge QUERT's improvement to online business, we deploy QUERT and perform A/B testing on the Fliggy app.
arXiv Detail & Related papers (2023-06-11T15:39:59Z)
- Decouple knowledge from parameters for plug-and-play language modeling [77.5601135412186]
We introduce PlugLM, a pre-training model with a differentiable plug-in memory (DPM).
The key intuition is to decouple the knowledge storage from model parameters with an editable and scalable key-value memory.
PlugLM obtains an average 3.95 F1 improvement across four domains without any in-domain pre-training.
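The decoupling idea, reading knowledge from an editable key-value memory via differentiable attention rather than storing it in model parameters, can be sketched as follows (a minimal dot-product attention read; names and shapes are illustrative, not PlugLM's actual architecture):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def memory_lookup(query, keys, values):
    """Differentiable key-value memory read: attend over keys with the
    query, then return the attention-weighted sum of values. Editing
    knowledge means editing (key, value) rows, not model weights."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
out = memory_lookup([5.0, 0.0], keys, values)  # attends almost fully to key 0
print([round(x, 2) for x in out])  # → [9.93, 0.07]
```

Because the lookup is differentiable end to end, gradients flow through the memory during pre-training while the stored entries remain editable and scalable afterwards.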
arXiv Detail & Related papers (2023-05-19T10:01:55Z)
- Learning a Better Initialization for Soft Prompts via Meta-Learning [58.53984967461313]
We propose MetaPT (Meta-learned Prompt Tuning) to improve prompt tuning.
We introduce the structure by first clustering pre-training data into different auxiliary tasks.
We use these tasks to pre-train prompts with a meta-learning algorithm.
arXiv Detail & Related papers (2022-05-25T03:50:23Z)
- PiRank: Learning To Rank via Differentiable Sorting [85.28916333414145]
We propose PiRank, a new class of differentiable surrogates for ranking.
We show that PiRank exactly recovers the desired metrics in the limit of zero temperature.
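A temperature-controlled soft rank illustrates the zero-temperature limit mentioned above (a generic soft-rank relaxation, not PiRank's exact NeuralSort-based surrogate):

```python
import math

def stable_sigmoid(x):
    """Sigmoid that avoids overflow for large-magnitude inputs."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def soft_ranks(scores, tau):
    """Differentiable rank estimate:
        rank_i = 1 + sum_{j != i} sigmoid((s_j - s_i) / tau).
    As tau -> 0 each sigmoid becomes a step function, so the exact ranks
    are recovered (rank 1 = highest score)."""
    return [1.0 + sum(stable_sigmoid((sj - si) / tau)
                      for j, sj in enumerate(scores) if j != i)
            for i, si in enumerate(scores)]

print([round(r) for r in soft_ranks([3.0, 1.0, 2.0], tau=1e-3)])  # → [1, 3, 2]
```

At finite temperature the ranks are smooth in the scores, so ranking metrics built on them admit gradients; driving tau toward zero recovers the exact (hard) metric, which is the limit behavior the PiRank blurb refers to.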
arXiv Detail & Related papers (2020-12-12T05:07:36Z)
- Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning [61.32992639292889]
Fine-tuning of pre-trained transformer models has become the standard approach for solving common NLP tasks.
We introduce a new scoring method that casts a plausibility ranking task in a full-text format.
We show that our method provides a much more stable training phase across random restarts.
arXiv Detail & Related papers (2020-04-29T10:54:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.