ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
- URL: http://arxiv.org/abs/2508.07050v1
- Date: Sat, 09 Aug 2025 17:26:18 GMT
- Title: ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
- Authors: Wenhan Liu, Xinyu Ma, Weiwei Sun, Yutao Zhu, Yuchen Li, Dawei Yin, Zhicheng Dou,
- Abstract summary: We propose an automated reasoning-intensive training data synthesis framework. A self-consistency data filtering mechanism is designed to ensure data quality. Our trained reasoning-intensive reranker ReasonRank achieves state-of-the-art (SOTA) performance of 40.6 on the BRIGHT leaderboard.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Model (LLM) based listwise ranking has shown superior performance in many passage ranking tasks. With the development of Large Reasoning Models, many studies have demonstrated that step-by-step reasoning at test time helps improve listwise ranking performance. However, due to the scarcity of reasoning-intensive training data, existing rerankers perform poorly in many complex ranking scenarios, and the ranking ability of reasoning-intensive rerankers remains largely underdeveloped. In this paper, we first propose an automated reasoning-intensive training data synthesis framework, which sources training queries and passages from diverse domains and applies DeepSeek-R1 to generate high-quality training labels. A self-consistency data filtering mechanism is designed to ensure data quality. To empower the listwise reranker with strong reasoning ability, we further propose a two-stage post-training approach, which includes a cold-start supervised fine-tuning (SFT) stage for reasoning pattern learning and a reinforcement learning (RL) stage for further ranking ability enhancement. During the RL stage, based on the nature of listwise ranking, we design a multi-view ranking reward, which is more effective than a ranking metric-based reward. Extensive experiments demonstrate that our trained reasoning-intensive reranker ReasonRank significantly outperforms existing baselines and also achieves much lower latency than the pointwise reranker Rank1. Through further experiments, ReasonRank has achieved state-of-the-art (SOTA) performance of 40.6 on the BRIGHT leaderboard (https://brightbenchmark.github.io/). Our code is available at https://github.com/8421BCD/ReasonRank.
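The self-consistency filtering step described above can be illustrated with a minimal sketch (not the paper's implementation): sample several rankings per query from the labeling model and keep the training example only when the samples agree, here measured by mean pairwise Kendall tau. The function names, the agreement threshold, and the choice of Kendall tau as the agreement measure are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation between two permutations of the same items."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # Same sign of position difference in both rankings => concordant pair.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

def self_consistency_filter(sampled_rankings, threshold=0.7):
    """Keep a training example only if independently sampled rankings agree.

    sampled_rankings: list of permutations of the same passage IDs,
    e.g. several labeling-model generations for one query.
    """
    taus = [kendall_tau(a, b) for a, b in combinations(sampled_rankings, 2)]
    mean_tau = sum(taus) / len(taus)
    return mean_tau >= threshold, mean_tau

# Example: three samples that largely agree on the ordering are kept.
keep, score = self_consistency_filter([
    ["p1", "p2", "p3", "p4"],
    ["p1", "p2", "p4", "p3"],
    ["p1", "p2", "p3", "p4"],
])
```

Because Kendall tau rewards pairwise agreement rather than exact match, a sample that swaps only two adjacent passages still counts as mostly consistent.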
Related papers
- Ranking-aware Reinforcement Learning for Ordinal Ranking [19.678002354790582]
We propose Ranking-Aware Reinforcement Learning (RARL), a novel RL framework that explicitly learns these relationships. RARL features a unified objective that integrates regression and Learning-to-Rank (L2R), enabling mutual improvement between the two tasks. To further enhance training, we introduce Response Mutation Operations (RMO), which inject controlled noise to improve exploration and prevent stagnation at saddle points.
arXiv Detail & Related papers (2026-01-28T13:22:42Z) - Rethinking Reasoning in Document Ranking: Why Chain-of-Thought Falls Short [36.93384080571354]
Document reranking is a key component in information retrieval (IR). We present the first systematic study of reasoning in reranking across both pointwise and listwise settings.
arXiv Detail & Related papers (2025-10-10T03:59:17Z) - ERank: Fusing Supervised Fine-Tuning and Reinforcement Learning for Effective and Efficient Text Reranking [33.25740773392183]
ERank is a highly effective and efficient pointwise reranker built from a reasoning LLM that excels across diverse relevance scenarios. We propose a novel two-stage training pipeline that begins with Supervised Fine-Tuning (SFT). In this stage, we move beyond binary labels and train the model generatively to output fine-grained integer scores, which significantly enhances relevance discrimination. We evaluate the ERank reranker on the BRIGHT, FollowIR, TREC DL, and BEIR benchmarks, demonstrating superior effectiveness and robustness compared to existing approaches.
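The fine-grained generative scoring idea can be sketched as follows: a pointwise reranker generates an integer relevance score per passage, which is parsed from the generation and used to sort. The `parse_score` helper, the 0-10 score range, and the toy stand-in for the LLM call are hypothetical, not ERank's actual interface.

```python
import re

def parse_score(generation: str, max_score: int = 10) -> int:
    """Extract the integer relevance score a generative reranker emits,
    clamping it to the valid range; missing scores default to 0."""
    match = re.search(r"-?\d+", generation)
    if match is None:
        return 0
    return max(0, min(max_score, int(match.group())))

def pointwise_rank(query, passages, score_fn):
    """Score each passage independently, then sort by score descending."""
    scored = [(parse_score(score_fn(query, p)), p) for p in passages]
    scored.sort(key=lambda sp: -sp[0])
    return [p for _, p in scored]

# Toy stand-in for the LLM call: canned generations with integer scores.
fake_llm = {"p1": "Score: 7", "p2": "Score: 3", "p3": "Score: 9"}
ranking = pointwise_rank("q", ["p1", "p2", "p3"],
                         lambda q, p: fake_llm[p])
```

Unlike binary labels, integer scores let two relevant passages be ordered relative to each other, which is what enables the finer relevance discrimination the abstract describes.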
arXiv Detail & Related papers (2025-08-30T14:56:53Z) - TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking [21.930228130429573]
Reasoning-intensive ranking models built on Large Language Models (LLMs) have made notable progress. Existing approaches often rely on large-scale LLMs and explicit Chain-of-Thought (CoT) reasoning. We propose TFRank, an efficient pointwise reasoning ranker based on small-scale LLMs.
arXiv Detail & Related papers (2025-08-13T06:47:58Z) - Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle [65.14124923451077]
Reinforcement learning (RL) has emerged as an effective post-training paradigm for enhancing the reasoning capabilities of multimodal large language models (MLLMs). However, current RL pipelines often suffer from training inefficiencies caused by two underexplored issues: Advantage Collapsing and Rollout Silencing. We propose Shuffle-R1, a simple yet principled framework that improves RL fine-tuning efficiency by dynamically restructuring trajectory sampling and batch composition.
arXiv Detail & Related papers (2025-08-07T17:53:47Z) - IRanker: Towards Ranking Foundation Model [26.71771958251611]
We propose to unify ranking tasks using a single ranking foundation model (FM). IRanker is a ranking framework with reinforcement learning (RL) and iterative decoding. We show that a single IRanker-3B achieves state-of-the-art results on several datasets.
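Iterative decoding of a ranking can be sketched as repeated exclusion of the weakest remaining candidate until only the best survives. This is a minimal illustration of the general idea, not IRanker's actual code; `relevance_fn` is a hypothetical stand-in for the model's judgment of a candidate within the current pool.

```python
def iterative_exclusion_rank(candidates, relevance_fn):
    """Decode a ranking by repeatedly excluding the least relevant
    remaining candidate; the exclusion order, reversed, is the ranking.

    relevance_fn(candidate, pool) -> float: stand-in for a model call
    judging `candidate` against the current `pool`.
    """
    pool = list(candidates)
    excluded = []  # worst-first
    while len(pool) > 1:
        worst = min(pool, key=lambda c: relevance_fn(c, pool))
        pool.remove(worst)
        excluded.append(worst)
    excluded.extend(pool)  # last survivor is the most relevant
    return list(reversed(excluded))

# Toy relevance scores standing in for model judgments.
scores = {"p1": 0.2, "p2": 0.9, "p3": 0.5}
ranking = iterative_exclusion_rank(scores, lambda c, pool: scores[c])
```

Excluding one candidate per step shrinks the pool the model must reason over at each decoding iteration, which is the practical appeal of this style of decoding.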
arXiv Detail & Related papers (2025-06-25T17:56:06Z) - CoRanking: Collaborative Ranking with Small and Large Ranking Agents [94.09834629572403]
Large Language Models (LLMs) have demonstrated superior listwise ranking performance. CoRanking combines small and large ranking models for efficient and effective ranking.
arXiv Detail & Related papers (2025-03-30T13:00:52Z) - Learning Cascade Ranking as One Network [34.530252769521624]
Cascade ranking is a prevalent architecture in large-scale top-k selection systems such as recommendation and advertising platforms. Recent advances have introduced interaction-aware training paradigms, but these still struggle to align training objectives with the goal of the entire cascade ranking. We propose LCRON, which introduces a novel surrogate loss function derived from a lower bound on the probability that ground-truth items are selected by the cascade ranking.
arXiv Detail & Related papers (2025-03-12T15:52:51Z) - Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning [76.50690734636477]
We introduce Rank-R1, a novel LLM-based reranker that performs reasoning over both the user query and candidate documents before performing the ranking task. Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries.
arXiv Detail & Related papers (2025-03-08T03:14:26Z) - Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models [46.87216968390808]
This paper investigates the under-explored area of low-rank weight training for large-scale Conformer-based speech recognition models from scratch.
Applying a low-rank structure exclusively to the attention modules can unexpectedly enhance performance.
Feed-forward layers present greater challenges, as they begin to exhibit performance degradation with a moderate 50% rank reduction.
arXiv Detail & Related papers (2024-10-10T09:58:35Z) - FIRST: Faster Improved Listwise Reranking with Single Token Decoding [56.727761901751194]
First, we introduce FIRST, a novel listwise LLM reranking approach leveraging the output logits of the first generated identifier to directly obtain a ranked ordering of the candidates.
Empirical results demonstrate that FIRST accelerates inference by 50% while maintaining a robust ranking performance with gains across the BEIR benchmark.
Our results show that LLM rerankers can provide a stronger distillation signal compared to cross-encoders, yielding substantial improvements in retriever recall after relevance feedback.
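FIRST's core trick, reading a full ranking off the logits of the first generated identifier token instead of decoding an entire permutation string, can be sketched as follows. The toy logit values and the function name are illustrative assumptions.

```python
import math

def first_token_ranking(identifier_logits):
    """Rank candidates from the logits of the FIRST generated identifier
    token, avoiding autoregressive decoding of a full permutation.

    identifier_logits: dict mapping candidate identifiers (e.g. "A".."D")
    to the logit the model assigns each one as the first output token.
    """
    # Softmax is monotonic, so sorting by logit equals sorting by
    # probability; probabilities are computed only to expose a per-candidate
    # score, with the max subtracted for numerical stability.
    max_logit = max(identifier_logits.values())
    exps = {c: math.exp(l - max_logit) for c, l in identifier_logits.items()}
    total = sum(exps.values())
    probs = {c: e / total for c, e in exps.items()}
    return sorted(probs, key=probs.get, reverse=True), probs

# Toy first-token logits for four candidate identifiers.
order, probs = first_token_ranking({"A": 1.2, "B": 3.5, "C": -0.4, "D": 2.1})
```

Since one forward pass over the prompt yields logits for every identifier at once, the ranking costs a single decoding step, which is where the reported inference speedup comes from.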
arXiv Detail & Related papers (2024-06-21T21:27:50Z) - Improving Language Model Reasoning with Self-motivated Learning [60.779625789039486]
The Self-motivated Learning framework motivates the model itself to automatically generate rationales on existing datasets.
We train a reward model with the rank to evaluate the quality of rationales, and improve reasoning performance through reinforcement learning.
arXiv Detail & Related papers (2024-04-10T14:05:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.