ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
- URL: http://arxiv.org/abs/2508.07050v1
- Date: Sat, 09 Aug 2025 17:26:18 GMT
- Title: ReasonRank: Empowering Passage Ranking with Strong Reasoning Ability
- Authors: Wenhan Liu, Xinyu Ma, Weiwei Sun, Yutao Zhu, Yuchen Li, Dawei Yin, Zhicheng Dou,
- Abstract summary: We propose an automated reasoning-intensive training data synthesis framework. A self-consistency data filtering mechanism is designed to ensure data quality. Our trained reasoning-intensive reranker ReasonRank achieves state-of-the-art (SOTA) performance of 40.6 on the BRIGHT leaderboard.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Model (LLM) based listwise ranking has shown superior performance in many passage ranking tasks. With the development of Large Reasoning Models, many studies have demonstrated that step-by-step reasoning at test time helps improve listwise ranking performance. However, due to the scarcity of reasoning-intensive training data, existing rerankers perform poorly in many complex ranking scenarios, and the ranking ability of reasoning-intensive rerankers remains largely underdeveloped. In this paper, we first propose an automated reasoning-intensive training data synthesis framework, which sources training queries and passages from diverse domains and applies DeepSeek-R1 to generate high-quality training labels. A self-consistency data filtering mechanism is designed to ensure data quality. To empower the listwise reranker with strong reasoning ability, we further propose a two-stage post-training approach, which includes a cold-start supervised fine-tuning (SFT) stage for reasoning pattern learning and a reinforcement learning (RL) stage for further ranking ability enhancement. During the RL stage, based on the nature of listwise ranking, we design a multi-view ranking reward, which is more effective than a ranking metric-based reward. Extensive experiments demonstrate that our trained reasoning-intensive reranker ReasonRank significantly outperforms existing baselines and also achieves much lower latency than the pointwise reranker Rank1. Through further experiments, ReasonRank has achieved state-of-the-art (SOTA) performance of 40.6 on the BRIGHT leaderboard (https://brightbenchmark.github.io/). Our code is available at https://github.com/8421BCD/ReasonRank.
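The self-consistency filtering step described above can be illustrated with a minimal sketch (not the paper's implementation): sample several rankings per query from the labeling model and keep the training example only when the samples agree, here measured by mean pairwise Kendall tau. The function names, the agreement threshold, and the choice of Kendall tau as the agreement measure are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall rank correlation between two permutations of the same items."""
    pos_a = {item: i for i, item in enumerate(rank_a)}
    pos_b = {item: i for i, item in enumerate(rank_b)}
    concordant = discordant = 0
    for x, y in combinations(rank_a, 2):
        # Same sign of position difference in both rankings => concordant pair.
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

def self_consistency_filter(sampled_rankings, threshold=0.7):
    """Keep a training example only if independently sampled rankings agree.

    sampled_rankings: list of permutations of the same passage IDs,
    e.g. several labeling-model generations for one query.
    """
    taus = [kendall_tau(a, b) for a, b in combinations(sampled_rankings, 2)]
    mean_tau = sum(taus) / len(taus)
    return mean_tau >= threshold, mean_tau

# Example: three samples that largely agree on the ordering are kept.
keep, score = self_consistency_filter([
    ["p1", "p2", "p3", "p4"],
    ["p1", "p2", "p4", "p3"],
    ["p1", "p2", "p3", "p4"],
])
```

Because Kendall tau rewards pairwise agreement rather than exact match, a sample that swaps only two adjacent passages still counts as mostly consistent.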
Related papers
- Ranking-aware Reinforcement Learning for Ordinal Ranking [19.678002354790582]
We propose Ranking-Aware Reinforcement Learning (RARL), a novel RL framework that explicitly learns these relationships. RARL features a unified objective that integrates regression and Learning-to-Rank (L2R), enabling mutual improvement between the two tasks. To further enhance training, we introduce Response Mutation Operations (RMO), which inject controlled noise to improve exploration and prevent stagnation at saddle points.
arXiv Detail & Related papers (2026-01-28T13:22:42Z) - Rethinking Reasoning in Document Ranking: Why Chain-of-Thought Falls Short [36.93384080571354]
Document reranking is a key component in information retrieval (IR). We present the first systematic study of reasoning in reranking across both pointwise and listwise settings.
arXiv Detail & Related papers (2025-10-10T03:59:17Z) - ERank: Fusing Supervised Fine-Tuning and Reinforcement Learning for Effective and Efficient Text Reranking [33.25740773392183]
ERank is a highly effective and efficient pointwise reranker built from a reasoning LLM that excels across diverse relevance scenarios. We propose a novel two-stage training pipeline that begins with Supervised Fine-Tuning (SFT). In this stage, we move beyond binary labels and train the model generatively to output fine-grained integer scores, which significantly enhances relevance discrimination. We evaluate the ERank reranker on the BRIGHT, FollowIR, TREC DL, and BEIR benchmarks, demonstrating superior effectiveness and robustness compared to existing approaches.
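The fine-grained generative scoring idea can be sketched as follows: a pointwise reranker generates an integer relevance score per passage, which is parsed from the generation and used to sort. The `parse_score` helper, the 0-10 score range, and the toy stand-in for the LLM call are hypothetical, not ERank's actual interface.

```python
import re

def parse_score(generation: str, max_score: int = 10) -> int:
    """Extract the integer relevance score a generative reranker emits,
    clamping it to the valid range; missing scores default to 0."""
    match = re.search(r"-?\d+", generation)
    if match is None:
        return 0
    return max(0, min(max_score, int(match.group())))

def pointwise_rank(query, passages, score_fn):
    """Score each passage independently, then sort by score descending."""
    scored = [(parse_score(score_fn(query, p)), p) for p in passages]
    scored.sort(key=lambda sp: -sp[0])
    return [p for _, p in scored]

# Toy stand-in for the LLM call: canned generations with integer scores.
fake_llm = {"p1": "Score: 7", "p2": "Score: 3", "p3": "Score: 9"}
ranking = pointwise_rank("q", ["p1", "p2", "p3"],
                         lambda q, p: fake_llm[p])
```

Unlike binary labels, integer scores let two relevant passages be ordered relative to each other, which is what enables the finer relevance discrimination the abstract describes.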
arXiv Detail & Related papers (2025-08-30T14:56:53Z) - TFRank: Think-Free Reasoning Enables Practical Pointwise LLM Ranking [21.930228130429573]
Reasoning-intensive ranking models built on Large Language Models (LLMs) have made notable progress. Existing approaches often rely on large-scale LLMs and explicit Chain-of-Thought (CoT) reasoning. We propose TFRank, an efficient pointwise reasoning ranker based on small-scale LLMs.
arXiv Detail & Related papers (2025-08-13T06:47:58Z) - Shuffle-R1: Efficient RL framework for Multimodal Large Language Models via Data-centric Dynamic Shuffle [65.14124923451077]
Reinforcement learning (RL) has emerged as an effective post-training paradigm for enhancing the reasoning capabilities of multimodal large language models (MLLMs). However, current RL pipelines often suffer from training inefficiencies caused by two underexplored issues: Advantage Collapsing and Rollout Silencing. We propose Shuffle-R1, a simple yet principled framework that improves RL fine-tuning efficiency by dynamically restructuring trajectory sampling and batch composition.
arXiv Detail & Related papers (2025-08-07T17:53:47Z) - IRanker: Towards Ranking Foundation Model [26.71771958251611]
We propose to unify ranking tasks using a single ranking foundation model (FM). IRanker is a ranking framework with reinforcement learning (RL) and iterative decoding. We show that a single IRanker-3B achieves state-of-the-art results on several datasets.
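Iterative decoding of a ranking can be sketched as repeated exclusion of the weakest remaining candidate until only the best survives. This is a minimal illustration of the general idea, not IRanker's actual code; `relevance_fn` is a hypothetical stand-in for the model's judgment of a candidate within the current pool.

```python
def iterative_exclusion_rank(candidates, relevance_fn):
    """Decode a ranking by repeatedly excluding the least relevant
    remaining candidate; the exclusion order, reversed, is the ranking.

    relevance_fn(candidate, pool) -> float: stand-in for a model call
    judging `candidate` against the current `pool`.
    """
    pool = list(candidates)
    excluded = []  # worst-first
    while len(pool) > 1:
        worst = min(pool, key=lambda c: relevance_fn(c, pool))
        pool.remove(worst)
        excluded.append(worst)
    excluded.extend(pool)  # last survivor is the most relevant
    return list(reversed(excluded))

# Toy relevance scores standing in for model judgments.
scores = {"p1": 0.2, "p2": 0.9, "p3": 0.5}
ranking = iterative_exclusion_rank(scores, lambda c, pool: scores[c])
```

Excluding one candidate per step shrinks the pool the model must reason over at each decoding iteration, which is the practical appeal of this style of decoding.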
arXiv Detail & Related papers (2025-06-25T17:56:06Z) - CoRanking: Collaborative Ranking with Small and Large Ranking Agents [94.09834629572403]
Large Language Models (LLMs) have demonstrated superior listwise ranking performance. CoRanking combines small and large ranking models for efficient and effective ranking.
arXiv Detail & Related papers (2025-03-30T13:00:52Z) - Learning Cascade Ranking as One Network [34.530252769521624]
Cascade ranking is a prevalent architecture in large-scale top-k selection systems such as recommendation and advertising platforms. Recent advances have introduced interaction-aware training paradigms, but these still struggle to align training objectives with the goal of the entire cascade ranking. We propose LCRON, which introduces a novel surrogate loss function derived from a lower bound on the probability that ground-truth items are selected by the cascade ranking.
arXiv Detail & Related papers (2025-03-12T15:52:51Z) - Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning [76.50690734636477]
We introduce Rank-R1, a novel LLM-based reranker that performs reasoning over both the user query and candidate documents before performing the ranking task. Our experiments on the TREC DL and BRIGHT datasets show that Rank-R1 is highly effective, especially for complex queries.
arXiv Detail & Related papers (2025-03-08T03:14:26Z) - Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models [46.87216968390808]
This paper investigates the under-explored area of low-rank weight training for large-scale Conformer-based speech recognition models from scratch.
Applying a low-rank structure exclusively to the attention modules can unexpectedly enhance performance.
Feed-forward layers present greater challenges, as they begin to exhibit performance degradation with a moderate 50% rank reduction.
arXiv Detail & Related papers (2024-10-10T09:58:35Z) - FIRST: Faster Improved Listwise Reranking with Single Token Decoding [56.727761901751194]
First, we introduce FIRST, a novel listwise LLM reranking approach leveraging the output logits of the first generated identifier to directly obtain a ranked ordering of the candidates.
Empirical results demonstrate that FIRST accelerates inference by 50% while maintaining a robust ranking performance with gains across the BEIR benchmark.
Our results show that LLM rerankers can provide a stronger distillation signal compared to cross-encoders, yielding substantial improvements in retriever recall after relevance feedback.
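FIRST's core trick, reading a full ranking off the logits of the first generated identifier token instead of decoding an entire permutation string, can be sketched as follows. The toy logit values and the function name are illustrative assumptions.

```python
import math

def first_token_ranking(identifier_logits):
    """Rank candidates from the logits of the FIRST generated identifier
    token, avoiding autoregressive decoding of a full permutation.

    identifier_logits: dict mapping candidate identifiers (e.g. "A".."D")
    to the logit the model assigns each one as the first output token.
    """
    # Softmax is monotonic, so sorting by logit equals sorting by
    # probability; probabilities are computed only to expose a per-candidate
    # score, with the max subtracted for numerical stability.
    max_logit = max(identifier_logits.values())
    exps = {c: math.exp(l - max_logit) for c, l in identifier_logits.items()}
    total = sum(exps.values())
    probs = {c: e / total for c, e in exps.items()}
    return sorted(probs, key=probs.get, reverse=True), probs

# Toy first-token logits for four candidate identifiers.
order, probs = first_token_ranking({"A": 1.2, "B": 3.5, "C": -0.4, "D": 2.1})
```

Since one forward pass over the prompt yields logits for every identifier at once, the ranking costs a single decoding step, which is where the reported inference speedup comes from.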
arXiv Detail & Related papers (2024-06-21T21:27:50Z) - Improving Language Model Reasoning with Self-motivated Learning [60.779625789039486]
The Self-motivated Learning framework motivates the model itself to automatically generate rationales on existing datasets.
We train a reward model with the rank to evaluate the quality of rationales, and improve reasoning performance through reinforcement learning.
arXiv Detail & Related papers (2024-04-10T14:05:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.