LimRank: Less is More for Reasoning-Intensive Information Reranking
- URL: http://arxiv.org/abs/2510.23544v1
- Date: Mon, 27 Oct 2025 17:19:37 GMT
- Title: LimRank: Less is More for Reasoning-Intensive Information Reranking
- Authors: Tingyu Song, Yilun Zhao, Siyue Zhang, Chen Zhao, Arman Cohan
- Abstract summary: Existing approaches typically rely on large-scale fine-tuning to adapt LLMs for information reranking tasks. In this work, we demonstrate that modern LLMs can be effectively adapted using only minimal, high-quality supervision.
- Score: 58.32304478331711
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing approaches typically rely on large-scale fine-tuning to adapt LLMs for information reranking tasks, which is computationally expensive. In this work, we demonstrate that modern LLMs can be effectively adapted using only minimal, high-quality supervision. To enable this, we design LIMRANK-SYNTHESIZER, a reusable and open-source pipeline for generating diverse, challenging, and realistic reranking examples. Using this synthetic data, we fine-tune our reranker model, LIMRANK. We evaluate LIMRANK on two challenging benchmarks, i.e., BRIGHT for reasoning-intensive retrieval and FollowIR for instruction-following retrieval. Our experiments demonstrate that LIMRANK achieves competitive performance, while being trained on less than 5% of the data typically used in prior work. Further ablation studies demonstrate the effectiveness of LIMRANK-SYNTHESIZER and the strong generalization capabilities of LIMRANK across downstream tasks, including scientific literature search and retrieval-augmented generation for knowledge-intensive problem solving.
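The abstract describes LIMRANK only at a high level. As a rough illustration of how a fine-tuned pointwise LLM reranker of this kind is typically applied at inference time, consider the sketch below; the model ID, prompt template, and yes/no scoring rule are illustrative assumptions, not LIMRANK's actual interface.

```python
# Hypothetical sketch of pointwise LLM reranking in the style the abstract
# describes. The model ID, prompt format, and scoring rule are assumptions,
# not LIMRANK's actual interface.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "your-org/limrank-7b"  # placeholder name, not a real checkpoint
tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

def relevance_score(query: str, passage: str) -> float:
    """Score a (query, passage) pair by the model's probability of 'yes'."""
    prompt = (f"Query: {query}\nPassage: {passage}\n"
              "Is the passage relevant to the query? Answer yes or no: ")
    inputs = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]  # next-token logits
    yes_id = tok.encode("yes", add_special_tokens=False)[0]
    no_id = tok.encode("no", add_special_tokens=False)[0]
    # Relevance = P(yes) normalized over the two answer tokens.
    return torch.softmax(logits[[yes_id, no_id]], dim=0)[0].item()

def rerank(query: str, passages: list[str]) -> list[str]:
    return sorted(passages, key=lambda p: relevance_score(query, p), reverse=True)
```

Sorting candidates by the normalized probability of "yes" yields the reranked list; listwise variants instead prompt the model with several passages at once.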
Related papers
- Principled Synthetic Data Enables the First Scaling Laws for LLMs in Recommendation [27.59197535041953]
Large Language Models (LLMs) represent a promising frontier for recommender systems. This paper introduces a novel, layered framework for generating high-quality synthetic data. We empirically demonstrate, for the first time, robust power-law scaling for an LLM that is continually pre-trained on our high-quality, recommendation-specific data.
arXiv Detail & Related papers (2026-02-07T01:15:15Z)
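The entry above reports "robust power-law scaling" without giving the fitted form; for orientation, data-scaling laws are conventionally written as below (a generic background form, not the paper's equation).

```latex
% Generic data-scaling power law (background form, not the paper's fit):
% L(D) is held-out loss after training on D tokens of synthetic data;
% A, \alpha, and the irreducible term E are fitted constants.
L(D) = \frac{A}{D^{\alpha}} + E
```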
- MiniRec: Data-Efficient Reinforcement Learning for LLM-based Recommendation [50.417769112326546]
MiniRec is a data selection framework tailored for RL-based large language model (LLM) recommendation. It evaluates sample learnability using a key RL signal, the reward, pruning samples that are too easy (consistently high reward) or too difficult (consistently low reward).
arXiv Detail & Related papers (2026-02-04T07:15:49Z)
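A minimal sketch of the reward-based pruning rule the MiniRec entry describes; the thresholds and the shape of the reward signal are assumptions, not values from the paper.

```python
# Keep samples whose rollout rewards are neither consistently high (too
# easy) nor consistently low (too hard). Thresholds are illustrative.
from statistics import mean

def select_learnable(samples, rollout_rewards, low=0.1, high=0.9):
    """samples: training items; rollout_rewards: per-sample list of
    rewards from several sampled rollouts under the current policy."""
    kept = []
    for sample, rewards in zip(samples, rollout_rewards):
        r = mean(rewards)
        if low < r < high:  # informative gradient signal lives in between
            kept.append(sample)
    return kept
```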
- Language Ranker: A Lightweight Ranking framework for LLM Decoding [70.01564145836129]
This paper conceptualizes the decoding process as analogous to the ranking stage in recommendation pipelines. Motivated by this insight, we propose Language Ranker, a novel framework that introduces a lightweight module to rerank candidate responses. Experiments show that Language Ranker achieves performance comparable to large-scale reward models, while requiring only 0.5M additional parameters.
arXiv Detail & Related papers (2025-10-23T17:56:46Z)
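As background for the Language Ranker entry, a small scoring head over pooled response embeddings shows how a module of roughly this parameter budget could rerank candidates; the architecture here is an assumption, and only the lightweight-reranker idea comes from the abstract.

```python
# Illustrative lightweight candidate-response ranker: a small head scoring
# pooled response embeddings. The architecture is assumed, not the paper's.
import torch
import torch.nn as nn

class LightweightRanker(nn.Module):
    def __init__(self, hidden_dim: int = 768, rank_dim: int = 256):
        super().__init__()
        # Roughly hidden_dim * rank_dim parameters: tiny next to the base
        # LLM, consistent in spirit with the ~0.5M figure quoted above.
        self.scorer = nn.Sequential(
            nn.Linear(hidden_dim, rank_dim),
            nn.Tanh(),
            nn.Linear(rank_dim, 1),
        )

    def forward(self, candidate_embeddings: torch.Tensor) -> torch.Tensor:
        # candidate_embeddings: (num_candidates, hidden_dim), e.g. mean-
        # pooled last-layer states of each sampled response.
        return self.scorer(candidate_embeddings).squeeze(-1)

# Usage: pick the highest-scoring of 8 sampled responses.
ranker = LightweightRanker()
scores = ranker(torch.randn(8, 768))
best = int(scores.argmax())
```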
- Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting [40.80967570661867]
Adapting language models to new tasks via post-training carries the risk of degrading existing capabilities. We compare the forgetting patterns of two widely adopted post-training methods: supervised fine-tuning (SFT) and reinforcement learning (RL). RL leads to less forgetting than SFT while achieving comparable or higher target-task performance.
arXiv Detail & Related papers (2025-10-21T17:59:41Z)
- RJE: A Retrieval-Judgment-Exploration Framework for Efficient Knowledge Graph Question Answering with LLMs [18.947344953344995]
Retrieval-Judgment-Exploration (RJE) is a framework that retrieves refined reasoning paths, evaluates their sufficiency, and conditionally explores additional evidence. RJE substantially reduces the number of LLM calls and token usage compared to agent-based methods, yielding significant efficiency improvements.
arXiv Detail & Related papers (2025-09-25T03:56:18Z)
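The RJE entry's retrieve-judge-explore loop can be pictured as the control flow below; all helper functions are hypothetical stand-ins, not the paper's components.

```python
# Control-flow sketch of Retrieval-Judgment-Exploration. The retriever,
# sufficiency judge, and exploration policy are hypothetical stand-ins.
def answer_with_rje(question, retrieve, judge_sufficient, explore, llm,
                    max_rounds=3):
    paths = retrieve(question)                 # refined reasoning paths
    for _ in range(max_rounds):
        if judge_sufficient(question, paths):  # cheap sufficiency check
            break
        paths += explore(question, paths)      # conditionally fetch more evidence
    # Only now pay for a full LLM call, keeping calls and tokens low.
    return llm(question, paths)
```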
- How Good are LLM-based Rerankers? An Empirical Analysis of State-of-the-Art Reranking Models [24.90505576458548]
We evaluate state-of-the-art reranking methods, including large language model (LLM)-based, lightweight contextual, and zero-shot approaches. Our primary goal is to determine, through controlled and fair comparisons, whether a performance disparity exists between LLM-based rerankers and their lightweight counterparts.
arXiv Detail & Related papers (2025-08-22T19:30:04Z)
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning [87.30285670315334]
R1-Searcher is a novel two-stage outcome-based RL approach designed to enhance the search capabilities of Large Language Models. Our framework relies exclusively on RL, without requiring process rewards or distillation for a cold start. Our experiments demonstrate that our method significantly outperforms previous strong RAG methods, even when compared to the closed-source GPT-4o-mini.
arXiv Detail & Related papers (2025-03-07T17:14:44Z)
- A Framework for Using LLMs for Repository Mining Studies in Empirical Software Engineering [12.504438766461027]
Large Language Models (LLMs) have transformed Software Engineering (SE) by providing innovative methods for analyzing software repositories. Our research packages a framework, coined Prompt Refinement and Insights for Mining Empirical Software repositories (PRIMES). Our findings indicate that standardizing prompt engineering and using PRIMES can enhance the reliability and accuracy of studies utilizing LLMs.
arXiv Detail & Related papers (2024-11-15T06:08:57Z)
- EVOLvE: Evaluating and Optimizing LLMs For In-Context Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty. We measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications. Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z)
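For context on the EVOLvE entry's bandit setting, UCB1 is a classic exploration algorithm of the kind the abstract proposes distilling into LLMs; this is textbook code, not the paper's.

```python
# Stateless bandit loop with UCB1 exploration: pull each arm once, then
# favor arms with high empirical mean plus an exploration bonus.
import math
import random

def ucb1(arm_means, horizon=1000):
    n_arms = len(arm_means)
    counts, sums = [0] * n_arms, [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # pull each arm once first
        else:
            arm = max(range(n_arms), key=lambda a:
                      sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        # Bernoulli arm: reward 1 with the arm's (unknown) success rate.
        reward = 1.0 if random.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return sums, counts

# Usage: a 3-armed Bernoulli bandit; pulls should concentrate on arm 2.
totals, pulls = ucb1([0.2, 0.5, 0.8])
```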
- How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard practice for sequential decision-making problems by improving future acting policies with feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, leading to significant reductions in the amount of data needed for learning.
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
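The LINVIT entry says LLM guidance enters value-based RL as a regularization factor. One standard way such guidance can enter a Bellman backup is a KL penalty toward an LLM-suggested prior policy, sketched below as an assumption about the general mechanism rather than the paper's algorithm.

```python
# Schematic of LLM guidance as a regularizer in value-based RL: the backup
# is softened with a KL penalty toward an LLM prior pi_llm. This is
# standard KL-regularized value iteration, not LINVIT's exact update.
import numpy as np

def kl_regularized_backup(Q, pi_llm, lam=1.0):
    """Q: (num_states, num_actions) action values.
    pi_llm: (num_states, num_actions) LLM prior (rows sum to 1).
    Returns V(s) = lam * log sum_a pi_llm(a|s) * exp(Q(s,a)/lam),
    the optimum of max_pi E_pi[Q] - lam * KL(pi || pi_llm)."""
    return lam * np.log(np.sum(pi_llm * np.exp(Q / lam), axis=1))

# As lam -> 0 this recovers the greedy max over Q; larger lam pulls the
# induced policy toward the LLM prior, cutting the data needed to learn.
```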
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.