Adaptive Retrieval for Reasoning-Intensive Retrieval
- URL: http://arxiv.org/abs/2601.04618v1
- Date: Thu, 08 Jan 2026 05:46:50 GMT
- Title: Adaptive Retrieval for Reasoning-Intensive Retrieval
- Authors: Jongho Kim, Jaeyoung Kim, Seung-won Hwang, Jihyuk Kim, Yu Jin Kim, Moontae Lee
- Abstract summary: Bridge documents are those that contribute to the reasoning process but are not directly relevant to the initial query. While existing reasoning-based reranker pipelines attempt to surface these documents in ranking, they suffer from bounded recall. We propose a framework that bridges this gap by repurposing reasoning plans as dense feedback signals for adaptive retrieval.
- Score: 60.30588731127791
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study leveraging adaptive retrieval to ensure sufficient "bridge" documents are retrieved for reasoning-intensive retrieval. Bridge documents are those that contribute to the reasoning process yet are not directly relevant to the initial query. While existing reasoning-based reranker pipelines attempt to surface these documents in ranking, they suffer from bounded recall. Naively integrating adaptive retrieval into these pipelines often leads to planning-error propagation. To address this, we propose REPAIR, a framework that bridges this gap by repurposing reasoning plans as dense feedback signals for adaptive retrieval. Our key distinction is enabling mid-course correction during reranking through selective adaptive retrieval, retrieving documents that support the pivotal plan. Experimental results on reasoning-intensive retrieval and complex QA tasks demonstrate that our method outperforms existing baselines by 5.6 percentage points.
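The loop the abstract describes can be sketched in a few lines. Everything below is a toy stand-in, not the paper's method: the lexical retriever, the hardcoded "reasoning plan", and the overlap-based support check all substitute for REPAIR's actual planner, dense retriever, and reranker, which the abstract does not specify.

```python
# Hedged sketch of a REPAIR-style adaptive retrieval loop: when a plan step
# has no supporting document among the retrieved set, that step becomes a
# follow-up query, surfacing "bridge" documents the initial query misses.

STOP = {"the", "is", "a", "of", "in", "by", "and", "what", "that", "was"}

def content_words(text):
    return set(text.lower().split()) - STOP

def retrieve(query, corpus, k=3):
    # Toy retriever: rank documents by content-word overlap with the query.
    q = content_words(query)
    return sorted(corpus, key=lambda d: -len(q & content_words(d)))[:k]

# Stand-in for an LLM-generated reasoning plan (hardcoded for this demo).
PLANS = {
    "What is the height of the tower designed by the winner of the 1889 competition": [
        "winner of the 1889 competition",
        "height of the tower that person designed",
    ],
}

def supported(step, docs):
    # A plan step counts as supported if some retrieved doc shares a content word.
    return any(content_words(step) & content_words(d) for d in docs)

def repair_retrieve(query, corpus, k=3):
    docs = retrieve(query, corpus, k)
    for step in PLANS.get(query, [query]):
        if not supported(step, docs):
            # Mid-course correction: the unsupported ("pivotal") plan step is
            # reused as a feedback query for selective adaptive retrieval.
            for d in retrieve(step, corpus, k):
                if d not in docs:
                    docs.append(d)
    return docs

corpus = [
    "The Eiffel Tower is 330 metres in height",
    "Gustave Eiffel won the 1889 contest",
    "Paris hosts many museums",
]
query = "What is the height of the tower designed by the winner of the 1889 competition"
```

With `k=1`, the baseline `retrieve` returns only the height document, while `repair_retrieve` additionally recovers the bridge document about the 1889 contest, which shares almost no vocabulary with the original query.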
Related papers
- Query Decomposition for RAG: Balancing Exploration-Exploitation [83.79639293409802]
RAG systems address complex user requests by decomposing them into subqueries, retrieving potentially relevant documents for each, and then aggregating them to generate an answer. We formulate query decomposition and document retrieval in an exploitation-exploration setting, where retrieving one document at a time builds a belief about the utility of a given sub-query. Our main finding is that estimating document relevance using rank information and human judgments yields a 35% gain in document-level precision, a 15% increase in alpha-nDCG, and better performance on the downstream task of long-form generation.
arXiv Detail & Related papers (2025-10-21T13:37:11Z) - Beyond Sequential Reranking: Reranker-Guided Search Improves Reasoning Intensive Retrieval [8.57583804155738]
We introduce Reranker-Guided-Search (RGS), a novel approach to retrieving documents according to reranker preferences. Our method uses a greedy search on proximity graphs generated by approximate nearest neighbor algorithms. Experimental results demonstrate substantial performance improvements across multiple benchmarks.
arXiv Detail & Related papers (2025-09-08T19:24:09Z) - Improving Document Retrieval Coherence for Semantically Equivalent Queries [63.97649988164166]
We propose a variation of the Multi-Negative Ranking loss for training DR that improves the coherence of models in retrieving the same documents. The loss penalizes discrepancies between the top-k ranked documents retrieved for diverse but semantically equivalent queries.
arXiv Detail & Related papers (2025-08-11T13:34:59Z) - Options-Aware Dense Retrieval for Multiple-Choice query Answering [5.098112872671412]
Long-context multiple-choice question answering tasks require robust reasoning over extensive text sources. Prior research in this domain has predominantly utilized pre-trained dense retrieval models. This paper proposes a novel method called Options Aware Dense Retrieval (OADR) to address these challenges.
arXiv Detail & Related papers (2025-01-27T15:03:26Z) - Bridging Relevance and Reasoning: Rationale Distillation in Retrieval-Augmented Generation [45.366826050955105]
We propose RADIO, a novel and practical preference alignment framework with RAtionale DIstillatiOn. We first propose a rationale extraction method that leverages the reasoning capabilities of Large Language Models (LLMs) to extract the rationales necessary for answering the query. Subsequently, a rationale-based alignment process is designed to rerank the documents based on the extracted rationales, and to fine-tune the reranker to align with these preferences.
arXiv Detail & Related papers (2024-12-11T16:32:41Z) - Quam: Adaptive Retrieval through Query Affinity Modelling [15.3583908068962]
Building relevance models to rank documents based on user information needs is a central task in information retrieval and the NLP community.
We propose Quam, a unifying view of the nascent area of adaptive retrieval.
Our proposed approach, Quam, improves recall performance by up to 26% over standard re-ranking baselines.
arXiv Detail & Related papers (2024-10-26T22:52:12Z) - Contrastive Learning to Improve Retrieval for Real-world Fact Checking [84.57583869042791]
We present Contrastive Fact-Checking Reranker (CFR), an improved retriever for fact-checking complex claims.
We leverage the AVeriTeC dataset, which annotates subquestions for claims with human-written answers from evidence documents.
We find a 6% improvement in veracity classification accuracy on the dataset.
arXiv Detail & Related papers (2024-10-07T00:09:50Z) - Hybrid and Collaborative Passage Reranking [144.83902343298112]
We propose a Hybrid and Collaborative Passage Reranking (HybRank) method.
It incorporates the lexical and semantic properties of sparse and dense retrievers for reranking.
Built on off-the-shelf retriever features, HybRank is a plug-in reranker capable of enhancing arbitrary passage lists.
arXiv Detail & Related papers (2023-05-16T09:38:52Z) - LoL: A Comparative Regularization Loss over Query Reformulation Losses for Pseudo-Relevance Feedback [70.44530794897861]
Pseudo-relevance feedback (PRF) has proven to be an effective query reformulation technique to improve retrieval accuracy.
Existing PRF methods independently treat revised queries originating from the same query but using different numbers of feedback documents.
We propose the Loss-over-Loss (LoL) framework to compare the reformulation losses between different revisions of the same query during training.
arXiv Detail & Related papers (2022-04-25T10:42:50Z)
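Several entries above describe guided retrieval loops; as one concrete illustration, the greedy graph walk from the Reranker-Guided-Search (RGS) entry can be sketched as follows. The adjacency graph, scoring function, and expansion budget are toy assumptions for the sketch, not the paper's actual components.

```python
# Minimal sketch of reranker-guided search over a proximity graph: instead of
# reranking a fixed top-k list, greedily expand the neighbours of whichever
# candidate the reranker currently scores highest.

import heapq

def reranker_guided_search(graph, score, seeds, budget=10):
    # graph: node -> list of neighbour nodes (e.g. from an ANN index)
    # score: node -> reranker relevance score (higher is better)
    frontier = [(-score(n), n) for n in seeds]  # max-heap via negated scores
    heapq.heapify(frontier)
    visited = set(seeds)
    results = []
    while frontier and len(results) < budget:
        _, node = heapq.heappop(frontier)  # best-scoring unexpanded candidate
        results.append(node)
        for nb in graph.get(node, []):
            if nb not in visited:
                visited.add(nb)
                heapq.heappush(frontier, (-score(nb), nb))
    return results

# Toy example: the highest-scoring document "c" is not a seed and is only
# reachable through the graph, so a fixed top-k rerank of the seeds misses it.
graph = {"a": ["b"], "b": ["c"], "c": []}
scores = {"a": 0.2, "b": 0.5, "c": 0.9}
```

The point of the sketch is that reranker scores steer *which* part of the candidate space gets explored, rather than only reordering a fixed list.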
This list is automatically generated from the titles and abstracts of the papers in this site.