Rerank Before You Reason: Analyzing Reranking Tradeoffs through Effective Token Cost in Deep Search Agents
- URL: http://arxiv.org/abs/2601.14224v1
- Date: Tue, 20 Jan 2026 18:38:35 GMT
- Title: Rerank Before You Reason: Analyzing Reranking Tradeoffs through Effective Token Cost in Deep Search Agents
- Authors: Sahel Sharifymoghaddam, Jimmy Lin
- Abstract summary: We study how to allocate reasoning budget in deep search pipelines. Using the BrowseComp-Plus benchmark, we analyze tradeoffs between model scale, reasoning effort, reranking depth, and total token cost.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep research agents rely on iterative retrieval and reasoning to answer complex queries, but scaling test-time computation raises significant efficiency concerns. We study how to allocate reasoning budget in deep search pipelines, focusing on the role of listwise reranking. Using the BrowseComp-Plus benchmark, we analyze tradeoffs between model scale, reasoning effort, reranking depth, and total token cost via a novel effective token cost (ETC) metric. Our results show that reranking consistently improves retrieval and end-to-end accuracy, and that moderate reranking often yields larger gains than increasing search-time reasoning, achieving comparable accuracy at substantially lower cost. All our code is available at https://github.com/texttron/BrowseComp-Plus.git
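The abstract introduces an effective token cost (ETC) metric for comparing pipeline configurations, but does not reproduce its formula. The sketch below is a hypothetical reading, assuming ETC is taken as total tokens consumed per correctly answered query; the function name, numbers, and the two example configurations are illustrative, not the authors' definitions.

```python
def effective_token_cost(total_tokens: int, num_queries: int, accuracy: float) -> float:
    """Tokens spent per correctly answered query (assumed ETC proxy,
    not necessarily the paper's exact definition)."""
    if accuracy <= 0:
        return float("inf")
    correct_answers = num_queries * accuracy
    return total_tokens / correct_answers

# Hypothetical comparison: heavy search-time reasoning without reranking
# vs. moderate reranking with a lighter reasoning budget.
no_rerank = effective_token_cost(total_tokens=50_000_000, num_queries=1000, accuracy=0.30)
with_rerank = effective_token_cost(total_tokens=30_000_000, num_queries=1000, accuracy=0.33)
```

Under this reading, a configuration that spends fewer total tokens while holding or improving accuracy yields a lower ETC, which is the sense in which moderate reranking can achieve comparable accuracy at substantially lower cost.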
Related papers
- Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization [64.61432234404276]
Search More, Think Less (SMTL) is a framework for long-horizon agentic search that targets both efficiency and generalization. We train an end-to-end agent using supervised fine-tuning and reinforcement learning, achieving strong and often state-of-the-art performance across benchmarks.
arXiv Detail & Related papers (2026-02-26T06:46:41Z) - TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework [62.66056331998838]
TeaRAG is a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps. Our reward function evaluates knowledge sufficiency via a knowledge matching mechanism, while penalizing excessive reasoning steps.
arXiv Detail & Related papers (2025-11-07T16:08:34Z) - Your Dense Retriever is Secretly an Expeditious Reasoner [12.123445960145693]
We propose Adaptive Query Reasoning (AdaQR), a hybrid query rewriting framework. AdaQR reduces reasoning cost by 28% while preserving, or even improving, retrieval performance by 7%.
arXiv Detail & Related papers (2025-09-27T16:50:03Z) - Retrieval-of-Thought: Efficient Reasoning via Reusing Thoughts [6.845529733164892]
We propose Retrieval-of-Thought (RoT), which reuses prior reasoning as composable "thought" steps to guide new problems. RoT organizes steps into a thought graph with sequential and semantic edges to enable fast retrieval and flexible recombination. We evaluate RoT on reasoning benchmarks with multiple models, measuring accuracy, token usage, latency, and memory overhead.
arXiv Detail & Related papers (2025-09-26T01:17:35Z) - FrugalRAG: Learning to retrieve and reason for multi-hop QA [10.193015391271535]
Large-scale fine-tuning is not needed to improve RAG metrics. Supervised and RL-based fine-tuning can help RAG from the perspective of frugality.
arXiv Detail & Related papers (2025-07-10T11:02:13Z) - A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings [60.48717743667377]
A*-Thought is an efficient tree search-based unified framework designed to identify and isolate the most essential thoughts. It formulates the reasoning process of LRMs as a search tree, where each node represents a reasoning span in the giant reasoning space. It can improve the performance of QwQ-32B by 2.39x with a low budget and reduce output token length by nearly 50% with a high budget.
arXiv Detail & Related papers (2025-05-30T12:58:34Z) - Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls [83.89771461061903]
Recent advancements in tree search algorithms guided by verifiers have significantly enhanced the reasoning capabilities of large language models (LLMs). We identify two key challenges contributing to this inefficiency: over-exploration due to redundant states with semantically equivalent content, and under-exploration caused by high variance in verifier scoring. We propose FETCH, a flexible, plug-and-play system compatible with various tree search algorithms.
arXiv Detail & Related papers (2025-02-16T16:12:01Z) - Self-Evaluation Guided Beam Search for Reasoning [61.523627290397556]
We introduce a stepwise self-evaluation mechanism to guide and calibrate the reasoning process of Large Language Models (LLMs).
We propose a decoding algorithm integrating the self-evaluation guidance via beam search.
Our approach surpasses the corresponding Codex-backboned baselines in few-shot accuracy by 6.34%, 9.56%, and 5.46% on the GSM8K, AQuA, and StrategyQA benchmarks, respectively.
arXiv Detail & Related papers (2023-05-01T02:37:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.