Related papers: Retrieval-in-the-Chain: Bootstrapping Large Language Models for Generative Retrieval

URL: http://arxiv.org/abs/2510.13095v2
Date: Tue, 21 Oct 2025 05:27:45 GMT
Title: Retrieval-in-the-Chain: Bootstrapping Large Language Models for Generative Retrieval
Authors: Yingchen Zhang, Ruqing Zhang, Jiafeng Guo, Wenjun Peng, Sen Li, Fuyu Lv,
Abstract summary: We propose Reason-for-Retrieval (R4R), a reasoning-augmented framework for generative retrieval (GR). R4R converts free-form chain-of-thought (CoT) reasoning into a compact, structured format, and iteratively refines the reasoning during the retrieval process. Extensive experiments on Natural Questions, MS MARCO, and a real-world item-search benchmark validate the effectiveness of R4R.
Score: 40.35703097974511
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generative retrieval (GR) is an emerging paradigm that leverages large language models (LLMs) to autoregressively generate document identifiers (docids) relevant to a given query. Prior works have focused on leveraging the generative capabilities of LLMs to improve GR, while overlooking that their reasoning capabilities could likewise help. This raises a key question: Can explicit reasoning benefit GR? To investigate, we first conduct a preliminary study where an LLM is prompted to generate free-form chain-of-thought (CoT) reasoning before performing constrained docid decoding. Although this method outperforms standard GR, the generated reasoning tends to be verbose and poorly aligned with the docid space. These limitations motivate the development of a reasoning mechanism better tailored to GR. Therefore, we propose Reason-for-Retrieval (R4R), a reasoning-augmented framework for GR that converts free-form CoT reasoning into a compact, structured format, and iteratively refines the reasoning during the retrieval process. R4R augments an existing GR method by leveraging a reasoning-capable LLM that has been instruction-tuned for GR. At inference time, R4R first uses the LLM to generate an initial structured reasoning; then the same LLM alternates between (i) constrained decoding with the chosen GR method to produce candidate docids and (ii) updating the reasoning based on retrieval results to improve the next round. R4R does not require additional models or training, and instead a single LLM serves as both the reasoning generator and the retriever. Extensive experiments on Natural Questions, MS MARCO, and a real-world item-search benchmark validate the effectiveness of R4R.

Related papers

Reinforced Efficient Reasoning via Semantically Diverse Exploration [73.41112984160992]
Reinforcement learning with verifiable rewards (RLVR) has proven effective in enhancing the reasoning of large language models (LLMs). We propose reinforced efficient reasoning via semantically diverse explorations, i.e., ROSE, for LLMs. Our method incorporates a semantic-entropy-based branching strategy and an $\varepsilon$-exploration mechanism.
arXiv Detail & Related papers (2026-01-08T15:56:44Z)
QUESTER: Query Specification for Generative Retrieval [28.47849228972565]
Generative Retrieval (GR) differs from the traditional index-then-retrieve pipeline by storing relevance in model parameters. We introduce QUESTER (QUEry SpecificaTion gEnerative Retrieval), which reframes GR as query specification generation.
arXiv Detail & Related papers (2025-11-07T15:01:38Z)
Rethinking On-policy Optimization for Query Augmentation [49.87723664806526]
We present the first systematic comparison of prompting-based and RL-based query augmentation across diverse benchmarks. We introduce a novel hybrid method, On-policy Pseudo-document Query Expansion (OPQE), which learns to generate a pseudo-document that maximizes retrieval performance.
arXiv Detail & Related papers (2025-10-20T04:16:28Z)
ZeroGR: A Generalizable and Scalable Framework for Zero-Shot Generative Retrieval [125.19156877994612]
Generative retrieval (GR) reformulates information retrieval (IR) by framing it as the generation of document identifiers (docids). We propose ZeroGR, a zero-shot generative retrieval framework that leverages natural language instructions to extend GR across a wide range of IR tasks. Specifically, ZeroGR is composed of three key components: (i) an LM-based docid generator that unifies heterogeneous documents into semantically meaningful docids; (ii) an instruction-tuned query generator that generates diverse types of queries from natural language task descriptions to enhance
arXiv Detail & Related papers (2025-10-12T03:04:24Z)
R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning [62.742230250513025]
Retrieval-Augmented Generation (RAG) integrates external knowledge with Large Language Models (LLMs) to enhance factual correctness and mitigate hallucination. We propose R3-RAG, which uses Reinforcement learning to make the LLM learn how to Reason and Retrieve step by step, thus retrieving comprehensive external knowledge and leading to correct answers.
arXiv Detail & Related papers (2025-05-26T12:25:37Z)
Alleviating LLM-based Generative Retrieval Hallucination in Alipay Search [14.769809465812587]
Generative retrieval (GR) has revolutionized document retrieval with the advent of large language models (LLMs). We propose an optimized GR framework designed to alleviate retrieval hallucination. We employ LLMs to assess and reason about GR-retrieved query-document (q-d) pairs, and then distill the reasoning data as transferred knowledge to the GR model.
arXiv Detail & Related papers (2025-03-27T02:36:48Z)
MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search [61.11836311160951]
We introduce MCTS-RAG, a novel approach that enhances the reasoning capabilities of small language models on knowledge-intensive tasks. Unlike standard RAG methods, which typically retrieve information independently from reasoning, MCTS-RAG combines structured reasoning with adaptive retrieval. This integrated approach enhances decision-making, reduces hallucinations, and ensures improved factual accuracy and response consistency.
arXiv Detail & Related papers (2025-03-26T17:46:08Z)
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning [50.419872452397684]
Search-R1 is an extension of reinforcement learning for reasoning frameworks. It generates search queries during step-by-step reasoning with real-time retrieval. It improves performance by 41% (Qwen2.5-7B) and 20% (Qwen2.5-3B) over various RAG baselines.
arXiv Detail & Related papers (2025-03-12T16:26:39Z)
Large Language Model Can Be a Foundation for Hidden Rationale-Based Retrieval [12.83513794686623]
In this paper, we propose and study a more challenging type of retrieval task, called hidden rationale retrieval. To address such problems, an instruction-tuned large language model (LLM) with a cross-encoder architecture could be a reasonable choice. We name this retrieval framework RaHoRe and verify its zero-shot and fine-tuned performance superiority on Emotional Support Conversation (ESC).
arXiv Detail & Related papers (2024-12-21T13:19:15Z)
RAG-Star: Enhancing Deliberative Reasoning with Retrieval Augmented Verification and Refinement [85.08223786819532]
Existing large language models (LLMs) show exceptional problem-solving capabilities but might struggle with complex reasoning tasks. We propose RAG-Star, a novel RAG approach that integrates retrieved information to guide the tree-based deliberative reasoning process. Our experiments involving Llama-3.1-8B-Instruct and GPT-4o demonstrate that RAG-Star significantly outperforms previous RAG and reasoning methods.
arXiv Detail & Related papers (2024-12-17T13:05:36Z)
RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models [13.478123641238277]
RARE (Retrieval-Augmented Reasoning Enhancement) is a versatile extension to the mutual reasoning framework (rStar). It aims at enhancing reasoning accuracy and factual integrity across large language models (LLMs) for complex, knowledge-intensive tasks such as commonsense and medical reasoning.
arXiv Detail & Related papers (2024-12-03T20:52:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.