Optimizing Multi-Hop Document Retrieval Through Intermediate Representations
- URL: http://arxiv.org/abs/2503.04796v2
- Date: Sat, 31 May 2025 09:02:25 GMT
- Title: Optimizing Multi-Hop Document Retrieval Through Intermediate Representations
- Authors: Jiaen Lin, Jingyu Liu, Yingbo Liu
- Abstract summary: Retrieval-augmented generation (RAG) encounters challenges when addressing complex queries, particularly multi-hop questions. We propose Layer-wise RAG (L-RAG), which leverages intermediate representations from the middle layers, which capture next-hop information, to retrieve external knowledge. Experimental results show that L-RAG outperforms existing RAG methods on open-domain multi-hop question-answering datasets.
- Score: 1.99038892363306
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Retrieval-augmented generation (RAG) encounters challenges when addressing complex queries, particularly multi-hop questions. While several methods tackle multi-hop queries by iteratively generating internal queries and retrieving external documents, these approaches are computationally expensive. In this paper, we identify a three-stage information processing pattern in LLMs during layer-by-layer reasoning, consisting of extraction, processing, and subsequent extraction steps. This observation suggests that the representations in intermediate layers contain richer information than those in other layers. Building on this insight, we propose Layer-wise RAG (L-RAG). Unlike prior methods that focus on generating new internal queries, L-RAG leverages intermediate representations from the middle layers, which capture next-hop information, to retrieve external knowledge. L-RAG achieves performance comparable to multi-step approaches while maintaining inference overhead similar to that of standard RAG. Experimental results show that L-RAG outperforms existing RAG methods on open-domain multi-hop question-answering datasets, including MuSiQue, HotpotQA, and 2WikiMultiHopQA. The code is available at https://anonymous.4open.science/r/L-RAG-ADD5/
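The idea of retrieving with a middle-layer representation, as described in the abstract, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the choice of the exact middle layer, last-token pooling, and the assumption that document embeddings already live in the same vector space as the hidden states are all hypothetical simplifications, and the hidden states here are random arrays rather than outputs of a real LLM.

```python
import numpy as np

def middle_layer_query(hidden_states, layer=None):
    """Pick a query vector from a stack of per-layer hidden states.

    hidden_states: array of shape (num_layers, seq_len, dim).
    layer: index of the layer to read; defaults to the middle one
           (an assumption, the abstract only says "middle layers").
    Returns the last-token vector of that layer (pooling is an assumption).
    """
    num_layers = hidden_states.shape[0]
    if layer is None:
        layer = num_layers // 2
    return hidden_states[layer, -1]

def retrieve(query_vec, doc_matrix, k=2):
    """Return indices and cosine scores of the k most similar documents."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    scores = d @ q
    top = np.argsort(-scores)[:k]
    return top.tolist(), scores[top].tolist()

# Toy data standing in for real model activations and a document index.
rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(8, 5, 16))   # 8 layers, 5 tokens, dim 16
docs = rng.normal(size=(10, 16))              # 10 hypothetical document vectors

# Plant one document close to the middle-layer vector so the lookup has a clear winner.
docs[3] = middle_layer_query(hidden_states) + 0.01 * rng.normal(size=16)

top_ids, top_scores = retrieve(middle_layer_query(hidden_states), docs, k=2)
print(top_ids, top_scores)
```

The point of the sketch is only that a single forward pass yields the query vector, so retrieval needs no extra query-generation step; how the hidden state is mapped into the retriever's space in practice is left open here.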
Related papers
- Comprehensive Comparison of RAG Methods Across Multi-Domain Conversational QA [18.46710400838861]
This paper addresses the lack of a systematic comparison of RAG methods for multi-turn conversational QA. We present a comprehensive empirical study of vanilla and advanced RAG methods across eight diverse conversational QA datasets. Our results show that robust yet straightforward methods, such as reranking, hybrid BM25, and HyDE, consistently outperform vanilla RAG.
arXiv Detail & Related papers (2026-02-10T08:59:23Z)
- Q-RAG: Long Context Multi-step Retrieval via Value-based Embedder Training [50.37345200692884]
We propose Q-RAG, a novel approach that fine-tunes the embedder model for multi-step retrieval using reinforcement learning (RL). Q-RAG offers a competitive, resource-efficient alternative to existing multi-step retrieval methods for open-domain question answering.
arXiv Detail & Related papers (2025-11-10T17:31:02Z)
- Transforming Questions and Documents for Semantically Aligned Retrieval-Augmented Generation [1.223779595809275]
We introduce a novel retrieval-augmented generation (RAG) framework tailored for multihop question answering. Our system uses a large language model (LLM) to decompose complex multihop questions into a sequence of single-hop subquestions that guide document retrieval. Instead of embedding raw or chunked documents directly, we generate answerable questions from each document chunk using Qwen3-8B, embed these generated questions, and retrieve relevant chunks via question-question embedding similarity.
arXiv Detail & Related papers (2025-08-13T12:35:04Z)
- DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router [57.28685457991806]
DeepSieve is an agentic RAG framework that incorporates information sieving via LLM-as-a-knowledge-router. Our design emphasizes modularity, transparency, and adaptability, leveraging recent advances in agentic system design.
arXiv Detail & Related papers (2025-07-29T17:55:23Z)
- Hierarchical Lexical Graph for Enhanced Multi-Hop Retrieval [22.33550491040999]
RAG grounds large language models in external evidence, yet it still falters when answers must be pieced together across semantically distant documents. We build two plug-and-play retrievers: StatementGraphRAG and TopicGraphRAG. Our methods outperform naive chunk-based RAG, achieving an average relative improvement of 23.1% in retrieval recall and correctness.
arXiv Detail & Related papers (2025-06-09T17:58:35Z)
- R3-RAG: Learning Step-by-Step Reasoning and Retrieval for LLMs via Reinforcement Learning [62.742230250513025]
Retrieval-Augmented Generation (RAG) integrates external knowledge with Large Language Models (LLMs) to enhance factual correctness and mitigate hallucination. We propose R3-RAG, which uses Reinforcement learning to make the LLM learn how to Reason and Retrieve step by step, thus retrieving comprehensive external knowledge and leading to correct answers.
arXiv Detail & Related papers (2025-05-26T12:25:37Z)
- Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations with question-driven semantic graph partitioning and collaborative subgraph retrieval. The framework first creates a Semantic Partitioning of Linked Information, then uses the Type-Specialized knowledge base to achieve Multi-Agent RAG. The attribute-aware graph segmentation divides knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types. A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verifications.
arXiv Detail & Related papers (2025-05-20T06:44:34Z)
- LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs -- No Silver Bullet for LC or RAG Routing [70.35888047551643]
We present LaRA, a novel benchmark specifically designed to rigorously compare RAG and LC LLMs. LaRA encompasses 2326 test cases across four practical QA task categories and three types of naturally occurring long texts. We find that the optimal choice between RAG and LC depends on a complex interplay of factors, including the model's parameter size, long-text capabilities, context length, task type, and the characteristics of the retrieved chunks.
arXiv Detail & Related papers (2025-02-14T08:04:22Z)
- Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method [48.14236175156835]
ARM aims to better align the question with the organization of the data collection by exploring relationships among data objects. It outperforms standard RAG with query decomposition by up to 5.2 pt in execution accuracy and agentic RAG (ReAct) by up to 15.9 pt. It achieves up to 5.5 pt and 19.3 pt higher F1 match scores compared to these approaches.
arXiv Detail & Related papers (2025-01-30T18:07:19Z)
- Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent [92.57125498367907]
Multimodal Retrieval Augmented Generation (mRAG) plays an important role in mitigating the "hallucination" issue inherent in multimodal large language models (MLLMs).
We propose the first self-adaptive planning agent for multimodal retrieval, OmniSearch.
arXiv Detail & Related papers (2024-11-05T09:27:21Z)
- BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression [91.23933111083389]
Retrieval-augmented generation (RAG) can supplement large language models (LLMs) by integrating external knowledge. This paper presents BRIEF, a lightweight approach that performs query-aware multi-hop reasoning. Based on our synthetic data built entirely by open-source models, BRIEF generates more concise summaries.
arXiv Detail & Related papers (2024-10-20T04:24:16Z)
- Enhancing Long Context Performance in LLMs Through Inner Loop Query Mechanism [2.919891871101241]
Transformers have a quadratic scaling of computational complexity with input size.
Retrieval-augmented generation (RAG) can better handle longer contexts by using a retrieval system.
We introduce a novel approach, Inner Loop Memory Augmented Tree Retrieval (ILM-TR).
arXiv Detail & Related papers (2024-10-11T19:49:05Z)
- EfficientRAG: Efficient Retriever for Multi-Hop Question Answering [52.64500643247252]
We introduce EfficientRAG, an efficient retriever for multi-hop question answering.
Experimental results demonstrate that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question-answering datasets.
arXiv Detail & Related papers (2024-08-08T06:57:49Z)
- Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach [6.549143816134531]
We propose a novel iterative RAG method called ReSP, equipped with a dual-function summarizer. Experimental results on the multi-hop question-answering datasets HotpotQA and 2WikiMultihopQA demonstrate that our method significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2024-07-18T02:19:00Z)
- Multi-Head RAG: Solving Multi-Aspect Problems with LLMs [13.638439488923671]
Retrieval Augmented Generation (RAG) enhances the abilities of Large Language Models (LLMs).
Existing RAG solutions do not focus on queries that may require fetching multiple documents with substantially different contents.
This paper introduces Multi-Head RAG (MRAG), a novel scheme designed to address this gap with a simple yet powerful idea.
arXiv Detail & Related papers (2024-06-07T16:59:38Z)
- MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries [22.4349439498591]
Retrieval-augmented generation (RAG) augments large language models (LLM) by retrieving relevant knowledge.
Existing RAG systems are inadequate in answering multi-hop queries, which require retrieving and reasoning over multiple pieces of supporting evidence.
We develop a novel dataset, MultiHop-RAG, which consists of a knowledge base, a large collection of multi-hop queries, their ground-truth answers, and the associated supporting evidence.
arXiv Detail & Related papers (2024-01-27T11:41:48Z)
- Answering Any-hop Open-domain Questions with Iterative Document Reranking [62.76025579681472]
We propose a unified QA framework to answer any-hop open-domain questions.
Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)