PECAN: LLM-Guided Dynamic Progress Control with Attention-Guided Hierarchical Weighted Graph for Long-Document QA
- URL: http://arxiv.org/abs/2410.04790v2
- Date: Sun, 08 Jun 2025 14:53:35 GMT
- Title: PECAN: LLM-Guided Dynamic Progress Control with Attention-Guided Hierarchical Weighted Graph for Long-Document QA
- Authors: Xinyu Wang, Yanzheng Xiang, Lin Gui, Yulan He,
- Abstract summary: Long-document QA presents challenges with large-scale text and long-distance dependencies.<n>Recent advances in Large Language Models (LLMs) enable entire documents to be processed in a single pass.<n>We propose a novel retrieval method that constructs a hierarchical graph where edges are derived by LLM attention weights.
- Score: 16.945257645760428
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Long-document QA presents challenges with large-scale text and long-distance dependencies. Recent advances in Large Language Models (LLMs) enable entire documents to be processed in a single pass. However, their computational cost is significantly high. Retrieval-Augmented Generation (RAG) methods split text into smaller chunks, but they often yield inferior results and may lose global context. Recent approaches that integrate LLMs into RAG via iterative summarization either underutilize LLM capabilities or still incur high computational costs. In this paper, we combine the high accuracy of LLMs with the efficiency of RAG and propose LLM-Guided Dynamic Progress Control with Attention-Based Hierarchical Weighted Graph (PECAN). Our method introduces two key improvements: (1) LLM-Guided Dynamic Progress Control: We leverage LLMs to dynamically control the retrieval process, adjusting the amount of retrieved information based on different queries to achieve a better balance of effectiveness and efficiency. (2) Attention-Guided Retrieval: We propose a novel retrieval method that constructs a hierarchical graph where edges are derived by LLM attention weights. Experimental results demonstrate that PECAN achieves LLM-level performance while maintaining computational complexity comparable to that of RAG methods on two single-document and two multi-document QA datasets.
Related papers
- Harnessing the Power of Reinforcement Learning for Language-Model-Based Information Retriever via Query-Document Co-Augmentation [35.70731674603417]
Large Language Models (LLMs) can be used to augment both user queries and corpus documents.<n>We present an LLM-based retriever empowered to augment both user queries and corpus documents.<n>Our approach significantly enhances LLM-based retrieval performance in both sparse and dense settings.
arXiv Detail & Related papers (2025-06-23T14:14:43Z) - Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques [14.892995952768352]
Language Models (LMs) have excelled at tasks like text generation, summarization, and question answering.<n>Their inference remains computationally expensive and energy intensive in settings with limited hardware, power, or bandwidth.<n>Recent approaches have introduced multi LLM intelligent model selection strategies that dynamically allocate computational resources based on query complexity.
arXiv Detail & Related papers (2025-06-06T23:13:08Z) - Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers [74.17516978246152]
Large language models (LLMs) have been widely integrated into information retrieval to advance traditional techniques.<n>We propose EXSEARCH, an agentic search framework, where the LLM learns to retrieve useful information as the reasoning unfolds.<n>Experiments on four knowledge-intensive benchmarks show that EXSEARCH substantially outperforms baselines.
arXiv Detail & Related papers (2025-05-26T15:27:55Z) - Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search [65.53881294642451]
Deliberate Thinking based Dense Retriever (DEBATER)
DEBATER enhances recent dense retrievers by enabling them to learn more effective document representations through a step-by-step thinking process.
Experimental results show that DEBATER significantly outperforms existing methods across several retrieval benchmarks.
arXiv Detail & Related papers (2025-02-18T15:56:34Z) - PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths [42.01377074786958]
Retrieval-augmented generation (RAG) improves the response quality of large language models (LLMs) by retrieving knowledge from external databases.
We propose PathRAG, which retrieves key relational paths from the indexing graph, and converts these paths into textual form for prompting LLMs.
PathRAG consistently outperforms state-of-the-art baselines across six datasets and five evaluation dimensions.
arXiv Detail & Related papers (2025-02-18T11:18:55Z) - LaRA: Benchmarking Retrieval-Augmented Generation and Long-Context LLMs -- No Silver Bullet for LC or RAG Routing [70.35888047551643]
We present LaRA, a novel benchmark specifically designed to rigorously compare RAG and LC LLMs.<n>LaRA encompasses 2326 test cases across four practical QA task categories and three types of naturally occurring long texts.<n>We find that the optimal choice between RAG and LC depends on a complex interplay of factors, including the model's parameter size, long-text capabilities, context length, task type, and the characteristics of the retrieved chunks.
arXiv Detail & Related papers (2025-02-14T08:04:22Z) - Multiple Abstraction Level Retrieve Augment Generation [4.516242893120263]
A Retrieval-Augmented Generation (RAG) model powered by a large language model (LLM) provides a faster and more cost-effective solution for adapting to new data and knowledge.
We propose a novel RAG approach that uses chunks of multiple abstraction levels (MAL), including multi-sentence-level, paragraph-level, section-level, and document-level.
Compared to traditional single-level RAG approaches, our approach improves AI evaluated answer correctness of Q/A by 25.739% on Glyco-related papers.
arXiv Detail & Related papers (2025-01-28T13:49:39Z) - Invar-RAG: Invariant LLM-aligned Retrieval for Better Generation [43.630437906898635]
We propose a novel two-stage fine-tuning architecture called Invar-RAG.
In the retrieval stage, an LLM-based retriever is constructed by integrating LoRA-based representation learning.
In the generation stage, a refined fine-tuning method is employed to improve LLM accuracy in generating answers based on retrieved information.
arXiv Detail & Related papers (2024-11-11T14:25:37Z) - Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs [12.878608250420832]
We propose textitgraph of records (textbfGoR) to enhance RAG for long-context global summarization.
Inspired by the textitretrieve-then-generate paradigm of RAG, we construct a graph by establishing an edge between the retrieved text chunks and the corresponding LLM-generated response.
To further uncover the intricate correlations between them, GoR features a textitgraph neural network and an elaborately designed textitBERTScore-based objective for self-supervised model training.
arXiv Detail & Related papers (2024-10-14T18:34:29Z) - VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents [66.42579289213941]
Retrieval-augmented generation (RAG) is an effective technique that enables large language models to utilize external knowledge sources for generation.
In this paper, we introduce VisRAG, which tackles this issue by establishing a vision-language model (VLM)-based RAG pipeline.
In this pipeline, instead of first parsing the document to obtain text, the document is directly embedded using a VLM as an image and then retrieved to enhance the generation of a VLM.
arXiv Detail & Related papers (2024-10-14T15:04:18Z) - Less is More: Making Smaller Language Models Competent Subgraph Retrievers for Multi-hop KGQA [51.3033125256716]
We model the subgraph retrieval task as a conditional generation task handled by small language models.
Our base generative subgraph retrieval model, consisting of only 220M parameters, competitive retrieval performance compared to state-of-the-art models.
Our largest 3B model, when plugged with an LLM reader, sets new SOTA end-to-end performance on both the WebQSP and CWQ benchmarks.
arXiv Detail & Related papers (2024-10-08T15:22:36Z) - Control Large Language Models via Divide and Conquer [94.48784966256463]
This paper investigates controllable generation for large language models (LLMs) with prompt-based control, focusing on Lexically Constrained Generation (LCG)
We evaluate the performance of LLMs on satisfying lexical constraints with prompt-based control, as well as their efficacy in downstream applications.
arXiv Detail & Related papers (2024-10-06T21:20:06Z) - PAGED: A Benchmark for Procedural Graphs Extraction from Documents [14.769780429159912]
We propose a new benchmark PAGED equipped with a large high-quality dataset and standard evaluations.
It investigates five state-of-the-art baselines, revealing that they fail to extract optimal procedural graphs well because of their heavy reliance on hand-written rules and limited available data.
The results point out the advantages of LLMs in identifying textual elements and their gaps in building logical structures.
arXiv Detail & Related papers (2024-08-07T08:43:18Z) - DARG: Dynamic Evaluation of Large Language Models via Adaptive Reasoning Graph [70.79413606968814]
We introduce Dynamic Evaluation of LLMs via Adaptive Reasoning Graph Evolvement (DARG) to dynamically extend current benchmarks with controlled complexity and diversity.
Specifically, we first extract the reasoning graphs of data points in current benchmarks and then perturb the reasoning graphs to generate novel testing data.
Such newly generated test samples can have different levels of complexity while maintaining linguistic diversity similar to the original benchmarks.
arXiv Detail & Related papers (2024-06-25T04:27:53Z) - Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge Graphs [59.76268575344119]
We introduce a novel framework for enhancing large language models' (LLMs) planning capabilities by using planning data derived from knowledge graphs (KGs)
LLMs fine-tuned with KG data have improved planning capabilities, better equipping them to handle complex QA tasks that involve retrieval.
arXiv Detail & Related papers (2024-06-20T13:07:38Z) - LinkGPT: Teaching Large Language Models To Predict Missing Links [23.57145845001286]
Large Language Models (LLMs) have shown promising results on various language and vision tasks.
Recently, there has been growing interest in applying LLMs to graph-based tasks, particularly on Text-Attributed Graphs (TAGs)
arXiv Detail & Related papers (2024-06-07T04:54:36Z) - Document-Level In-Context Few-Shot Relation Extraction via Pre-Trained Language Models [29.94694305204144]
We present a novel framework for document-level in-context few-shot relation extraction.
We evaluate our framework using DocRED, the largest publicly available dataset for document-level relation extraction.
arXiv Detail & Related papers (2023-10-17T09:10:27Z) - Zero-Shot Listwise Document Reranking with a Large Language Model [58.64141622176841]
We propose Listwise Reranker with a Large Language Model (LRL), which achieves strong reranking effectiveness without using any task-specific training data.
Experiments on three TREC web search datasets demonstrate that LRL not only outperforms zero-shot pointwise methods when reranking first-stage retrieval results, but can also act as a final-stage reranker.
arXiv Detail & Related papers (2023-05-03T14:45:34Z) - Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes [54.13559879916708]
EVAPORATE is a prototype system powered by large language models (LLMs)<n>Code synthesis is cheap, but far less accurate than directly processing each document with the LLM.<n>We propose an extended code implementation, EVAPORATE-CODE+, which achieves better quality than direct extraction.
arXiv Detail & Related papers (2023-04-19T06:00:26Z) - Towards Better Dynamic Graph Learning: New Architecture and Unified
Library [29.625205125350313]
DyGFormer is a Transformer-based architecture for dynamic graph learning.
DyGLib is a unified library with standard training pipelines and coding interfaces.
arXiv Detail & Related papers (2023-03-23T05:27:32Z) - Query2doc: Query Expansion with Large Language Models [69.9707552694766]
The proposed method first generates pseudo- documents by few-shot prompting large language models (LLMs)
query2doc boosts the performance of BM25 by 3% to 15% on ad-hoc IR datasets.
Our method also benefits state-of-the-art dense retrievers in terms of both in-domain and out-of-domain results.
arXiv Detail & Related papers (2023-03-14T07:27:30Z) - DORE: Document Ordered Relation Extraction based on Generative Framework [56.537386636819626]
This paper investigates the root cause of the underwhelming performance of the existing generative DocRE models.
We propose to generate a symbolic and ordered sequence from the relation matrix which is deterministic and easier for model to learn.
Experimental results on four datasets show that our proposed method can improve the performance of the generative DocRE models.
arXiv Detail & Related papers (2022-10-28T11:18:10Z) - Autoregressive Search Engines: Generating Substrings as Document
Identifiers [53.0729058170278]
Autoregressive language models are emerging as the de-facto standard for generating answers.
Previous work has explored ways to partition the search space into hierarchical structures.
In this work we propose an alternative that doesn't force any structure in the search space: using all ngrams in a passage as its possible identifiers.
arXiv Detail & Related papers (2022-04-22T10:45:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.