TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework
- URL: http://arxiv.org/abs/2511.05385v1
- Date: Fri, 07 Nov 2025 16:08:34 GMT
- Title: TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework
- Authors: Chao Zhang, Yuhao Wang, Derong Xu, Haoxin Zhang, Yuanjie Lyu, Yuhao Chen, Shuochen Liu, Tong Xu, Xiangyu Zhao, Yan Gao, Yao Hu, Enhong Chen,
- Abstract summary: TeaRAG is a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps.<n>Our reward function evaluates the knowledge sufficiency by a knowledge matching mechanism, while penalizing excessive reasoning steps.
- Score: 62.66056331998838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Retrieval-Augmented Generation (RAG) utilizes external knowledge to augment Large Language Models' (LLMs) reliability. For flexibility, agentic RAG employs autonomous, multi-round retrieval and reasoning to resolve queries. Although recent agentic RAG has improved via reinforcement learning, they often incur substantial token overhead from search and reasoning processes. This trade-off prioritizes accuracy over efficiency. To address this issue, this work proposes TeaRAG, a token-efficient agentic RAG framework capable of compressing both retrieval content and reasoning steps. 1) First, the retrieved content is compressed by augmenting chunk-based semantic retrieval with a graph retrieval using concise triplets. A knowledge association graph is then built from semantic similarity and co-occurrence. Finally, Personalized PageRank is leveraged to highlight key knowledge within this graph, reducing the number of tokens per retrieval. 2) Besides, to reduce reasoning steps, Iterative Process-aware Direct Preference Optimization (IP-DPO) is proposed. Specifically, our reward function evaluates the knowledge sufficiency by a knowledge matching mechanism, while penalizing excessive reasoning steps. This design can produce high-quality preference-pair datasets, supporting iterative DPO to improve reasoning conciseness. Across six datasets, TeaRAG improves the average Exact Match by 4% and 2% while reducing output tokens by 61% and 59% on Llama3-8B-Instruct and Qwen2.5-14B-Instruct, respectively. Code is available at https://github.com/Applied-Machine-Learning-Lab/TeaRAG.
Related papers
- ProGraph-R1: Progress-aware Reinforcement Learning for Graph Retrieval Augmented Generation [37.11787010202267]
We propose ProGraph-R1, a progress-aware agentic framework for graph-based retrieval and multi-step reasoning.<n>ProGraph-R1 introduces a structure-aware hypergraph retrieval mechanism that jointly considers semantic relevance and graph connectivity.<n> Experiments on multi-hop question answering benchmarks demonstrate that ProGraph-R1 consistently improves reasoning accuracy and generation quality over existing GraphRAG methods.
arXiv Detail & Related papers (2026-01-25T08:58:44Z) - PROPEX-RAG: Enhanced GraphRAG using Prompt-Driven Prompt Execution [4.1390735746263685]
We present a prompt-driven GraphRAG framework that underscores the significance of prompt formulation in facilitating entity extraction, fact selection, and passage reranking.<n>Our system gets state-of-the-art performance on HotpotQA and 2WikiMultiHopQA, with F1 scores of 80.7% and 78.9%, and Recall@5 scores of 97.1% and 98.1%, respectively.
arXiv Detail & Related papers (2025-11-03T18:00:56Z) - GRIL: Knowledge Graph Retrieval-Integrated Learning with Large Language Models [59.72897499248909]
We propose a novel graph retriever trained end-to-end with Large Language Models (LLMs)<n>Within the extracted subgraph, structural knowledge and semantic features are encoded via soft tokens and the verbalized graph, respectively, which are infused into the LLM together.<n>Our approach consistently achieves state-of-the-art performance, validating the strength of joint graph-LLM optimization for complex reasoning tasks.
arXiv Detail & Related papers (2025-09-20T02:38:00Z) - Clue-RAG: Towards Accurate and Cost-Efficient Graph-based RAG via Multi-Partite Graph and Query-Driven Iterative Retrieval [15.599544326509436]
Retrieval-Augmented Generation (RAG) addresses the limitation by incorporating external information, often from graph-structured data.<n>We propose Clue-RAG, a novel approach that introduces a multi-partite graph index and a query-driven iterative retrieval strategy.<n>Experiments on three QA benchmarks show that Clue-RAG significantly outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2025-07-11T09:36:45Z) - FrugalRAG: Learning to retrieve and reason for multi-hop QA [10.193015391271535]
Large-scale fine-tuning is not needed to improve RAG metrics.<n>Supervised and RL-based fine-tuning can help RAG from the perspective of frugality.
arXiv Detail & Related papers (2025-07-10T11:02:13Z) - Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains.<n>Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning.<n>Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering.<n>We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z) - Reinforcing Video Reasoning with Focused Thinking [65.85683941058916]
We propose TW-GRPO, a novel framework that enhances visual reasoning with focused thinking and dense reward granularity.<n>Specifically, we employ a token weighting mechanism that prioritizes tokens with high informational density.<n>We also reformulate RL training by shifting from single-choice to multi-choice QA tasks.
arXiv Detail & Related papers (2025-05-30T15:42:19Z) - Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [79.75818239774952]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information.<n>Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system.<n>We propose Align-GRAG, a novel reasoning-guided dual alignment framework in post-retrieval phrase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z) - Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations with question-driven semantic graph partitioning and collaborative subgraph retrieval.<n>The innovative framework first create Semantic Partitioning of Linked Information, then use the Type-Specialized knowledge base to achieve Multi-Agent RAG.<n>The attribute-aware graph segmentation manages to divide knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types.<n>A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verifications.
arXiv Detail & Related papers (2025-05-20T06:44:34Z) - ROGRAG: A Robustly Optimized GraphRAG Framework [45.947928801693266]
Graph-based retrieval-augmented generation (GraphRAG) addresses this by structuring domain knowledge as a graph for dynamic retrieval.<n>Existing pipelines involve complex engineering, making it difficult to isolate the impact of individual components.<n>We introduce ROGRAG, a Robustly Optimized GraphRAG framework, which integrates dual-level with logic form retrieval methods to improve robustness without increasing computational cost.
arXiv Detail & Related papers (2025-03-09T06:20:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.