Query-Efficient Agentic Graph Extraction Attacks on GraphRAG Systems
- URL: http://arxiv.org/abs/2601.14662v1
- Date: Wed, 21 Jan 2026 05:20:54 GMT
- Title: Query-Efficient Agentic Graph Extraction Attacks on GraphRAG Systems
- Authors: Shuhua Yang, Jiahao Zhang, Yilong Wang, Dongwon Lee, Suhang Wang,
- Abstract summary: Graph-based retrieval-augmented generation (GraphRAG) systems construct knowledge graphs over document collections to support multi-hop reasoning.<n>We study a budget-constrained black-box setting where an adversary adaptively queries the system to steal its latent entity-relation graph.<n>We propose AGEA, a framework that leverages a novelty-guided exploration-exploitation strategy, external graph memory modules, and a two-stage graph extraction pipeline.
- Score: 29.89127594311822
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Graph-based retrieval-augmented generation (GraphRAG) systems construct knowledge graphs over document collections to support multi-hop reasoning. While prior work shows that GraphRAG responses may leak retrieved subgraphs, the feasibility of query-efficient reconstruction of the hidden graph structure remains unexplored under realistic query budgets. We study a budget-constrained black-box setting where an adversary adaptively queries the system to steal its latent entity-relation graph. We propose AGEA (Agentic Graph Extraction Attack), a framework that leverages a novelty-guided exploration-exploitation strategy, external graph memory modules, and a two-stage graph extraction pipeline combining lightweight discovery with LLM-based filtering. We evaluate AGEA on medical, agriculture, and literary datasets across Microsoft-GraphRAG and LightRAG systems. Under identical query budgets, AGEA significantly outperforms prior attack baselines, recovering up to 90% of entities and relationships while maintaining high precision. These results demonstrate that modern GraphRAG systems are highly vulnerable to structured, agentic extraction attacks, even under strict query limits.
Related papers
- LinearRAG: Linear Graph Retrieval Augmented Generation on Large-scale Corpora [17.929144506419064]
Retrieval-Augmented Generation (RAG) is widely used to mitigate hallucinations of Large Language Models (LLMs) by leveraging external knowledge.<n>Existing graph-based RAG methods rely on unstable and costly relation extraction for graph construction.<n>We propose LinearRAG, an efficient framework that enables reliable graph construction and precise passage retrieval.
arXiv Detail & Related papers (2025-10-11T08:43:45Z) - G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge [88.82814893945077]
Large language models (LLMs) excel at complex reasoning but remain limited by static and incomplete parametric knowledge.<n>Recent graph-enhanced RAG (GraphRAG) attempts to bridge this gap by constructing tailored graphs and enabling LLMs to reason on them.<n>G-reasoner is a unified framework that integrates graph and language foundation models for reasoning over diverse graph-structured knowledge.
arXiv Detail & Related papers (2025-09-29T04:38:12Z) - Graph-R1: Towards Agentic GraphRAG Framework via End-to-end Reinforcement Learning [20.05893083101089]
Graph-R1 is an agentic GraphRAG framework via end-to-end reinforcement learning (RL)<n>It introduces lightweight knowledge hypergraph construction, models retrieval as a multi-turn agent-environment interaction.<n>Experiments on standard RAG datasets show that Graph-R1 outperforms traditional GraphRAG and RL-enhanced RAG methods in reasoning accuracy, retrieval efficiency, and generation quality.
arXiv Detail & Related papers (2025-07-29T15:01:26Z) - XGraphRAG: Interactive Visual Analysis for Graph-based Retrieval-Augmented Generation [16.068460356582648]
This research proposes a visual analysis framework that helps RAG developers identify critical recalls of GraphRAG.<n>We develop XGraphRAG, a prototype system incorporating a set of interactive visualizations to facilitate users' analysis process.
arXiv Detail & Related papers (2025-06-10T09:14:30Z) - When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation [31.930889441883732]
Graph retrieval-augmented generation (GraphRAG) has emerged as a powerful paradigm for enhancing large language models (LLMs) with external knowledge.<n>Recent studies report that GraphRAG frequently underperforms vanilla RAG on many real-world tasks.<n>This raises a critical question: Is GraphRAG really effective, and in which scenarios do graph structures provide measurable benefits for RAG systems?
arXiv Detail & Related papers (2025-06-06T02:37:47Z) - Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [79.75818239774952]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information.<n>Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system.<n>We propose Align-GRAG, a novel reasoning-guided dual alignment framework in post-retrieval phrase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z) - Divide by Question, Conquer by Agent: SPLIT-RAG with Question-Driven Graph Partitioning [62.640169289390535]
SPLIT-RAG is a multi-agent RAG framework that addresses the limitations with question-driven semantic graph partitioning and collaborative subgraph retrieval.<n>The innovative framework first create Semantic Partitioning of Linked Information, then use the Type-Specialized knowledge base to achieve Multi-Agent RAG.<n>The attribute-aware graph segmentation manages to divide knowledge graphs into semantically coherent subgraphs, ensuring subgraphs align with different query types.<n>A hierarchical merging module resolves inconsistencies across subgraph-derived answers through logical verifications.
arXiv Detail & Related papers (2025-05-20T06:44:34Z) - RGL: A Graph-Centric, Modular Framework for Efficient Retrieval-Augmented Generation on Graphs [58.10503898336799]
We introduce the RAG-on-Graphs Library (RGL), a modular framework that seamlessly integrates the complete RAG pipeline.<n>RGL addresses key challenges by supporting a variety of graph formats and integrating optimized implementations for essential components.<n>Our evaluations demonstrate that RGL not only accelerates the prototyping process but also enhances the performance and applicability of graph-based RAG systems.
arXiv Detail & Related papers (2025-03-25T03:21:48Z) - RAG vs. GraphRAG: A Systematic Evaluation and Key Insights [53.83444096699458]
We systematically evaluate Retrieval-Augmented Generation (RAG) and GraphRAG on text-based benchmarks.<n>Our results highlight the distinct strengths of RAG and GraphRAG across different tasks and evaluation perspectives.
arXiv Detail & Related papers (2025-02-17T02:36:30Z) - Retrieval-Augmented Generation with Graphs (GraphRAG) [84.29507404866257]
Retrieval-augmented generation (RAG) is a powerful technique that enhances downstream task execution by retrieving additional information.<n>Graph, by its intrinsic "nodes connected by edges" nature, encodes massive heterogeneous and relational information.<n>Unlike conventional RAG, the uniqueness of graph-structured data, such as diverse-formatted and domain-specific relational knowledge, poses unique and significant challenges when designing GraphRAG for different domains.
arXiv Detail & Related papers (2024-12-31T06:59:35Z) - Graph Retrieval-Augmented Generation: A Survey [28.979898837538958]
Retrieval-Augmented Generation (RAG) has achieved remarkable success in addressing the challenges of Large Language Models (LLMs) without necessitating retraining.
This paper provides the first comprehensive overview of GraphRAG methodologies.
We formalize the GraphRAG workflow, encompassing Graph-Based Indexing, Graph-Guided Retrieval, and Graph-Enhanced Generation.
arXiv Detail & Related papers (2024-08-15T12:20:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.