Related papers: Efficient Knowledge Graph Construction and Retrieval from Unstructured Text for Large-Scale RAG Systems

URL: http://arxiv.org/abs/2507.03226v1
Date: Fri, 04 Jul 2025 00:05:55 GMT
Title: Efficient Knowledge Graph Construction and Retrieval from Unstructured Text for Large-Scale RAG Systems
Authors: Congmin Min, Rhea Mathew, Joyce Pan, Sahil Bansal, Abbas Keshavarzi, Amar Viswanathan Kannan
Abstract summary: We propose a scalable and cost-efficient framework for deploying Graph-based Retrieval Augmented Generation (GraphRAG) in enterprise environments. These results validate the feasibility of deploying GraphRAG systems in real-world, large-scale enterprise applications.
Score: 0.06597195879147556
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We propose a scalable and cost-efficient framework for deploying Graph-based Retrieval Augmented Generation (GraphRAG) in enterprise environments. While GraphRAG has shown promise for multi-hop reasoning and structured retrieval, its adoption has been limited by the high computational cost of constructing knowledge graphs using large language models (LLMs) and the latency of graph-based retrieval. To address these challenges, we introduce two core innovations: (1) a dependency-based knowledge graph construction pipeline that leverages industrial-grade NLP libraries to extract entities and relations from unstructured text, completely eliminating reliance on LLMs; and (2) a lightweight graph retrieval strategy that combines hybrid query node identification with efficient one-hop traversal for high-recall, low-latency subgraph extraction. We evaluate our framework on two SAP datasets focused on legacy code migration and demonstrate strong empirical performance. Our system achieves up to 15% and 4.35% improvements over traditional RAG baselines based on LLM-as-Judge and RAGAS metrics, respectively. Moreover, our dependency-based construction approach attains 94% of the performance of LLM-generated knowledge graphs (61.87% vs. 65.83%) while significantly reducing cost and improving scalability. These results validate the feasibility of deploying GraphRAG systems in real-world, large-scale enterprise applications without incurring prohibitive resource requirements, paving the way for practical, explainable, and domain-adaptable retrieval-augmented reasoning.
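The one-hop traversal step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy triples, the function names `build_adjacency` and `one_hop_subgraph`, and the seed node are all hypothetical, and the seed nodes are assumed to have been identified already by some earlier step. The last lines simply recompute the 94% figure from the two scores quoted in the abstract.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, relation, object); not taken from the paper.
TRIPLES = [
    ("report_a", "calls", "function_b"),
    ("function_b", "reads", "table_c"),
    ("table_c", "replaced_by", "view_d"),
    ("report_e", "calls", "function_f"),
]


def build_adjacency(triples):
    """Index each triple under both of its endpoint nodes."""
    adjacency = defaultdict(list)
    for subj, rel, obj in triples:
        adjacency[subj].append((subj, rel, obj))
        adjacency[obj].append((subj, rel, obj))
    return adjacency


def one_hop_subgraph(adjacency, seed_nodes):
    """Return every triple touching a seed node, deduplicated, in first-seen order."""
    seen = set()
    result = []
    for node in seed_nodes:
        # .get avoids inserting empty entries for unknown nodes.
        for triple in adjacency.get(node, []):
            if triple not in seen:
                seen.add(triple)
                result.append(triple)
    return result


adjacency = build_adjacency(TRIPLES)
# Assumption: "function_b" stands in for a query node found by an earlier matching step.
subgraph = one_hop_subgraph(adjacency, ["function_b"])

# Ratio of the two scores quoted in the abstract (61.87% vs. 65.83%).
relative_performance = 61.87 / 65.83
```

Because each seed node only requires a lookup of its own adjacency list, the cost of this step grows with the number of seed nodes and their degree rather than with the size of the whole graph, which is consistent with the low-latency goal the abstract describes.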

Related papers

GraphRAG-R1: Graph Retrieval-Augmented Generation with Process-Constrained Reinforcement Learning [33.57411612551111]
We propose GraphRAG-R1, an adaptive GraphRAG framework that trains LLMs with process-constrained, outcome-based reinforcement learning (RL). Our method can decompose complex problems, autonomously invoke retrieval tools, and perform effective reasoning. Our framework can be flexibly integrated with various existing retrieval methods, consistently delivering performance improvements.
arXiv Detail & Related papers (2025-07-31T14:11:16Z)
Learning Efficient and Generalizable Graph Retriever for Knowledge-Graph Question Answering [75.12322966980003]
Large Language Models (LLMs) have shown strong inductive reasoning ability across various domains. Most existing RAG pipelines rely on unstructured text, limiting interpretability and structured reasoning. Recent studies have explored integrating knowledge graphs with LLMs for knowledge graph question answering. We propose RAPL, a novel framework for efficient and effective graph retrieval in KGQA.
arXiv Detail & Related papers (2025-06-11T12:03:52Z)
Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [75.9865035064794]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information. Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system. We propose Align-GRAG, a novel reasoning-guided dual alignment framework in the post-retrieval phase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z)
RAKG: Document-level Retrieval Augmented Knowledge Graph Construction [10.013667560362565]
This paper focuses on the task of automatic document-level knowledge graph construction. It proposes the Document-level Retrieval Augmented Knowledge Graph Construction (RAKG) framework.
arXiv Detail & Related papers (2025-04-14T02:47:23Z)
RGL: A Graph-Centric, Modular Framework for Efficient Retrieval-Augmented Generation on Graphs [58.10503898336799]
We introduce the RAG-on-Graphs Library (RGL), a modular framework that seamlessly integrates the complete RAG pipeline. RGL addresses key challenges by supporting a variety of graph formats and integrating optimized implementations for essential components. Our evaluations demonstrate that RGL not only accelerates the prototyping process but also enhances the performance and applicability of graph-based RAG systems.
arXiv Detail & Related papers (2025-03-25T03:21:48Z)
GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation [84.41557981816077]
We introduce GFM-RAG, a novel graph foundation model (GFM) for retrieval augmented generation. GFM-RAG is powered by an innovative graph neural network that reasons over graph structure to capture complex query-knowledge relationships. It achieves state-of-the-art performance while maintaining efficiency and alignment with neural scaling laws.
arXiv Detail & Related papers (2025-02-03T07:04:29Z)
Graph Structure Refinement with Energy-based Contrastive Learning [56.957793274727514]
We introduce an unsupervised method that combines generative and discriminative training to learn graph structure and representation. We propose an Energy-based Contrastive Learning (ECL) guided Graph Structure Refinement (GSR) framework, denoted as ECL-GSR. ECL-GSR achieves faster training with fewer samples and less memory than the leading baseline, highlighting its simplicity and efficiency in downstream tasks.
arXiv Detail & Related papers (2024-12-20T04:05:09Z)
TOBUGraph: Knowledge Graph-Based Retrieval for Enhanced LLM Performance Beyond RAG [3.8704987495086542]
TOBUGraph is a graph-based retrieval framework that first constructs the knowledge graph from unstructured data. It extracts structured knowledge and diverse relationships among data, going beyond RAG's text-to-text similarity. We demonstrate TOBUGraph's effectiveness in TOBU, a real-world application in production for personal memory organization and retrieval.
arXiv Detail & Related papers (2024-12-06T22:05:39Z)
Simple Is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation [9.844598565914055]
Large Language Models (LLMs) demonstrate strong reasoning abilities but face limitations such as hallucinations and outdated knowledge. We introduce SubgraphRAG, extending the Knowledge Graph (KG)-based Retrieval-Augmented Generation (RAG) framework that retrieves subgraphs. Our approach innovatively integrates a lightweight multilayer perceptron with a parallel triple-scoring mechanism for efficient and flexible subgraph retrieval.
arXiv Detail & Related papers (2024-10-28T04:39:32Z)
All Against Some: Efficient Integration of Large Language Models for Message Passing in Graph Neural Networks [51.19110891434727]
Large Language Models (LLMs) with pretrained knowledge and powerful semantic comprehension abilities have recently shown a remarkable ability to benefit applications using vision and text data. E-LLaGNN is a framework with an on-demand LLM service that enriches the message passing procedure of graph learning by enhancing a limited fraction of nodes from the graph.
arXiv Detail & Related papers (2024-07-20T22:09:42Z)

This list is automatically generated from the titles and abstracts of the papers on this site.