SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2412.15272v1
- Date: Tue, 17 Dec 2024 15:40:08 GMT
- Title: SimGRAG: Leveraging Similar Subgraphs for Knowledge Graphs Driven Retrieval-Augmented Generation
- Authors: Yuzheng Cai, Zhenyue Guo, Yiwen Pei, Wanrui Bian, Weiguo Zheng
- Abstract summary: We propose a novel Similar Graph Enhanced Retrieval-Augmented Generation (SimGRAG) method.
It effectively addresses the challenge of aligning query texts and knowledge graphs.
SimGRAG outperforms state-of-the-art KG-driven RAG methods in question answering and fact verification.
- Score: 6.568733377722896
- Abstract: Recent advancements in large language models (LLMs) have shown impressive versatility across various tasks. To eliminate their hallucinations, retrieval-augmented generation (RAG) has emerged as a powerful approach, leveraging external knowledge sources like knowledge graphs (KGs). In this paper, we study the task of KG-driven RAG and propose a novel Similar Graph Enhanced Retrieval-Augmented Generation (SimGRAG) method. It effectively addresses the challenge of aligning query texts and KG structures through a two-stage process: (1) query-to-pattern, which uses an LLM to transform queries into a desired graph pattern, and (2) pattern-to-subgraph, which quantifies the alignment between the pattern and candidate subgraphs using a graph semantic distance (GSD) metric. We also develop an optimized retrieval algorithm that efficiently identifies the top-$k$ subgraphs within 1-second latency on a 10-million-scale KG. Extensive experiments show that SimGRAG outperforms state-of-the-art KG-driven RAG methods in both question answering and fact verification, offering superior plug-and-play usability and scalability.
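The pattern-to-subgraph stage described in the abstract can be illustrated with a minimal sketch. The abstract does not define GSD or the optimized retrieval algorithm, so everything here is an assumption for illustration only: the pattern and each candidate subgraph are position-aligned lists of (head, relation, tail) triples, GSD is taken to be the sum of embedding distances between aligned elements, `embed` is a toy letter-count stand-in for a real text embedding model, and the top-k search is a brute-force scan rather than the paper's algorithm. All function names are hypothetical.

```python
import heapq
import math
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def embed(text: str) -> List[float]:
    """Toy stand-in for an embedding model: L2-normalized letter counts."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def distance(a: str, b: str) -> float:
    """Euclidean distance between the toy embeddings of two strings."""
    return math.dist(embed(a), embed(b))


def graph_semantic_distance(pattern: List[Triple], subgraph: List[Triple]) -> float:
    """Assumed GSD: sum of distances over position-aligned nodes and relations."""
    if len(pattern) != len(subgraph):
        raise ValueError("pattern and subgraph must have the same number of triples")
    return sum(
        distance(p, s)
        for p_triple, s_triple in zip(pattern, subgraph)
        for p, s in zip(p_triple, s_triple)
    )


def top_k_subgraphs(
    pattern: List[Triple], candidates: List[List[Triple]], k: int
) -> List[List[Triple]]:
    """Brute-force top-k: the k candidates with the smallest assumed GSD."""
    return heapq.nsmallest(
        k, candidates, key=lambda sg: graph_semantic_distance(pattern, sg)
    )


# Hypothetical pattern an LLM might produce for "Who directed Inception?"
pattern = [("Inception", "directed by", "unknown director")]
candidates = [
    [("Inception", "directed by", "Christopher Nolan")],
    [("Titanic", "starring", "Leonardo DiCaprio")],
]
for subgraph in top_k_subgraphs(pattern, candidates, k=1):
    print(subgraph, round(graph_semantic_distance(pattern, subgraph), 3))
```

A smaller distance means a closer match between the pattern and a candidate. A scan like this touches every candidate, so it would not reach the latency the abstract reports on a 10-million-scale KG; it only shows what the ranking criterion might look like.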
Related papers
- Knowledge Graph-Guided Retrieval Augmented Generation [34.83235788116369]
We propose a Knowledge Graph-Guided Retrieval Augmented Generation framework.
KG$2$RAG provides fact-level relationships between chunks, improving the diversity and coherence of the retrieved results.
arXiv Detail & Related papers (2025-02-08T02:14:31Z)
- GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation [84.41557981816077]
We introduce GFM-RAG, a novel graph foundation model (GFM) for retrieval augmented generation.
GFM-RAG is powered by an innovative graph neural network that reasons over graph structure to capture complex query-knowledge relationships.
It achieves state-of-the-art performance while maintaining efficiency and alignment with neural scaling laws.
arXiv Detail & Related papers (2025-02-03T07:04:29Z)
- CG-RAG: Research Question Answering by Citation Graph Retrieval-Augmented LLMs [9.718354494802002]
Contextualized Graph Retrieval-Augmented Generation (CG-RAG) is a novel framework that integrates sparse and dense retrieval signals within graph structures.
First, we propose a contextual graph representation for citation graphs, effectively capturing both explicit and implicit connections within and across documents.
Second, we introduce Lexical-Semantic Graph Retrieval (LeSeGR), which seamlessly integrates sparse and dense retrieval signals with graph encoding.
Third, we present a context-aware generation strategy that utilizes the retrieved graph-structured information to generate precise and contextually enriched responses.
arXiv Detail & Related papers (2025-01-25T04:18:08Z)
- DynaGRAG | Exploring the Topology of Information for Advancing Language Understanding and Generation in Graph Retrieval-Augmented Generation [0.0]
A novel GRAG framework, Dynamic Graph Retrieval-Augmented Generation (DynaGRAG), is proposed to focus on enhancing subgraph representation and diversity within the knowledge graph.
Experimental results demonstrate the effectiveness of DynaGRAG, showcasing the significance of enhanced subgraph representation and diversity for improved language understanding and generation.
arXiv Detail & Related papers (2024-12-24T16:06:53Z)
- Instance-Aware Graph Prompt Learning [71.26108600288308]
We introduce Instance-Aware Graph Prompt Learning (IA-GPL) in this paper.
The process involves generating intermediate prompts for each instance using a lightweight architecture.
Experiments conducted on multiple datasets and settings showcase the superior performance of IA-GPL compared to state-of-the-art baselines.
arXiv Detail & Related papers (2024-11-26T18:38:38Z)
- Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation [14.448198170932226]
Think-on-Graph 2.0 (ToG-2) is a hybrid RAG framework that iteratively retrieves information from both unstructured and structured knowledge sources.
ToG-2 alternates between graph retrieval and context retrieval to search for in-depth clues relevant to the question.
It achieves overall state-of-the-art (SOTA) performance on 6 out of 7 knowledge-intensive datasets with GPT-3.5.
arXiv Detail & Related papers (2024-07-15T15:20:40Z)
- GSINA: Improving Subgraph Extraction for Graph Invariant Learning via Graph Sinkhorn Attention [52.67633391931959]
Graph invariant learning (GIL) has been an effective approach to discovering the invariant relationships between graph data and its labels.
We propose a novel graph attention mechanism called Graph Sinkhorn Attention (GSINA).
GSINA is able to obtain meaningful, differentiable invariant subgraphs with controllable sparsity and softness.
arXiv Detail & Related papers (2024-02-11T12:57:16Z)
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of the fine-tuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
- Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models [53.09723678623779]
We propose TAGREAL to automatically generate quality query prompts and retrieve support information from large text corpora.
The results show that TAGREAL achieves state-of-the-art performance on two benchmark datasets.
We find that TAGREAL has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods.
arXiv Detail & Related papers (2023-05-24T22:09:35Z)
- Graph-augmented Learning to Rank for Querying Large-scale Knowledge Graph [34.774049199809426]
Knowledge graph question answering (KGQA) based on information retrieval aims to answer a question by retrieving the answer from a large-scale knowledge graph.
We first propose to partition the retrieved KSG into several smaller sub-KSGs via a new subgraph partition algorithm.
We then present a graph-augmented learning to rank model to select the top-ranked sub-KSGs from them.
arXiv Detail & Related papers (2021-11-20T08:27:37Z)
- Toward Subgraph-Guided Knowledge Graph Question Generation with Graph Neural Networks [53.58077686470096]
Knowledge graph (KG) question generation (QG) aims to generate natural language questions from KGs and target answers.
In this work, we focus on a more realistic setting where we aim to generate questions from a KG subgraph and target answers.
arXiv Detail & Related papers (2020-04-13T15:43:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.