Related papers: GRAG: Graph Retrieval-Augmented Generation

GRAG: Graph Retrieval-Augmented Generation

URL: http://arxiv.org/abs/2405.16506v1
Date: Sun, 26 May 2024 10:11:40 GMT
Title: GRAG: Graph Retrieval-Augmented Generation
Authors: Yuntong Hu, Zhihan Lei, Zheng Zhang, Bo Pan, Chen Ling, Liang Zhao,
Abstract summary: We introduce $textbfGraph Retrieval-Augmented Generation (GRAG)$, which significantly enhances both the retrieval and generation processes. Unlike RAG approaches that focus solely on text-based entity retrieval, GRAG maintains an acute awareness of graph topology. Our GRAG approach consists of four main stages: indexing of $k$-hop ego-graphs, graph retrieval, soft pruning to mitigate the impact of irrelevant entities, and generation with pruned textual subgraphs.
Score: 14.98084919101233
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While Retrieval-Augmented Generation (RAG) enhances the accuracy and relevance of responses by generative language models, it falls short in graph-based contexts where both textual and topological information are important. Naive RAG approaches inherently neglect the structural intricacies of textual graphs, resulting in a critical gap in the generation process. To address this challenge, we introduce $\textbf{Graph Retrieval-Augmented Generation (GRAG)}$, which significantly enhances both the retrieval and generation processes by emphasizing the importance of subgraph structures. Unlike RAG approaches that focus solely on text-based entity retrieval, GRAG maintains an acute awareness of graph topology, which is crucial for generating contextually and factually coherent responses. Our GRAG approach consists of four main stages: indexing of $k$-hop ego-graphs, graph retrieval, soft pruning to mitigate the impact of irrelevant entities, and generation with pruned textual subgraphs. GRAG's core workflow-retrieving textual subgraphs followed by soft pruning-efficiently identifies relevant subgraph structures while avoiding the computational infeasibility typical of exhaustive subgraph searches, which are NP-hard. Moreover, we propose a novel prompting strategy that achieves lossless conversion from textual subgraphs to hierarchical text descriptions. Extensive experiments on graph multi-hop reasoning benchmarks demonstrate that in scenarios requiring multi-hop reasoning on textual graphs, our GRAG approach significantly outperforms current state-of-the-art RAG methods while effectively mitigating hallucinations.

Related papers

Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [75.9865035064794]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information.<n>Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system.<n>We propose Align-GRAG, a novel reasoning-guided dual alignment framework in post-retrieval phrase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z)
Query-Aware Learnable Graph Pooling Tokens as Prompt for Large Language Models [3.9489815622117566]
Learnable Graph Pooling Token (LGPT) enables flexible and efficient graph representation.<n>Our method achieves a 4.13% performance improvement on the GraphQA benchmark without training the large language model.
arXiv Detail & Related papers (2025-01-29T10:35:41Z)
CG-RAG: Research Question Answering by Citation Graph Retrieval-Augmented LLMs [9.718354494802002]
Contextualized Graph Retrieval-Augmented Generation (CG-RAG) is a novel framework that integrates sparse and dense retrieval signals within graph structures. First, we propose a contextual graph representation for citation graphs, effectively capturing both explicit and implicit connections within and across documents. Second, we introduce Lexical-Semantic Graph Retrieval (LeSeGR), which seamlessly integrates sparse and dense retrieval signals with graph encoding. Third, we present a context-aware generation strategy that utilizes the retrieved graph-structured information to generate precise and contextually enriched responses.
arXiv Detail & Related papers (2025-01-25T04:18:08Z)
Evaluating and Improving Graph to Text Generation with Large Language Models [46.529034150391595]
Large language models (LLMs) have demonstrated immense potential across various tasks. We conduct a comprehensive evaluation of prompting current open-source LLMs on graph-to-text generation tasks. We introduce a new graph-to-text dataset, PlanGTG, annotated with two sub-tasks: reordering and attribution.
arXiv Detail & Related papers (2025-01-24T13:53:54Z)
Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies [7.067145619709089]
This study investigates the impact of graph visualisations on Large Language Models (LLMs) performance. Our experiments compare the effectiveness of multimodal approaches against purely textual graph representations.
arXiv Detail & Related papers (2024-09-13T14:26:58Z)
Hierarchical Compression of Text-Rich Graphs via Large Language Models [63.75293588479027]
Text-rich graphs are prevalent in data mining contexts like e-commerce and academic graphs. This paper introduces Hierarchical Compression'' (HiCom), a novel method to align the capabilities of LLMs with the structure of text-rich graphs. HiCom can outperform both GNNs and LLM backbones for node classification on e-commerce and citation graphs.
arXiv Detail & Related papers (2024-06-13T07:24:46Z)
Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware. Efficient Fine-Tuning - GPEFT, a novel approach for efficient graph representation learning. We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt. We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations.
arXiv Detail & Related papers (2024-04-28T18:36:59Z)
Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [60.71360240206726]
Large language models (LLMs) suffer from hallucinations, especially on knowledge-intensive tasks. Existing works propose to augment LLMs with individual text units retrieved from external knowledge corpora. We propose a framework called Graph Chain-of-thought (Graph-CoT) to augment LLMs with graphs by encouraging LLMs to reason on the graph iteratively.
arXiv Detail & Related papers (2024-04-10T15:41:53Z)
G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [61.93058781222079]
We develop a flexible question-answering framework targeting real-world textual graphs. We introduce the first retrieval-augmented generation (RAG) approach for general textual graphs. G-Retriever performs RAG over a graph by formulating this task as a Prize-Collecting Steiner Tree optimization problem.
arXiv Detail & Related papers (2024-02-12T13:13:04Z)
When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning [54.84870836443311]
The paper presents a new paradigm for understanding and reasoning about graph data by integrating image encoding and multimodal technologies. This approach enables the comprehension of graph data through an instruction-response format, utilizing GPT-4V's advanced capabilities. The study evaluates this paradigm on various graph types, highlighting the model's strengths and weaknesses, particularly in Chinese OCR performance and complex reasoning tasks.
arXiv Detail & Related papers (2023-12-16T08:14:11Z)
Which Modality should I use -- Text, Motif, or Image? : Understanding Graphs with Large Language Models [14.251972223585765]
This paper introduces a new approach to encoding a graph with diverse modalities, such as text, image, and motif, and prompts to approximate a graph's global connectivity. The study also presents GraphTMI, a novel benchmark for evaluating Large Language Models (LLMs) in graph structure analysis.
arXiv Detail & Related papers (2023-11-16T12:45:41Z)
GRENADE: Graph-Centric Language Model for Self-Supervised Representation Learning on Text-Attributed Graphs [22.282756544376493]
We develop a novel Graph-Centric Language model, GRENADE, to solve the problem of self-supervised representation learning on text-attributed graphs. GRENADE exploits the synergistic effect of both pre-trained language model and graph neural network by optimizing with two specialized self-supervised learning algorithms. The proposed graph-centric self-supervised learning algorithms effectively help GRENADE to capture informative textual semantics as well as structural context information on text-attributed graphs.
arXiv Detail & Related papers (2023-10-23T17:18:35Z)
GraphGPT: Graph Instruction Tuning for Large Language Models [27.036935149004726]
Graph Neural Networks (GNNs) have evolved to understand graph structures. To enhance robustness, self-supervised learning (SSL) has become a vital tool for data augmentation. Our research tackles this by advancing graph model generalization in zero-shot learning environments.
arXiv Detail & Related papers (2023-10-19T06:17:46Z)
Improving Graph-Based Text Representations with Character and Word Level N-grams [30.699644290131044]
We propose a new word-character text graph that combines word and character n-gram nodes together with document nodes. We also propose two new graph-based neural models, WCTextGCN and WCTextGAT, for modeling our proposed text graph.
arXiv Detail & Related papers (2022-10-12T08:07:54Z)
Graph Pooling for Graph Neural Networks: Progress, Challenges, and Opportunities [128.55790219377315]
Graph neural networks have emerged as a leading architecture for many graph-level tasks. graph pooling is indispensable for obtaining a holistic graph-level representation of the whole graph.
arXiv Detail & Related papers (2022-04-15T04:02:06Z)
Promoting Graph Awareness in Linearized Graph-to-Text Generation [72.83863719868364]
We study the ability of linearized models to encode local graph structures. Our findings motivate solutions to enrich the quality of models' implicit graph encodings. We find that these denoising scaffolds lead to substantial improvements in downstream generation in low-resource settings.
arXiv Detail & Related papers (2020-12-31T18:17:57Z)

This list is automatically generated from the titles and abstracts of the papers in this site.