GraphRAG-Bench: Challenging Domain-Specific Reasoning for Evaluating Graph Retrieval-Augmented Generation
- URL: http://arxiv.org/abs/2506.02404v1
- Date: Tue, 03 Jun 2025 03:44:26 GMT
- Title: GraphRAG-Bench: Challenging Domain-Specific Reasoning for Evaluating Graph Retrieval-Augmented Generation
- Authors: Yilin Xiao, Junnan Dong, Chuang Zhou, Su Dong, Qianwen Zhang, Di Yin, Xing Sun, Xiao Huang,
- Abstract summary: Graph Retrieval Augmented Generation (GraphRAG) has garnered increasing recognition for its potential to enhance large language models (LLMs)<n>Current evaluations of GraphRAG models predominantly rely on traditional question-answering datasets.<n>We introduce GraphRAG-Bench, a large-scale, domain-specific benchmark designed to rigorously evaluate GraphRAG models.
- Score: 26.654064783342545
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph Retrieval Augmented Generation (GraphRAG) has garnered increasing recognition for its potential to enhance large language models (LLMs) by structurally organizing domain-specific corpora and facilitating complex reasoning. However, current evaluations of GraphRAG models predominantly rely on traditional question-answering datasets. Their limited scope in questions and evaluation metrics fails to comprehensively assess the reasoning capacity improvements enabled by GraphRAG models. To address this gap, we introduce GraphRAG-Bench, a large-scale, domain-specific benchmark designed to rigorously evaluate GraphRAG models. Our benchmark offers three key superiorities: \((i)\) Challenging question design. Featuring college-level, domain-specific questions that demand multi-hop reasoning, the benchmark ensures that simple content retrieval is insufficient for problem-solving. For example, some questions require mathematical reasoning or programming. \((ii)\) Diverse task coverage. The dataset includes a broad spectrum of reasoning tasks, multiple-choice, true/false, multi-select, open-ended, and fill-in-the-blank. It spans 16 disciplines in twenty core textbooks. \((iii)\) Holistic evaluation framework. GraphRAG-Bench provides comprehensive assessment across the entire GraphRAG pipeline, including graph construction, knowledge retrieval, and answer generation. Beyond final-answer correctness, it evaluates the logical coherence of the reasoning process. By applying nine contemporary GraphRAG methods to GraphRAG-Bench, we demonstrate its utility in quantifying how graph-based structuring improves model reasoning capabilities. Our analysis reveals critical insights about graph architectures, retrieval efficacy, and reasoning capabilities, offering actionable guidance for the research community.
Related papers
- XGraphRAG: Interactive Visual Analysis for Graph-based Retrieval-Augmented Generation [16.068460356582648]
This research proposes a visual analysis framework that helps RAG developers identify critical recalls of GraphRAG.<n>We develop XGraphRAG, a prototype system incorporating a set of interactive visualizations to facilitate users' analysis process.
arXiv Detail & Related papers (2025-06-10T09:14:30Z) - When to use Graphs in RAG: A Comprehensive Analysis for Graph Retrieval-Augmented Generation [25.508719115522645]
Graph retrieval-augmented generation (GraphRAG) has emerged as a powerful paradigm for enhancing large language models (LLMs) with external knowledge.<n>Recent studies report that GraphRAG frequently underperforms vanilla RAG on many real-world tasks.<n>This raises a critical question: Is GraphRAG really effective, and in which scenarios do graph structures provide measurable benefits for RAG systems?
arXiv Detail & Related papers (2025-06-06T02:37:47Z) - Align-GRAG: Reasoning-Guided Dual Alignment for Graph Retrieval-Augmented Generation [75.9865035064794]
Large language models (LLMs) have demonstrated remarkable capabilities, but still struggle with issues like hallucinations and outdated information.<n>Retrieval-augmented generation (RAG) addresses these issues by grounding LLM outputs in external knowledge with an Information Retrieval (IR) system.<n>We propose Align-GRAG, a novel reasoning-guided dual alignment framework in post-retrieval phrase.
arXiv Detail & Related papers (2025-05-22T05:15:27Z) - NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes [25.173078967881803]
Retrieval-augmented generation (RAG) empowers large language models to access external and private corpus.<n>Current graph-based RAG approaches seldom prioritize the design of graph structures.<n>Inadequately designed graph not only impede the seamless integration of diverse graph algorithms but also result in workflow inconsistencies.<n>We propose NodeRAG, a graph-centric framework introducing heterogeneous graph structures.
arXiv Detail & Related papers (2025-04-15T18:24:00Z) - Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs)<n>This framework provides a standardized setting to evaluate GNNs across diverse datasets.<n>We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z) - Retrieval-Augmented Generation with Graphs (GraphRAG) [84.29507404866257]
Retrieval-augmented generation (RAG) is a powerful technique that enhances downstream task execution by retrieving additional information.<n>Graph, by its intrinsic "nodes connected by edges" nature, encodes massive heterogeneous and relational information.<n>Unlike conventional RAG, the uniqueness of graph-structured data, such as diverse-formatted and domain-specific relational knowledge, poses unique and significant challenges when designing GraphRAG for different domains.
arXiv Detail & Related papers (2024-12-31T06:59:35Z) - LEGO-GraphRAG: Modularizing Graph-based Retrieval-Augmented Generation for Design Space Exploration [17.514586423233872]
We propose LEGO-GraphRAG, a modular framework that enables fine-grained decomposition of the GraphRAG workflow.<n>Our framework facilitates comprehensive empirical studies of GraphRAG on large-scale real-world graphs and diverse query sets.
arXiv Detail & Related papers (2024-11-06T15:32:28Z) - Graph Retrieval-Augmented Generation: A Survey [28.979898837538958]
Retrieval-Augmented Generation (RAG) has achieved remarkable success in addressing the challenges of Large Language Models (LLMs) without necessitating retraining.
This paper provides the first comprehensive overview of GraphRAG methodologies.
We formalize the GraphRAG workflow, encompassing Graph-Based Indexing, Graph-Guided Retrieval, and Graph-Enhanced Generation.
arXiv Detail & Related papers (2024-08-15T12:20:24Z) - G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [61.93058781222079]
We develop a flexible question-answering framework targeting real-world textual graphs.
We introduce the first retrieval-augmented generation (RAG) approach for general textual graphs.
G-Retriever performs RAG over a graph by formulating this task as a Prize-Collecting Steiner Tree optimization problem.
arXiv Detail & Related papers (2024-02-12T13:13:04Z) - Neural Graph Reasoning: Complex Logical Query Answering Meets Graph
Databases [63.96793270418793]
Complex logical query answering (CLQA) is a recently emerged task of graph machine learning.
We introduce the concept of Neural Graph Database (NGDBs)
NGDB consists of a Neural Graph Storage and a Neural Graph Engine.
arXiv Detail & Related papers (2023-03-26T04:03:37Z) - Graph Pooling for Graph Neural Networks: Progress, Challenges, and
Opportunities [128.55790219377315]
Graph neural networks have emerged as a leading architecture for many graph-level tasks.
graph pooling is indispensable for obtaining a holistic graph-level representation of the whole graph.
arXiv Detail & Related papers (2022-04-15T04:02:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.