Research on Graph-Retrieval Augmented Generation Based on Historical Text Knowledge Graphs
- URL: http://arxiv.org/abs/2506.15241v1
- Date: Wed, 18 Jun 2025 08:20:29 GMT
- Title: Research on Graph-Retrieval Augmented Generation Based on Historical Text Knowledge Graphs
- Authors: Yang Fan, Zhang Qi, Xing Wenqian, Liu Chang, Liu Liu
- Abstract summary: This article addresses domain knowledge gaps in general large language models for historical text analysis. We propose the Graph RAG framework, combining chain-of-thought prompting, self-instruction generation, and process supervision. This dataset supports automated historical knowledge extraction, reducing labor costs.
- Score: 6.350401830141683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This article addresses domain knowledge gaps in general large language models for historical text analysis in the context of computational humanities and AIGC technology. We propose the Graph RAG framework, combining chain-of-thought prompting, self-instruction generation, and process supervision to create a character relationship dataset for The First Four Histories with minimal manual annotation. This dataset supports automated historical knowledge extraction, reducing labor costs. In the graph-augmented generation phase, we introduce a collaborative mechanism between knowledge graphs and retrieval-augmented generation, improving the alignment of general models with historical knowledge. Experiments show that the domain-specific model Xunzi-Qwen1.5-14B, with Simplified Chinese input and chain-of-thought prompting, achieves the best performance in relation extraction (F1 = 0.68). The DeepSeek model integrated with GraphRAG improves F1 by 11 percentage points (from 0.08 to 0.19) on the open-domain C-CLUE relation extraction dataset, surpassing the F1 of Xunzi-Qwen1.5-14B (0.12), effectively alleviating hallucination and improving interpretability. This framework offers a low-resource solution for classical text knowledge extraction, advancing historical knowledge services and humanities research.
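The abstract describes graph-augmented generation and reports triple-level F1 scores without showing either step. The following is a minimal sketch of both ideas, not the authors' implementation: all function names, the relation labels, and the toy triples are hypothetical, and entity linking is reduced to plain substring matching for illustration.

```python
from collections import defaultdict

# A knowledge-graph fact as (head entity, relation, tail entity).
Triple = tuple[str, str, str]


def build_index(triples: list[Triple]) -> dict[str, list[Triple]]:
    """Index every triple under both its head and its tail entity."""
    index: dict[str, list[Triple]] = defaultdict(list)
    for head, relation, tail in triples:
        index[head].append((head, relation, tail))
        index[tail].append((head, relation, tail))
    return index


def retrieve(question: str, index: dict[str, list[Triple]], limit: int = 5) -> list[Triple]:
    """Collect triples whose entity name appears verbatim in the question.

    Assumption: substring matching stands in for real entity linking.
    """
    hits: list[Triple] = []
    for entity in sorted(index):
        if entity in question:
            for triple in index[entity]:
                if triple not in hits:
                    hits.append(triple)
    return hits[:limit]


def build_prompt(question: str, facts: list[Triple]) -> str:
    """Prepend retrieved facts to the question and ask for stepwise reasoning."""
    fact_lines = "\n".join(f"- {h} {r} {t}" for h, r, t in facts)
    return f"Known facts:\n{fact_lines}\nQuestion: {question}\nAnswer step by step."


def triple_f1(predicted: list[Triple], gold: list[Triple]) -> float:
    """F1 over exact-match triples: harmonic mean of precision and recall."""
    predicted_set, gold_set = set(predicted), set(gold)
    true_positives = len(predicted_set & gold_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Toy graph for illustration only; it is not taken from the paper's dataset.
TOY_GRAPH: list[Triple] = [
    ("Liu Bang", "founder_of", "Han dynasty"),
    ("Xiang Yu", "rival_of", "Liu Bang"),
    ("Xiao He", "minister_of", "Liu Bang"),
]

if __name__ == "__main__":
    index = build_index(TOY_GRAPH)
    facts = retrieve("Who opposed Liu Bang?", index)
    print(build_prompt("Who opposed Liu Bang?", facts))
```

In this sketch the retrieved triples are passed to the model as plain text, so the answer can be traced back to explicit facts, which is one way to read the interpretability claim in the abstract.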
Related papers
- Graph Collaborative Attention Network for Link Prediction in Knowledge Graphs [0.0]
We focus on KBGAT, a graph neural network model that leverages multi-head attention to jointly encode both entity and relation features within local neighborhood structures. We introduce GCAT (Graph Collaborative Attention Network), a refined model that enhances context aggregation and interaction between heterogeneous nodes. Our findings highlight the advantages of attention-based architectures in capturing complex relational patterns for knowledge graph completion tasks.
arXiv Detail & Related papers (2025-07-05T08:13:09Z) - ArchRAG: Attributed Community-based Hierarchical Retrieval-Augmented Generation [16.204046295248546]
Retrieval-Augmented Generation (RAG) has proven effective in integrating external knowledge into large language models (LLMs). We introduce a novel graph-based RAG approach, called Attributed Community-based Hierarchical RAG (ArchRAG). We build a novel hierarchical index structure for the attributed communities and develop an effective online retrieval method. ArchRAG has been successfully applied to domain knowledge QA in Huawei Cloud Computing.
arXiv Detail & Related papers (2025-02-14T03:28:36Z) - GFM-RAG: Graph Foundation Model for Retrieval Augmented Generation [84.41557981816077]
We introduce GFM-RAG, a novel graph foundation model (GFM) for retrieval augmented generation. GFM-RAG is powered by an innovative graph neural network that reasons over graph structure to capture complex query-knowledge relationships. It achieves state-of-the-art performance while maintaining efficiency and alignment with neural scaling laws.
arXiv Detail & Related papers (2025-02-03T07:04:29Z) - Graph-Augmented Relation Extraction Model with LLMs-Generated Support Document [7.0421339410165045]
This study introduces a novel approach to sentence-level relation extraction (RE).
It integrates Graph Neural Networks (GNNs) with Large Language Models (LLMs) to generate contextually enriched support documents.
Our experiments, conducted on the CrossRE dataset, demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-10-30T20:48:34Z) - Deep Contrastive Graph Learning with Clustering-Oriented Guidance [61.103996105756394]
Graph Convolutional Network (GCN) has exhibited remarkable potential in improving graph-based clustering.
Models estimate an initial graph beforehand to apply GCN.
A Deep Contrastive Graph Learning (DCGL) model is proposed for general data clustering.
arXiv Detail & Related papers (2024-02-25T07:03:37Z) - CADGE: Context-Aware Dialogue Generation Enhanced with Graph-Structured Knowledge Aggregation [25.56539617837482]
A novel context-aware graph-attention model (Context-aware GAT) is proposed.
It assimilates global features from relevant knowledge graphs through a context-enhanced knowledge aggregation mechanism.
Empirical results demonstrate that our framework outperforms conventional GNN-based language models.
arXiv Detail & Related papers (2023-05-10T16:31:35Z) - LUKE-Graph: A Transformer-based Approach with Gated Relational Graph
Attention for Cloze-style Reading Comprehension [13.173307471333619]
We propose the LUKE-Graph, a model that builds a heterogeneous graph based on the intuitive relationships between entities in a document.
We then use a Relational Graph Attention (RGAT) network to fuse the graph's reasoning information and the contextual representation encoded by the pre-trained LUKE model.
Experimental results demonstrate that the LUKE-Graph achieves state-of-the-art performance with commonsense reasoning.
arXiv Detail & Related papers (2023-03-12T14:31:44Z) - EGRC-Net: Embedding-induced Graph Refinement Clustering Network [66.44293190793294]
We propose a novel graph clustering network called Embedding-Induced Graph Refinement Clustering Network (EGRC-Net).
EGRC-Net effectively utilizes the learned embedding to adaptively refine the initial graph and enhance the clustering performance.
Our proposed methods consistently outperform several state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-19T09:08:43Z) - Heterogeneous Graph Neural Networks using Self-supervised Reciprocally Contrastive Learning [102.9138736545956]
Heterogeneous graph neural network (HGNN) is a very popular technique for the modeling and analysis of heterogeneous graphs.
We develop for the first time a novel and robust heterogeneous graph contrastive learning approach, namely HGCL, which introduces two views on respective guidance of node attributes and graph topologies.
In this new approach, we adopt distinct but most suitable attribute and topology fusion mechanisms in the two views, which are conducive to mining relevant information in attributes and topologies separately.
arXiv Detail & Related papers (2022-04-30T12:57:02Z) - Federated Knowledge Graphs Embedding [50.35484170815679]
We propose a novel decentralized scalable learning framework, Federated Knowledge Graphs Embedding (FKGE).
FKGE exploits adversarial generation between pairs of knowledge graphs to translate identical entities and relations of different domains into near embedding spaces.
In order to protect the privacy of the training data, FKGE further implements a privacy-preserving neural network structure to guarantee no raw data leakage.
arXiv Detail & Related papers (2021-05-17T05:30:41Z) - Reasoning with Latent Structure Refinement for Document-Level Relation Extraction [20.308845516900426]
We propose a novel model that empowers the relational reasoning across sentences by automatically inducing the latent document-level graph.
Specifically, our model achieves an F1 score of 59.05 on a large-scale document-level dataset (DocRED).
arXiv Detail & Related papers (2020-05-13T13:36:09Z) - Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z) - Graph Representation Learning via Graphical Mutual Information Maximization [86.32278001019854]
We propose a novel concept, Graphical Mutual Information (GMI), to measure the correlation between input graphs and high-level hidden representations.
We develop an unsupervised learning model trained by maximizing GMI between the input and output of a graph neural encoder.
arXiv Detail & Related papers (2020-02-04T08:33:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.