Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective
- URL: http://arxiv.org/abs/2410.17600v1
- Date: Wed, 23 Oct 2024 06:54:03 GMT
- Title: Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective
- Authors: Rui Yang, Boming Yang, Aosong Feng, Sixun Ouyang, Moritz Blum, Tianwei She, Yuang Jiang, Freddy Lecue, Jinghui Lu, Irene Li
- Abstract summary: This work introduces Graphusion, a zero-shot Knowledge Graph Construction (KGC) framework that builds a KG from free text.
It contains three steps: in Step 1, we extract a list of seed entities using topic modeling so that the final KG includes the most relevant entities.
In Step 2, we conduct candidate triplet extraction using LLMs; in Step 3, we design a novel fusion module that provides a global view of the extracted knowledge.
- Score: 13.905336639352404
- Abstract: Knowledge Graphs (KGs) are crucial in the field of artificial intelligence and are widely used in downstream tasks, such as question-answering (QA). The construction of KGs typically requires significant effort from domain experts. Large Language Models (LLMs) have recently been used for Knowledge Graph Construction (KGC). However, most existing approaches focus on a local perspective, extracting knowledge triplets from individual sentences or documents, missing a fusion process to combine the knowledge in a global KG. This work introduces Graphusion, a zero-shot KGC framework that constructs a KG from free text. It contains three steps: in Step 1, we extract a list of seed entities using topic modeling so that the final KG includes the most relevant entities; in Step 2, we conduct candidate triplet extraction using LLMs; in Step 3, we design a novel fusion module that provides a global view of the extracted knowledge, incorporating entity merging, conflict resolution, and novel triplet discovery. Results show that Graphusion achieves scores of 2.92 and 2.37 out of 3 for entity extraction and relation recognition, respectively. Moreover, we showcase how Graphusion could be applied to the Natural Language Processing (NLP) domain and validate it in an educational scenario. Specifically, we introduce TutorQA, a new expert-verified benchmark for QA, comprising six tasks and a total of 1,200 QA pairs. Using the Graphusion-constructed KG, we achieve a significant improvement on the benchmark, for example, a 9.2% accuracy improvement on sub-graph completion.
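Read as a pipeline, the three steps map naturally onto code. The sketch below is a minimal, hypothetical Python outline of that flow, not the authors' implementation: `llm` is any text-in/text-out callable, plain term frequency stands in for the paper's topic modeling, and the prompts and parsing format are illustrative assumptions.

```python
from collections import Counter
from typing import Callable

Triplet = tuple[str, str, str]  # (head, relation, tail)

def extract_seed_entities(docs: list[str], top_k: int = 50) -> list[str]:
    # Step 1: choose seed entities to anchor the KG. The paper uses topic
    # modeling; simple term frequency is only a stand-in here.
    counts = Counter(w.strip(".,").lower() for d in docs for w in d.split() if len(w) > 3)
    return [w for w, _ in counts.most_common(top_k)]

def extract_candidate_triplets(docs: list[str], seeds: list[str],
                               llm: Callable[[str], str]) -> list[Triplet]:
    # Step 2: ask the LLM for candidate (head, relation, tail) triplets,
    # one document at a time (the "local" extraction view).
    triplets: list[Triplet] = []
    for doc in docs:
        prompt = (f"Seed entities: {', '.join(seeds)}\n"
                  f"Extract triplets as 'head | relation | tail', one per line:\n{doc}")
        for line in llm(prompt).splitlines():
            parts = [p.strip() for p in line.split("|")]
            if len(parts) == 3:
                triplets.append((parts[0], parts[1], parts[2]))
    return triplets

def fuse(triplets: list[Triplet], llm: Callable[[str], str]) -> list[Triplet]:
    # Step 3: global fusion. Entity merging is approximated by lower-casing;
    # conflicting relations for the same entity pair are resolved by the LLM.
    by_pair: dict[tuple[str, str], set[str]] = {}
    for h, r, t in triplets:
        by_pair.setdefault((h.lower(), t.lower()), set()).add(r)
    fused: list[Triplet] = []
    for (h, t), relations in by_pair.items():
        if len(relations) == 1:
            fused.append((h, relations.pop(), t))
        else:
            choice = llm(f"Entities: {h}, {t}. Choose the single best relation "
                         f"from {sorted(relations)}.")
            fused.append((h, choice.strip(), t))
    # Novel triplet discovery (asking the LLM for missing links between known
    # entities) would be a further pass over `fused`; omitted for brevity.
    return fused

def graphusion(docs: list[str], llm: Callable[[str], str]) -> list[Triplet]:
    seeds = extract_seed_entities(docs)
    return fuse(extract_candidate_triplets(docs, seeds, llm), llm)
```

The point of the fusion step is that it operates over the union of all candidate triplets, so a decision about a relation can take into account every document that mentions the entity pair, rather than a single local context.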
Related papers
- iText2KG: Incremental Knowledge Graphs Construction Using Large Language Models [0.7165255458140439]
iText2KG is a method for incremental, topic-independent Knowledge Graph construction without post-processing.
Our method demonstrates superior performance compared to baseline methods across three scenarios.
arXiv Detail & Related papers (2024-09-05T06:49:14Z)
- Exploiting Large Language Models Capabilities for Question Answer-Driven Knowledge Graph Completion Across Static and Temporal Domains [8.472388165833292]
This paper introduces a new generative completion framework called Generative Subgraph-based KGC (GS-KGC).
GS-KGC employs a question-answering format to directly generate target entities, addressing the challenge of questions having multiple possible answers.
Our method generates negative samples using known facts to facilitate the discovery of new information.
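To make the two ideas above concrete, here is a small, hypothetical Python illustration of (a) casting a completion query as a question and (b) mining negatives from known facts. The function names, question template, and toy facts are assumptions for illustration, not the GS-KGC implementation.

```python
def triple_to_question(head: str, relation: str) -> str:
    # Turn an incomplete triple (head, relation, ?) into a natural question.
    return f"Which entities are related to '{head}' via the relation '{relation}'?"

def negative_samples(known: set[tuple[str, str, str]],
                     head: str, relation: str, entities: list[str]) -> list[str]:
    # Entities never observed as (head, relation, e) in the known facts
    # serve as negative answers for this query.
    positives = {t for h, r, t in known if h == head and r == relation}
    return [e for e in entities if e not in positives]

known_facts = {("Paris", "capital_of", "France"), ("Berlin", "capital_of", "Germany")}
print(triple_to_question("Paris", "capital_of"))
print(negative_samples(known_facts, "Paris", "capital_of", ["France", "Germany", "Spain"]))
```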
arXiv Detail & Related papers (2024-08-20T13:13:41Z)
- Graphusion: Leveraging Large Language Models for Scientific Knowledge Graph Fusion and Construction in NLP Education [14.368011453534596]
We introduce Graphusion, a zero-shot knowledge graph construction framework that builds a KG from free text.
The core fusion module provides a global view of triplets, incorporating entity merging, conflict resolution, and novel triplet discovery.
Our evaluation demonstrates that Graphusion surpasses supervised baselines by up to 10% in accuracy on link prediction.
arXiv Detail & Related papers (2024-07-15T15:13:49Z)
- Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering [87.67177556994525]
We propose a training-free method called Generate-on-Graph (GoG) to generate new factual triples while exploring Knowledge Graphs (KGs).
GoG performs reasoning through a Thinking-Searching-Generating framework, which treats the LLM as both an agent and a KG in incomplete KGQA (IKGQA).
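The Thinking-Searching-Generating loop can be pictured as a simple control loop. The sketch below is an illustrative approximation, not GoG itself: `llm` and `kg_lookup` are assumed callables, and the SEARCH/GENERATE/ANSWER action format is invented for the example.

```python
from typing import Callable

def generate_on_graph(question: str,
                      kg_lookup: Callable[[str], list[str]],
                      llm: Callable[[str], str],
                      max_steps: int = 5) -> str:
    context: list[str] = []
    for _ in range(max_steps):
        # Thinking: the LLM decides the next action given the evidence so far.
        action = llm(f"Question: {question}\nEvidence: {context}\n"
                     "Reply with SEARCH:<entity>, GENERATE, or ANSWER:<text>.")
        if action.startswith("SEARCH:"):
            # Searching: pull triples about the requested entity from the KG.
            context.extend(kg_lookup(action.removeprefix("SEARCH:").strip()))
        elif action.startswith("ANSWER:"):
            return action.removeprefix("ANSWER:").strip()
        else:
            # Generating: the LLM stands in for the missing part of the KG and
            # proposes a plausible new fact as extra evidence.
            context.append(llm(f"State one plausible fact that helps answer: {question}"))
    return llm(f"Question: {question}\nEvidence: {context}\nGive the final answer:")
```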
arXiv Detail & Related papers (2024-04-23T04:47:22Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
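As a rough illustration of the first step described above (the distillation and downstream training are omitted), the following sketch prompts a generic `llm` callable to expand a compact triplet into a context-rich passage. The prompt wording is an assumption, not the paper's template.

```python
from typing import Callable

def contextualize(triplet: tuple[str, str, str], llm: Callable[[str], str]) -> str:
    # Expand a structural triplet into a short descriptive passage that a
    # downstream KGC model can learn from.
    head, relation, tail = triplet
    prompt = (f"Write a short, factual paragraph explaining the relationship "
              f"'{relation}' between '{head}' and '{tail}', with useful context.")
    return llm(prompt)
```

The resulting passages are paired with the original triplets and used as auxiliary supervision, which is what makes the strategy applicable to either discriminative or generative KGC models.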
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models [53.09723678623779]
We propose TAGREAL to automatically generate quality query prompts and retrieve support information from large text corpora.
The results show that TAGREAL achieves state-of-the-art performance on two benchmark datasets.
We find that TAGREAL has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods.
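The two components named above, prompt generation for a (head, relation, ?) query and retrieval of support text from a corpus, can be sketched as follows. The hand-written templates and naive substring retrieval are placeholder assumptions standing in for TAGREAL's automatic prompt mining and corpus retrieval; nothing here is the paper's code.

```python
def build_prompt(head: str, relation: str) -> str:
    # Hypothetical cloze-style templates for a PLM-based completion query.
    templates = {"capital_of": f"{head} is the capital of [MASK].",
                 "born_in": f"{head} was born in [MASK]."}
    return templates.get(relation, f"{head} {relation.replace('_', ' ')} [MASK].")

def retrieve_support(head: str, corpus: list[str], k: int = 3) -> list[str]:
    # Naive retrieval: keep the first k sentences mentioning the head entity.
    return [s for s in corpus if head.lower() in s.lower()][:k]

corpus = ["Paris is the largest city and capital of France.",
          "The Eiffel Tower is located in Paris."]
print(build_prompt("Paris", "capital_of"))
print(retrieve_support("Paris", corpus))
```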
arXiv Detail & Related papers (2023-05-24T22:09:35Z)
- VEM$^2$L: A Plug-and-play Framework for Fusing Text and Structure Knowledge on Sparse Knowledge Graph Completion [14.537509860565706]
We propose VEM2L, a plug-and-play framework for sparse Knowledge Graphs that fuses knowledge extracted from text and from graph structure into a unified model.
Specifically, we partition the knowledge acquired by the models into two non-overlapping parts.
We also propose a new fusion strategy, justified by the Variational EM algorithm, to fuse the generalization ability of the models.
arXiv Detail & Related papers (2022-07-04T15:50:21Z)
- Collaborative Knowledge Graph Fusion by Exploiting the Open Corpus [59.20235923987045]
It is challenging to enrich a Knowledge Graph with newly harvested triples while maintaining the quality of the knowledge representation.
This paper proposes a system to refine a KG using information harvested from an additional corpus.
arXiv Detail & Related papers (2022-06-15T12:16:10Z)
- KACC: A Multi-task Benchmark for Knowledge Abstraction, Concretization and Completion [99.47414073164656]
A comprehensive knowledge graph (KG) contains an instance-level entity graph and an ontology-level concept graph.
The two-view KG provides a testbed for models to "simulate" human abilities in knowledge abstraction, concretization, and completion.
We propose a unified KG benchmark by improving existing benchmarks in terms of dataset scale, task coverage, and difficulty.
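To picture the data structure described above, here is a small sketch of a two-view KG container: an instance-level entity graph, an ontology-level concept graph, and the cross-view links that tie instances to concepts. The field names and example facts are illustrative assumptions, not the benchmark's schema.

```python
from dataclasses import dataclass, field

@dataclass
class TwoViewKG:
    entity_triples: set[tuple[str, str, str]] = field(default_factory=set)   # instance view
    concept_triples: set[tuple[str, str, str]] = field(default_factory=set)  # ontology view
    instance_of: set[tuple[str, str]] = field(default_factory=set)           # cross-view links

kg = TwoViewKG()
kg.entity_triples.add(("Albert Einstein", "born_in", "Ulm"))
kg.concept_triples.add(("physicist", "subclass_of", "scientist"))
kg.instance_of.add(("Albert Einstein", "physicist"))  # abstraction: entity -> concept
```

Abstraction tasks move from `entity_triples` up through `instance_of` into the concept view, while concretization and completion move in the opposite direction or fill in missing triples within a view.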
arXiv Detail & Related papers (2020-04-28T16:21:57Z)
- Toward Subgraph-Guided Knowledge Graph Question Generation with Graph Neural Networks [53.58077686470096]
Knowledge graph (KG) question generation (QG) aims to generate natural language questions from KGs and target answers.
In this work, we focus on a more realistic setting where we aim to generate questions from a KG subgraph and target answers.
arXiv Detail & Related papers (2020-04-13T15:43:22Z)
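To make the task setting concrete: the input is a KG subgraph plus a target answer, and the output is a natural-language question. The sketch below only illustrates that input/output contract with a generic text generator stand-in (`generate`); the paper itself encodes the subgraph with a graph neural network rather than serializing it into a flat prompt.

```python
from typing import Callable

def subgraph_to_question(subgraph: list[tuple[str, str, str]], answer: str,
                         generate: Callable[[str], str]) -> str:
    # Serialize the subgraph and the target answer, then ask the generator
    # for a question whose answer is the given entity.
    facts = "; ".join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in subgraph)
    return generate(f"Facts: {facts}. Answer: {answer}. Question:")

# Example input: given these facts and the answer "Marie Curie", a good model
# should produce something like "Who discovered radium and won a Nobel Prize?"
example = [("Marie Curie", "discovered", "radium"),
           ("Marie Curie", "won", "Nobel Prize")]
```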
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.