LOKE: Linked Open Knowledge Extraction for Automated Knowledge Graph
Construction
- URL: http://arxiv.org/abs/2311.09366v1
- Date: Wed, 15 Nov 2023 20:57:44 GMT
- Title: LOKE: Linked Open Knowledge Extraction for Automated Knowledge Graph
Construction
- Authors: Jamie McCusker
- Abstract summary: We investigate the use of GPT models and prompt engineering for knowledge graph construction with the Wikidata knowledge graph.
We show that a well-engineered prompt, paired with a naive entity linking approach (which we call LOKE-GPT), outperforms AllenAI's OpenIE 4 implementation on the OKE task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While the potential of Open Information Extraction (Open IE) for Knowledge
Graph Construction (KGC) may seem promising, we find the alignment of Open
IE extraction results with existing knowledge graphs to be inadequate. The
advent of Large Language Models (LLMs), especially the commercially available
OpenAI models, has reset expectations for what is possible with deep learning
models and has created a new field called prompt engineering. We investigate
the use of GPT models and prompt engineering for knowledge graph construction
with the Wikidata knowledge graph to address a similar problem to Open IE,
which we call Open Knowledge Extraction (OKE), using an approach we call the
Linked Open Knowledge Extractor (LOKE, pronounced like "Loki"). We consider the
entity linking task essential to construction of real world knowledge graphs.
We merge the CaRB benchmark scoring approach with data from the TekGen dataset
for the LOKE task. We then show that a well-engineered prompt, paired with a
naive entity linking approach (which we call LOKE-GPT), outperforms AllenAI's
OpenIE 4 implementation on the OKE task, although it over-generates triples
compared to the reference set due to overall triple scarcity in the TekGen set.
Through an analysis of entity linkability in the CaRB dataset, as well as
outputs from OpenIE 4 and LOKE-GPT, we see that LOKE-GPT and the "silver"
TekGen triples show that the task is significantly different in content from
OIE, if not in structure. Through this analysis and a qualitative analysis of
sentence extractions via all methods, we found that LOKE-GPT extractions are of
high utility for the KGC task and suitable for use in semi-automated extraction
settings.
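The pairing described in the abstract, extraction followed by naive entity linking and then scoring against a reference set, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the in-memory `LABEL_TO_ID` table stands in for a real Wikidata lookup, the function names are hypothetical, and the exact-match scorer is a simplification that is not CaRB's actual matching procedure.

```python
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical toy label index; a real system would query Wikidata instead.
LABEL_TO_ID: Dict[str, str] = {
    "douglas adams": "Q42",  # Wikidata item: Douglas Adams
    "human": "Q5",           # Wikidata item: human
    "instance of": "P31",    # Wikidata property: instance of
}


def link(label: str) -> Optional[str]:
    """Naive linking: exact match on the lowercased, whitespace-normalized label."""
    return LABEL_TO_ID.get(" ".join(label.lower().split()))


def link_triple(triple: Triple) -> Optional[Triple]:
    """Map a (subject, relation, object) triple to identifiers, or None if any part fails."""
    subj, rel, obj = (link(part) for part in triple)
    if subj is None or rel is None or obj is None:
        return None
    return (subj, rel, obj)


def score(predicted: Set[Triple], reference: Set[Triple]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over linked triples (a simplification)."""
    hits = len(predicted & reference)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Triples as an extractor might emit them; the second one cannot be linked.
extracted = [
    ("Douglas Adams", "instance of", "human"),
    ("Douglas Adams", "wrote", "a radio comedy"),
]
linked = {t for t in (link_triple(x) for x in extracted) if t is not None}
reference = {("Q42", "P31", "Q5")}
result = score(linked, reference)
```

In this sketch, triples that fail to link are dropped before scoring, so only extractions that can be grounded in the knowledge graph count toward precision and recall. Whether unlinkable triples should instead be penalized is a design choice the sketch does not settle.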
Related papers
- Graphusion: A RAG Framework for Knowledge Graph Construction with a Global Perspective [13.905336639352404]
This work introduces Graphusion, a zero-shot Knowledge Graph framework from free text.
It contains three steps: in Step 1, we extract a list of seed entities using topic modeling to guide the final KG to include the most relevant entities.
In Step 2, we conduct candidate triplet extraction using LLMs; in Step 3, we design the novel fusion module that provides a global view of the extracted knowledge.
arXiv Detail & Related papers (2024-10-23T06:54:03Z)
- Exploiting Large Language Models Capabilities for Question Answer-Driven Knowledge Graph Completion Across Static and Temporal Domains [8.472388165833292]
This paper introduces a new generative completion framework called Generative Subgraph-based KGC (GS-KGC).
GS-KGC employs a question-answering format to directly generate target entities, addressing the challenge of questions having multiple possible answers.
Our method generates negative samples using known facts to facilitate the discovery of new information.
arXiv Detail & Related papers (2024-08-20T13:13:41Z)
- Combining Language and Graph Models for Semi-structured Information Extraction on the Web [7.44454462555094]
We present GraphScholarBERT, an open-domain information extraction method based on a joint graph and language model structure.
Experiments show that GraphScholarBERT can improve extraction F1 scores by as much as 34.8% compared to previous work in a zero-shot domain and zero-shot website setting.
arXiv Detail & Related papers (2024-02-21T20:53:29Z)
- Instruct and Extract: Instruction Tuning for On-Demand Information Extraction [86.29491354355356]
On-Demand Information Extraction aims to fulfill the personalized demands of real-world users.
We present a benchmark named InstructIE, which includes both automatically generated training data and a human-annotated test set.
Building on InstructIE, we further develop an On-Demand Information Extractor, ODIE.
arXiv Detail & Related papers (2023-10-24T17:54:25Z)
- Text-Augmented Open Knowledge Graph Completion via Pre-Trained Language Models [53.09723678623779]
We propose TAGREAL to automatically generate quality query prompts and retrieve support information from large text corpora.
The results show that TAGREAL achieves state-of-the-art performance on two benchmark datasets.
We find that TAGREAL has superb performance even with limited training data, outperforming existing embedding-based, graph-based, and PLM-based methods.
arXiv Detail & Related papers (2023-05-24T22:09:35Z)
- UniEX: An Effective and Efficient Framework for Unified Information Extraction via a Span-extractive Perspective [11.477764739452702]
We propose a new paradigm for universal information extraction (IE) that is compatible with any schema format.
Our approach converts text-based IE tasks into a token-pair problem, which uniformly disassembles all extraction targets.
Experimental results show that UniEX can outperform generative universal IE models in both performance and inference speed.
arXiv Detail & Related papers (2023-05-17T15:44:12Z)
- Learning Intents behind Interactions with Knowledge Graph for Recommendation [93.08709357435991]
Knowledge graph (KG) plays an increasingly important role in recommender systems.
Existing GNN-based models fail to identify user-item relations at a fine-grained level of intents.
We propose a new model, Knowledge Graph-based Intent Network (KGIN).
arXiv Detail & Related papers (2021-02-14T03:21:36Z)
- OpenIE6: Iterative Grid Labeling and Coordination Analysis for Open Information Extraction [36.439047786561396]
We present an iterative labeling-based system that establishes a new state of the art for OpenIE, while extracting 10x faster.
This is achieved through a novel Iterative Grid Labeling (IGL) architecture, which treats OpenIE as a 2-D grid labeling task.
Our OpenIE system, OpenIE6, beats the previous systems by as much as 4 pts in F1, while being much faster.
arXiv Detail & Related papers (2020-10-07T04:05:37Z)
- KILT: a Benchmark for Knowledge Intensive Language Tasks [102.33046195554886]
We present a benchmark for knowledge-intensive language tasks (KILT).
All tasks in KILT are grounded in the same snapshot of Wikipedia.
We find that a shared dense vector index coupled with a seq2seq model is a strong baseline.
arXiv Detail & Related papers (2020-09-04T15:32:19Z)
- ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)
- Mining Implicit Entity Preference from User-Item Interaction Data for Knowledge Graph Completion via Adversarial Learning [82.46332224556257]
We propose a novel adversarial learning approach by leveraging user interaction data for the Knowledge Graph Completion task.
Our generator is isolated from user interaction data, and serves to improve the performance of the discriminator.
To discover the implicit entity preferences of users, we design an elaborate collaborative learning algorithm based on graph neural networks.
arXiv Detail & Related papers (2020-03-28T05:47:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.