Contextualization Distillation from Large Language Model for Knowledge
Graph Completion
- URL: http://arxiv.org/abs/2402.01729v3
- Date: Sat, 24 Feb 2024 07:01:22 GMT
- Title: Contextualization Distillation from Large Language Model for Knowledge
Graph Completion
- Authors: Dawei Li, Zhen Tan, Tianlong Chen, Huan Liu
- Abstract summary: We introduce the Contextualization Distillation strategy, a plug-in-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
- Score: 51.126166442122546
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While textual information significantly enhances the performance of
pre-trained language models (PLMs) in knowledge graph completion (KGC), the
static and noisy nature of existing corpora collected from Wikipedia articles
or synsets definitions often limits the potential of PLM-based KGC models. To
surmount these challenges, we introduce the Contextualization Distillation
strategy, a versatile plug-in-and-play approach compatible with both
discriminative and generative KGC frameworks. Our method begins by instructing
large language models (LLMs) to transform compact, structural triplets into
context-rich segments. Subsequently, we introduce two tailored auxiliary tasks,
reconstruction and contextualization, allowing smaller KGC models to assimilate
insights from these enriched triplets. Comprehensive evaluations across diverse
datasets and KGC techniques highlight the efficacy and adaptability of our
approach, revealing consistent performance enhancements irrespective of
underlying pipelines or architectures. Moreover, our analysis makes our method
more explainable and provides insight into the selection of generating paths, as well
as the choice of suitable distillation tasks. All the code and data in this
work will be released at
https://github.com/David-Li0406/Contextulization-Distillation
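The two steps the abstract describes (prompting an LLM to expand a triplet into a passage, then training a smaller model on reconstruction and contextualization tasks) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the prompt wording, the `head | relation | tail` input format, the word-level masking scheme, and all function names are assumptions made for the example, and the LLM call itself is left out.

```python
import random


def build_prompt(head, relation, tail):
    """Hypothetical prompt asking an LLM to expand a triplet into a short passage."""
    return (
        f"Given the triplet ({head}, {relation}, {tail}), "
        "write a short descriptive paragraph that mentions both entities "
        "and expresses their relation."
    )


def contextualization_example(head, relation, tail, context):
    """Assumed format: triplet as input, LLM-written context as the generation target."""
    return {"input": f"{head} | {relation} | {tail}", "target": context}


def reconstruction_example(context, mask_ratio=0.15, mask_token="[MASK]", seed=0):
    """Assumed format: mask random words; the model must restore the original text."""
    rng = random.Random(seed)
    words = context.split()
    # Mask at least one word, but never more words than the context contains.
    n_mask = min(len(words), max(1, round(len(words) * mask_ratio)))
    masked_idx = set(rng.sample(range(len(words)), n_mask))
    corrupted = [mask_token if i in masked_idx else w for i, w in enumerate(words)]
    return {"input": " ".join(corrupted), "target": context}
```

In this sketch, the text returned by the LLM for `build_prompt(...)` would be passed as `context` to the other two helpers, and the resulting pairs would be mixed into the KGC model's training data as auxiliary examples.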
Related papers
- Deep Sparse Latent Feature Models for Knowledge Graph Completion [24.342670268545085]
In this paper, we introduce a novel framework of sparse latent feature models for knowledge graphs.
Our approach not only effectively completes missing triples but also provides clear interpretability of the latent structures.
Our method significantly improves performance by revealing latent communities and producing interpretable representations.
arXiv Detail & Related papers (2024-11-24T03:17:37Z)
- Language Models are Graph Learners [70.14063765424012]
Language Models (LMs) are challenging the dominance of domain-specific models, including Graph Neural Networks (GNNs) and Graph Transformers (GTs).
We propose a novel approach that empowers off-the-shelf LMs to achieve performance comparable to state-of-the-art GNNs on node classification tasks.
arXiv Detail & Related papers (2024-10-03T08:27:54Z)
- Multi-perspective Improvement of Knowledge Graph Completion with Large Language Models [95.31941227776711]
We propose MPIKGC to compensate for the deficiency of contextualized knowledge and improve KGC by querying large language models (LLMs).
We conducted extensive evaluation of our framework based on four description-based KGC models and four datasets, for both link prediction and triplet classification tasks.
arXiv Detail & Related papers (2024-03-04T12:16:15Z)
- Learning to Extract Structured Entities Using Language Models [52.281701191329]
Recent advances in machine learning have significantly impacted the field of information extraction.
We reformulate the task to be entity-centric, enabling the use of diverse metrics.
We contribute to the field by introducing Structured Entity Extraction and proposing the Approximate Entity Set OverlaP metric.
arXiv Detail & Related papers (2024-02-06T22:15:09Z)
- KICGPT: Large Language Model with Knowledge in Context for Knowledge Graph Completion [27.405080941584533]
We propose KICGPT, a framework that integrates a large language model and a triple-based KGC retriever.
It alleviates the long-tail problem without incurring additional training overhead.
Empirical results on benchmark datasets demonstrate the effectiveness of KICGPT with smaller training overhead and no finetuning.
arXiv Detail & Related papers (2024-02-04T08:01:07Z)
- Bidirectional Trained Tree-Structured Decoder for Handwritten Mathematical Expression Recognition [51.66383337087724]
The Handwritten Mathematical Expression Recognition (HMER) task is a critical branch in the field of OCR.
Recent studies have demonstrated that incorporating bidirectional context information significantly improves the performance of HMER models.
We propose the Mirror-Flipped Symbol Layout Tree (MF-SLT) and Bidirectional Asynchronous Training (BAT) structure.
arXiv Detail & Related papers (2023-12-31T09:24:21Z)
- Unifying Structure and Language Semantic for Efficient Contrastive Knowledge Graph Completion with Structured Entity Anchors [0.3913403111891026]
The goal of knowledge graph completion (KGC) is to predict missing links in a KG using the already known facts seen during training.
We propose a novel method to effectively unify structure information and language semantics without losing the power of inductive reasoning.
arXiv Detail & Related papers (2023-11-07T11:17:55Z)
- Enhancing Text-based Knowledge Graph Completion with Zero-Shot Large Language Models: A Focus on Semantic Enhancement [8.472388165833292]
We introduce a framework termed constrained prompts for KGC (CP-KGC).
This framework designs prompts that adapt to different datasets to enhance semantic richness.
This study extends the performance limits of existing models and promotes further integration of KGC with large language models.
arXiv Detail & Related papers (2023-10-12T12:31:23Z)
- VEM$^2$L: A Plug-and-play Framework for Fusing Text and Structure Knowledge on Sparse Knowledge Graph Completion [14.537509860565706]
We propose VEM$^2$L, a plug-and-play framework over sparse Knowledge Graphs that fuses knowledge extracted from text and structure messages into a unified whole.
Specifically, we partition knowledge acquired by models into two nonoverlapping parts.
We also propose a new fusion strategy, justified by the Variational EM algorithm, to fuse the generalization ability of models.
arXiv Detail & Related papers (2022-07-04T15:50:21Z)
- Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks that learn over raw text with guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.