Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models
- URL: http://arxiv.org/abs/2501.18119v1
- Date: Thu, 30 Jan 2025 03:40:20 GMT
- Title: Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models
- Authors: Qika Lin, Tianzhe Zhao, Kai He, Zhen Peng, Fangzhi Xu, Ling Huang, Jingying Ma, Mengling Feng
- Abstract summary: We propose a framework to learn and apply quantized codes for each entity, aiming for the seamless integration of Knowledge Graphs with Large Language Models.
Experimental results demonstrate that SSQR outperforms existing unsupervised quantized methods, producing more distinguishable codes.
The fine-tuned LLaMA2 and LLaMA3.1 also achieve superior performance on KG link prediction and triple classification tasks.
- Score: 17.88134311726175
- Abstract: Due to the natural gap between Knowledge Graph (KG) structures and natural language, the effective integration of holistic structural information of KGs with Large Language Models (LLMs) has emerged as a significant question. To this end, we propose a two-stage framework to learn and apply quantized codes for each entity, aiming for the seamless integration of KGs with LLMs. Firstly, a self-supervised quantized representation (SSQR) method is proposed to compress both KG structural and semantic knowledge into discrete codes (i.e., tokens) that align with the format of language sentences. We further design KG instruction-following data by viewing these learned codes as features to input directly to LLMs, thereby achieving seamless integration. The experimental results demonstrate that SSQR outperforms existing unsupervised quantized methods, producing more distinguishable codes. Further, the fine-tuned LLaMA2 and LLaMA3.1 also achieve superior performance on KG link prediction and triple classification tasks, utilizing only 16 tokens per entity instead of the thousands required by conventional prompting methods.
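As a rough illustration of the quantization idea, the sketch below assigns each entity a short sequence of discrete codes via nearest-neighbor lookup in a learned codebook (VQ-style). This is a minimal sketch under assumed hyperparameters, not the paper's exact SSQR design; only the 16-codes-per-entity figure comes from the abstract.
```python
# A minimal sketch, assuming a VQ-style codebook; not the paper's exact SSQR design.
import torch

num_entities, dim = 1_000, 256
codebook_size, codes_per_entity = 512, 16          # 16 tokens per entity, per the abstract

# Stand-in for learned structural/semantic entity representations
entity_emb = torch.randn(num_entities, codes_per_entity, dim)
codebook = torch.nn.Parameter(torch.randn(codebook_size, dim))

def quantize(x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
    """Map each sub-vector to its nearest codebook entry and return discrete codes."""
    dists = torch.cdist(x.reshape(-1, dim), codebook)   # (N * 16, codebook_size)
    codes = dists.argmin(dim=-1)                        # discrete token ids
    quantized = codebook[codes].reshape_as(x)
    quantized = x + (quantized - x).detach()            # straight-through estimator
    return quantized, codes.reshape(x.shape[:-1])

quantized, codes = quantize(entity_emb)
print(codes.shape)  # torch.Size([1000, 16]) -- 16 discrete tokens per entity
```
Such code sequences could then be added to the LLM's vocabulary and spliced into KG instruction-following data, which is the integration route the abstract describes.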
Related papers
- GLTW: Joint Improved Graph Transformer and LLM via Three-Word Language for Knowledge Graph Completion [52.026016846945424]
We propose a new method called GLTW, which encodes the structural information of KGs and integrates it into Large Language Models.
Specifically, we introduce an improved Graph Transformer (iGT) that effectively encodes subgraphs with both local and global structural information.
Also, we develop a subgraph-based multi-classification training objective, using all entities within KG as classification objects, to boost learning efficiency.
arXiv Detail & Related papers (2025-02-17T06:02:59Z)
- Boosting Knowledge Graph-based Recommendations through Confidence-Aware Augmentation with Large Language Models [19.28217321004791]
Large Language Models (LLMs) offer a promising way to improve the quality and relevance of Knowledge Graphs for recommendation tasks.
We propose the Confidence-aware KG-based Recommendation Framework with LLM Augmentation (CKG-LLMA), a novel framework that combines KGs and LLMs for recommendation tasks.
The framework includes: (1) an LLM-based subgraph augmenter for enriching KGs with high-quality information, (2) a confidence-aware message propagation mechanism to filter noisy triplets, and (3) a dual-view contrastive learning method to integrate user-item interactions and KG data.
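A minimal sketch of what component (2) could look like, assuming each triplet carries an LLM-assigned confidence score that weights messages and filters noisy edges; the threshold and update rule below are illustrative assumptions, not CKG-LLMA's actual design.
```python
# A minimal sketch, assuming triplets carry LLM-assigned confidence scores;
# the threshold and update rule are illustrative, not CKG-LLMA's actual design.
import torch

num_entities, dim = 1_000, 64
emb = torch.randn(num_entities, dim)

# (head, tail, confidence) edges; confidences would come from an LLM judge
edges = [(0, 1, 0.9), (0, 2, 0.2), (1, 3, 0.7)]
threshold = 0.5

agg = torch.zeros_like(emb)
total = torch.zeros(num_entities, 1)
for h, t, conf in edges:
    if conf < threshold:                 # filter noisy triplets
        continue
    agg[h] += conf * emb[t]              # confidence-weighted message
    total[h] += conf
updated = torch.where(total > 0, agg / total.clamp(min=1e-9), emb)
```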
arXiv Detail & Related papers (2025-02-06T02:06:48Z)
- Bridge: A Unified Framework to Knowledge Graph Completion via Language Models and Knowledge Representation [14.801411392475439]
We propose a novel framework called Bridge, which jointly encodes the structural and semantic information of Knowledge Graphs (KGs).
Specifically, we strategically encode entities and relations separately with pre-trained language models (PLMs) to better utilize their semantic knowledge.
To bridge the gap between KGs and PLMs, we employ a self-supervised representation learning method called BYOL to fine-tune PLMs with two different views of a triple.
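For intuition, here is a minimal BYOL-style sketch, assuming the two views are alternative encodings of the same triple; the encoder shapes and view construction are placeholders, not Bridge's actual architecture.
```python
# A minimal BYOL-style sketch; shapes and view construction are placeholders.
import copy
import torch
import torch.nn.functional as F

encoder = torch.nn.Linear(768, 256)      # stand-in for a PLM encoder
predictor = torch.nn.Linear(256, 256)
target_encoder = copy.deepcopy(encoder)  # updated by EMA after each step (omitted)
for p in target_encoder.parameters():
    p.requires_grad_(False)

def byol_loss(view1: torch.Tensor, view2: torch.Tensor) -> torch.Tensor:
    online = F.normalize(predictor(encoder(view1)), dim=-1)
    with torch.no_grad():
        target = F.normalize(target_encoder(view2), dim=-1)
    return (2 - 2 * (online * target).sum(dim=-1)).mean()

# e.g., view1/view2 encode "(head, relation, ?)" vs. "(?, relation, tail)"
v1, v2 = torch.randn(8, 768), torch.randn(8, 768)
loss = byol_loss(v1, v2) + byol_loss(v2, v1)   # symmetrized, as in BYOL
```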
arXiv Detail & Related papers (2024-11-11T01:59:04Z)
- Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains [66.55612528039894]
Knowledge Graphs (KGs) can serve as reliable knowledge sources for question answering (QA).
We present DoG (Decoding on Graphs), a novel framework that facilitates a deep synergy between LLMs and KGs.
Experiments across various KGQA tasks with different background KGs demonstrate that DoG achieves superior and robust performance.
arXiv Detail & Related papers (2024-10-24T04:01:40Z)
- Knowledge Graph-Enhanced Large Language Models via Path Selection [58.228392005755026]
Large Language Models (LLMs) have shown unprecedented performance in various real-world applications.
However, LLMs are known to generate factually inaccurate outputs, i.e., the hallucination problem.
We propose a principled framework KELP with three stages to handle the above problems.
arXiv Detail & Related papers (2024-06-19T21:45:20Z)
- Generate-on-Graph: Treat LLM as both Agent and KG in Incomplete Knowledge Graph Question Answering [87.67177556994525]
We propose a training-free method called Generate-on-Graph (GoG) to generate new factual triples while exploring Knowledge Graphs (KGs).
GoG performs reasoning through a Thinking-Searching-Generating framework, which treats the LLM as both an agent and a KG in Incomplete KG Question Answering (IKGQA).
arXiv Detail & Related papers (2024-04-23T04:47:22Z)
- Knowledge Graph Large Language Model (KG-LLM) for Link Prediction [43.55117421485917]
We introduce the Knowledge Graph Large Language Model (KG-LLM), a novel framework that leverages large language models (LLMs) for knowledge graph tasks.
We first convert structured knowledge graph data into natural language and then use these natural language prompts to fine-tune LLMs.
To show the efficacy of the KG-LLM framework, we fine-tune three leading LLMs within it: Flan-T5, LLaMA2, and Gemma.
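The conversion step could look like the following sketch, where a structured triple becomes an instruction-tuning record; the template wording is an assumption, not the paper's exact prompt format.
```python
# A minimal sketch of triple verbalization; the template wording is assumed.
def triple_to_record(head: str, relation: str, tail: str) -> dict:
    """Turn a structured triple into an instruction-tuning record."""
    return {
        "instruction": "Determine whether the stated relation holds between the entities.",
        "input": f"Entity 1: {head}\nEntity 2: {tail}\nRelation: {relation.replace('_', ' ')}",
        "output": "Yes, the relation holds.",
    }

record = triple_to_record("Paris", "capital_of", "France")
# -> a record usable for fine-tuning Flan-T5, LLaMA2, or Gemma
```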
arXiv Detail & Related papers (2024-03-12T04:47:29Z)
- Contextualization Distillation from Large Language Model for Knowledge Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z)
- FedMKGC: Privacy-Preserving Federated Multilingual Knowledge Graph Completion [21.4302940596294]
Knowledge graph completion (KGC) aims to predict missing facts in knowledge graphs (KGs).
Previous methods that rely on transferring raw data among KGs raise privacy concerns.
We propose a new federated learning framework that implicitly aggregates knowledge from multiple KGs without demanding raw data exchange and entity alignment.
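As a sketch of the general idea, each client trains locally on its private KG and only model weights are shared; plain FedAvg is assumed here, and FedMKGC's actual aggregation scheme may differ.
```python
# A minimal sketch assuming plain FedAvg over KG embedding weights; raw triples
# never leave a client. FedMKGC's actual aggregation scheme may differ.
import torch

def federated_average(client_states: list[dict]) -> dict:
    """Average parameter tensors across clients (weights only, no raw data)."""
    return {
        name: torch.stack([s[name] for s in client_states]).mean(dim=0)
        for name in client_states[0]
    }

# Each client trains an embedding model on its private KG, then shares weights
clients = [torch.nn.Embedding(100, 32).state_dict() for _ in range(3)]
global_state = federated_average(clients)
```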
arXiv Detail & Related papers (2023-12-17T08:09:27Z)
- Unifying Large Language Models and Knowledge Graphs: A Roadmap [61.824618473293725]
Large language models (LLMs) are making new waves in the field of natural language processing and artificial intelligence.
Knowledge Graphs (KGs), such as Wikipedia and Huapu, are structured knowledge models that explicitly store rich factual knowledge.
arXiv Detail & Related papers (2023-06-14T07:15:26Z)
- Few-shot Knowledge Graph-to-Text Generation with Pretrained Language Models [42.38563175680914]
This paper studies how to automatically generate a natural language text that describes the facts in a knowledge graph (KG).
Considering the few-shot setting, we leverage the excellent capacities of pretrained language models (PLMs) in language understanding and generation.
arXiv Detail & Related papers (2021-06-03T06:48:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.