GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding
- URL: http://arxiv.org/abs/2402.06764v3
- Date: Wed, 17 Apr 2024 19:55:37 GMT
- Title: GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding
- Authors: Stefan Dernbach, Khushbu Agarwal, Alejandro Zuniga, Michael Henry, Sutanay Choudhury,
- Abstract summary: We introduce a framework for developing Graph-aligned LAnguage Models (GLaM)
We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning.
- Score: 39.67113788660731
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Integrating large language models (LLMs) with knowledge graphs derived from domain-specific data represents an important advancement towards more powerful and factual reasoning. As these models grow more capable, it is crucial to enable them to perform multi-step inferences over real-world knowledge graphs while minimizing hallucination. While large language models excel at conversation and text generation, their ability to reason over domain-specialized graphs of interconnected entities remains limited. For example, can we query a LLM to identify the optimal contact in a professional network for a specific goal, based on relationships and attributes in a private database? The answer is no--such capabilities lie beyond current methods. However, this question underscores a critical technical gap that must be addressed. Many high-value applications in areas such as science, security, and e-commerce rely on proprietary knowledge graphs encoding unique structures, relationships, and logical constraints. We introduce a fine-tuning framework for developing Graph-aligned LAnguage Models (GLaM) that transforms a knowledge graph into an alternate text representation with labeled question-answer pairs. We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning. Our methodology leverages the large-language model's generative capabilities to create the dataset and proposes an efficient alternate to retrieval-augmented generation styled methods.
Related papers
- Verbalized Graph Representation Learning: A Fully Interpretable Graph Model Based on Large Language Models Throughout the Entire Process [8.820909397907274]
We propose a verbalized graph representation learning (VGRL) method which is fully interpretable.
In contrast to traditional graph machine learning models, VGRL constrains this parameter space to be text description.
We conduct several studies to empirically evaluate the effectiveness of VGRL.
arXiv Detail & Related papers (2024-10-02T12:07:47Z) - UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs [30.635472655668078]
Text-Attributed Graphs (TAGs) can generalize to unseen graphs and tasks across diverse domains.
We propose a novel cascaded architecture of Language Models (LMs) and Graph Neural Networks (GNNs) as backbone networks.
We demonstrate the model's effectiveness in self-supervised representation learning on unseen graphs, few-shot in-context transfer, and zero-shot transfer.
arXiv Detail & Related papers (2024-02-21T09:06:31Z) - Large Language Models on Graphs: A Comprehensive Survey [77.16803297418201]
We provide a systematic review of scenarios and techniques related to large language models on graphs.
We first summarize potential scenarios of adopting LLMs on graphs into three categories, namely pure graphs, text-attributed graphs, and text-paired graphs.
We discuss the real-world applications of such methods and summarize open-source codes and benchmark datasets.
arXiv Detail & Related papers (2023-12-05T14:14:27Z) - GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language
Models [33.56759621666477]
We present a benchmark dataset for evaluating the integration of graph knowledge into language models.
The proposed dataset is designed to evaluate graph-language models' ability to understand graphs and make use of it for answer generation.
We perform experiments with language-only models and the proposed graph-language model to validate the usefulness of the paired graphs and to demonstrate the difficulty of the task.
arXiv Detail & Related papers (2023-10-12T16:46:58Z) - Exploring Large Language Model for Graph Data Understanding in Online
Job Recommendations [63.19448893196642]
We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs.
By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users.
arXiv Detail & Related papers (2023-07-10T11:29:41Z) - Iterative Zero-Shot LLM Prompting for Knowledge Graph Construction [104.29108668347727]
This paper proposes an innovative knowledge graph generation approach that leverages the potential of the latest generative large language models.
The approach is conveyed in a pipeline that comprises novel iterative zero-shot and external knowledge-agnostic strategies.
We claim that our proposal is a suitable solution for scalable and versatile knowledge graph construction and may be applied to different and novel contexts.
arXiv Detail & Related papers (2023-07-03T16:01:45Z) - GPT4Graph: Can Large Language Models Understand Graph Structured Data ?
An Empirical Evaluation and Benchmarking [17.7473474499538]
Large language models like ChatGPT have become indispensable to artificial general intelligence.
In this study, we conduct an investigation to assess the proficiency of LLMs in comprehending graph data.
Our findings contribute valuable insights towards bridging the gap between language models and graph understanding.
arXiv Detail & Related papers (2023-05-24T11:53:19Z) - KGLM: Integrating Knowledge Graph Structure in Language Models for Link
Prediction [0.0]
We introduce a new entity/relation embedding layer that learns to differentiate distinctive entity and relation types.
We show that further pre-training the language models with this additional embedding layer using the triples extracted from the knowledge graph, followed by the standard fine-tuning phase sets a new state-of-the-art performance for the link prediction task on the benchmark datasets.
arXiv Detail & Related papers (2022-11-04T20:38:12Z) - ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge.
We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text.
We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z) - Exploiting Structured Knowledge in Text via Graph-Guided Representation
Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs.
Building upon entity-level masked language models, our first contribution is an entity masking scheme.
In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.