Related papers: GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding

GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding

URL: http://arxiv.org/abs/2402.06764v3
Date: Wed, 17 Apr 2024 19:55:37 GMT
Title: GLaM: Fine-Tuning Large Language Models for Domain Knowledge Graph Alignment via Neighborhood Partitioning and Generative Subgraph Encoding
Authors: Stefan Dernbach, Khushbu Agarwal, Alejandro Zuniga, Michael Henry, Sutanay Choudhury,
Abstract summary: We introduce a framework for developing Graph-aligned LAnguage Models (GLaM) We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning.
Score: 39.67113788660731
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Integrating large language models (LLMs) with knowledge graphs derived from domain-specific data represents an important advancement towards more powerful and factual reasoning. As these models grow more capable, it is crucial to enable them to perform multi-step inferences over real-world knowledge graphs while minimizing hallucination. While large language models excel at conversation and text generation, their ability to reason over domain-specialized graphs of interconnected entities remains limited. For example, can we query a LLM to identify the optimal contact in a professional network for a specific goal, based on relationships and attributes in a private database? The answer is no--such capabilities lie beyond current methods. However, this question underscores a critical technical gap that must be addressed. Many high-value applications in areas such as science, security, and e-commerce rely on proprietary knowledge graphs encoding unique structures, relationships, and logical constraints. We introduce a fine-tuning framework for developing Graph-aligned LAnguage Models (GLaM) that transforms a knowledge graph into an alternate text representation with labeled question-answer pairs. We demonstrate that grounding the models in specific graph-based knowledge expands the models' capacity for structure-based reasoning. Our methodology leverages the large-language model's generative capabilities to create the dataset and proposes an efficient alternate to retrieval-augmented generation styled methods.

Related papers

Quantizing Text-attributed Graphs for Semantic-Structural Integration [6.721504414917793]
Text-attributed graphs (TAGs) have emerged as a powerful representation for modeling complex relationships across diverse domains.<n>With the rise of large language models (LLMs), there is growing interest in leveraging their capabilities for graph learning.<n>We propose STAG, a novel self-supervised framework that directly quantizes graph structural information into discrete tokens using a frozen codebook.
arXiv Detail & Related papers (2025-07-20T09:18:02Z)
Democratizing Large Language Model-Based Graph Data Augmentation via Latent Knowledge Graphs [22.218522445858344]
Data augmentation is necessary for graph representation learning due to the scarcity and noise present in graph data. We propose a black-box context-driven graph data augmentation approach, with the guidance of LLMs -- DemoGraph. Our approach excels in scenarios involving electronic health records (EHRs), which validates its maximal utilization of contextual knowledge.
arXiv Detail & Related papers (2025-02-19T09:00:32Z)
Graph Learning in the Era of LLMs: A Survey from the Perspective of Data, Models, and Tasks [25.720233631885726]
integration of Graph Neural Networks (GNNs) and Large Language Models (LLMs) has emerged as a promising technological paradigm. We leverage graph description texts with rich semantic context to fundamentally enhance Data quality. This work serves as a foundational reference for researchers and practitioners looking to advance graph learning methodologies.
arXiv Detail & Related papers (2024-12-17T01:41:17Z)
Verbalized Graph Representation Learning: A Fully Interpretable Graph Model Based on Large Language Models Throughout the Entire Process [8.820909397907274]
We propose a verbalized graph representation learning (VGRL) method which is fully interpretable. In contrast to traditional graph machine learning models, VGRL constrains this parameter space to be text description. We conduct several studies to empirically evaluate the effectiveness of VGRL.
arXiv Detail & Related papers (2024-10-02T12:07:47Z)
UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs [30.635472655668078]
Text-Attributed Graphs (TAGs) can generalize to unseen graphs and tasks across diverse domains. We propose a novel cascaded architecture of Language Models (LMs) and Graph Neural Networks (GNNs) as backbone networks. We demonstrate the model's effectiveness in self-supervised representation learning on unseen graphs, few-shot in-context transfer, and zero-shot transfer.
arXiv Detail & Related papers (2024-02-21T09:06:31Z)
Large Language Models on Graphs: A Comprehensive Survey [77.16803297418201]
We provide a systematic review of scenarios and techniques related to large language models on graphs. We first summarize potential scenarios of adopting LLMs on graphs into three categories, namely pure graphs, text-attributed graphs, and text-paired graphs. We discuss the real-world applications of such methods and summarize open-source codes and benchmark datasets.
arXiv Detail & Related papers (2023-12-05T14:14:27Z)
GraphextQA: A Benchmark for Evaluating Graph-Enhanced Large Language Models [33.56759621666477]
We present a benchmark dataset for evaluating the integration of graph knowledge into language models. The proposed dataset is designed to evaluate graph-language models' ability to understand graphs and make use of it for answer generation. We perform experiments with language-only models and the proposed graph-language model to validate the usefulness of the paired graphs and to demonstrate the difficulty of the task.
arXiv Detail & Related papers (2023-10-12T16:46:58Z)
Exploring Large Language Model for Graph Data Understanding in Online Job Recommendations [63.19448893196642]
We present a novel framework that harnesses the rich contextual information and semantic representations provided by large language models to analyze behavior graphs. By leveraging this capability, our framework enables personalized and accurate job recommendations for individual users.
arXiv Detail & Related papers (2023-07-10T11:29:41Z)
Iterative Zero-Shot LLM Prompting for Knowledge Graph Construction [104.29108668347727]
This paper proposes an innovative knowledge graph generation approach that leverages the potential of the latest generative large language models. The approach is conveyed in a pipeline that comprises novel iterative zero-shot and external knowledge-agnostic strategies. We claim that our proposal is a suitable solution for scalable and versatile knowledge graph construction and may be applied to different and novel contexts.
arXiv Detail & Related papers (2023-07-03T16:01:45Z)
GPT4Graph: Can Large Language Models Understand Graph Structured Data ? An Empirical Evaluation and Benchmarking [17.7473474499538]
Large language models like ChatGPT have become indispensable to artificial general intelligence. In this study, we conduct an investigation to assess the proficiency of LLMs in comprehending graph data. Our findings contribute valuable insights towards bridging the gap between language models and graph understanding.
arXiv Detail & Related papers (2023-05-24T11:53:19Z)
KGLM: Integrating Knowledge Graph Structure in Language Models for Link Prediction [0.0]
We introduce a new entity/relation embedding layer that learns to differentiate distinctive entity and relation types. We show that further pre-training the language models with this additional embedding layer using the triples extracted from the knowledge graph, followed by the standard fine-tuning phase sets a new state-of-the-art performance for the link prediction task on the benchmark datasets.
arXiv Detail & Related papers (2022-11-04T20:38:12Z)
ENT-DESC: Entity Description Generation by Exploring Knowledge Graph [53.03778194567752]
In practice, the input knowledge could be more than enough, since the output description may only cover the most significant knowledge. We introduce a large-scale and challenging dataset to facilitate the study of such a practical scenario in KG-to-text. We propose a multi-graph structure that is able to represent the original graph information more comprehensively.
arXiv Detail & Related papers (2020-04-30T14:16:19Z)
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning [73.0598186896953]
We present two self-supervised tasks learning over raw text with the guidance from knowledge graphs. Building upon entity-level masked language models, our first contribution is an entity masking scheme. In contrast to existing paradigms, our approach uses knowledge graphs implicitly, only during pre-training.
arXiv Detail & Related papers (2020-04-29T14:22:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.