Each Graph is a New Language: Graph Learning with LLMs
- URL: http://arxiv.org/abs/2501.11478v2
- Date: Thu, 23 Jan 2025 06:36:18 GMT
- Title: Each Graph is a New Language: Graph Learning with LLMs
- Authors: Huachi Zhou, Jiahe Du, Chuang Zhou, Chang Yang, Yilin Xiao, Yuxuan Xie, Xiao Huang
- Abstract summary: We present Graph-Defined Language for Large Language Model (GDL4LLM) to transfer powerful language understanding capabilities to graph-structured data.
GDL4LLM translates graphs into a graph language corpus instead of graph descriptions and pre-trains LLMs on this corpus to adequately understand graph structures.
- Score: 9.22463167477865
- Abstract: Recent efforts leverage Large Language Models (LLMs) for modeling text-attributed graph structures in node classification tasks. These approaches describe graph structures for LLMs to understand or aggregate LLM-generated textual attribute embeddings through graph structure. However, these approaches face two main limitations in modeling graph structures with LLMs. (i) Graph descriptions become verbose in describing high-order graph structure. (ii) Textual attributes alone do not contain adequate graph structure information. It is challenging to model graph structure concisely and adequately with LLMs. LLMs lack built-in mechanisms to model graph structures directly. They also struggle with complex long-range dependencies between high-order nodes and target nodes. Inspired by the observation that LLMs pre-trained on one language can achieve exceptional performance on another with minimal additional training, we propose Graph-Defined Language for Large Language Model (GDL4LLM). This novel framework enables LLMs to transfer their powerful language understanding capabilities to graph-structured data. GDL4LLM translates graphs into a graph language corpus instead of graph descriptions and pre-trains LLMs on this corpus to adequately understand graph structures. During fine-tuning, this corpus describes the structural information of target nodes concisely with only a few tokens. By treating graphs as a new language, GDL4LLM enables LLMs to model graph structures adequately and concisely for node classification tasks. Extensive experiments on three real-world datasets demonstrate that GDL4LLM outperforms description-based and textual attribute embeddings-based baselines by efficiently modeling different orders of graph structure with LLMs.
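The abstract does not spell out how a graph is "translated into a graph language corpus". One plausible reading, sketched minimally below, is to let uniform random walks serve as the sentences of the graph-defined language, with node IDs added to the tokenizer as dedicated tokens; the function name graph_to_corpus and the <node_i> token format are illustrative assumptions, not details from the paper.

```python
import random
import networkx as nx

def graph_to_corpus(graph: nx.Graph, walks_per_node: int = 10,
                    walk_length: int = 8) -> list[str]:
    """Serialize a graph into 'sentences' of node tokens via random walks.

    Each walk becomes one line of the pre-training corpus; node IDs act
    as the vocabulary of the graph-defined language. This is a sketch of
    one possible scheme, not the paper's exact sampling procedure.
    """
    corpus = []
    for start in graph.nodes:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = list(graph.neighbors(walk[-1]))
                if not neighbors:  # dead end: stop this walk early
                    break
                walk.append(random.choice(neighbors))
            # One "sentence": node IDs rendered as dedicated tokens.
            corpus.append(" ".join(f"<node_{n}>" for n in walk))
    return corpus

# Toy example: walks over Zachary's karate club graph.
g = nx.karate_club_graph()
print(graph_to_corpus(g, walks_per_node=1, walk_length=5)[:3])
```

Under this reading, pre-training on the corpus teaches the LLM the "grammar" of the graph (which nodes co-occur along walks), and at fine-tuning time the structural context of a target node can be expressed with a handful of walk tokens rather than a verbose textual description.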
Related papers
- NT-LLM: A Novel Node Tokenizer for Integrating Graph Structure into Large Language Models [26.739650151993928]
Graphs are a fundamental data structure for representing relationships in real-world scenarios.
Applying Large Language Models (LLMs) to graph-related tasks poses significant challenges.
We introduce Node Tokenizer for Large Language Models (NT-LLM), a novel framework that efficiently encodes graph structures.
arXiv Detail & Related papers (2024-10-14T17:21:57Z)
- GUNDAM: Aligning Large Language Models with Graph Understanding [10.080136100700692]
We introduce the Graph Understanding for Natural Language Driven Analytical Model (GUNDAM).
This model adapts LLMs to better understand and engage with the structure of graph data, enabling them to perform complex reasoning tasks by leveraging the graph's structure itself.
arXiv Detail & Related papers (2024-09-30T07:59:10Z)
- LangTopo: Aligning Language Descriptions of Graphs with Tokenized Topological Modeling [10.907949155931474]
We introduce LangTopo, which aligns graph structure modeling with natural language understanding at the token level.
We demonstrate the effectiveness of our proposed method on multiple datasets.
arXiv Detail & Related papers (2024-06-19T06:20:22Z)
- Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [60.71360240206726]
Large language models (LLMs) suffer from hallucinations, especially on knowledge-intensive tasks.
Existing works propose to augment LLMs with individual text units retrieved from external knowledge corpora.
We propose a framework called Graph Chain-of-Thought (Graph-CoT) to augment LLMs with graphs by encouraging LLMs to reason on the graph iteratively (a toy version of this loop is sketched after this list).
arXiv Detail & Related papers (2024-04-10T15:41:53Z)
- GraphEdit: Large Language Models for Graph Structure Learning [62.618818029177355]
Graph Structure Learning (GSL) focuses on capturing intrinsic dependencies and interactions among nodes in graph-structured data.
Existing GSL methods heavily depend on explicit graph structural information as supervision signals.
We propose GraphEdit, an approach that leverages large language models (LLMs) to learn complex node relationships in graph-structured data.
arXiv Detail & Related papers (2024-02-23T08:29:42Z)
- LLaGA: Large Language and Graph Assistant [73.71990472543027]
Large Language and Graph Assistant (LLaGA) is an innovative model to handle the complexities of graph-structured data.
LLaGA excels in versatility, generalizability and interpretability, allowing it to perform consistently well across different datasets and tasks.
Our experiments show that LLaGA delivers outstanding performance across four datasets and three tasks using one single model.
arXiv Detail & Related papers (2024-02-13T02:03:26Z)
- Large Language Models on Graphs: A Comprehensive Survey [77.16803297418201]
We provide a systematic review of scenarios and techniques related to large language models on graphs.
We first summarize potential scenarios of adopting LLMs on graphs into three categories, namely pure graphs, text-attributed graphs, and text-paired graphs.
We discuss the real-world applications of such methods and summarize open-source codes and benchmark datasets.
arXiv Detail & Related papers (2023-12-05T14:14:27Z)
- Disentangled Representation Learning with Large Language Models for Text-Attributed Graphs [57.052160123387104]
We present the Disentangled Graph-Text Learner (DGTL) model, which is able to enhance the reasoning and predicting capabilities of LLMs for TAGs.
Our proposed DGTL model incorporates graph structure information through tailored disentangled graph neural network (GNN) layers.
Experimental evaluations demonstrate that the proposed DGTL model achieves performance superior or comparable to state-of-the-art baselines.
arXiv Detail & Related papers (2023-10-27T14:00:04Z)
- GraphGPT: Graph Instruction Tuning for Large Language Models [27.036935149004726]
Graph Neural Networks (GNNs) have evolved to understand graph structures.
To enhance robustness, self-supervised learning (SSL) has become a vital tool for data augmentation.
Our research advances graph model generalization in zero-shot learning environments.
arXiv Detail & Related papers (2023-10-19T06:17:46Z)
- Integrating Graphs with Large Language Models: Methods and Prospects [68.37584693537555]
Large language models (LLMs) have emerged as frontrunners, showcasing unparalleled prowess in diverse applications.
Merging the capabilities of LLMs with graph-structured data has been a topic of keen interest.
This paper bifurcates such integrations into two predominant categories.
arXiv Detail & Related papers (2023-10-09T07:59:34Z)
- Can LLMs Effectively Leverage Graph Structural Information through Prompts, and Why? [18.328637750057037]
Large language models (LLMs) are gaining increasing attention for their capability to process graphs with rich text attributes.
We aim to understand why the incorporation of structural information inherent in graph data can improve the prediction performance of LLMs.
arXiv Detail & Related papers (2023-09-28T16:58:37Z)
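For the Graph-CoT entry above, the iterative "reason on the graph" loop might look like the following hypothetical sketch. Here llm and neighbors are stand-in callables, and the EXPAND/ANSWER protocol is an illustrative simplification, not the paper's actual prompt format.

```python
from typing import Callable

def graph_cot(question: str,
              neighbors: Callable[[str], list[str]],
              llm: Callable[[str], str],
              start: str,
              max_steps: int = 5) -> str:
    """Let an LLM traverse a graph step by step before answering.

    neighbors(node) returns descriptions of adjacent nodes; llm(prompt)
    returns the model's reply. Both are placeholders for a real graph
    store and a real chat-completion call.
    """
    context = [f"start node: {start}"]
    for _ in range(max_steps):
        prompt = (
            f"Question: {question}\n"
            "Graph context so far:\n" + "\n".join(context) + "\n"
            "Reply 'EXPAND: <node>' to see a node's neighbors, "
            "or 'ANSWER: <answer>' once you can answer."
        )
        reply = llm(prompt)
        if reply.startswith("ANSWER:"):
            return reply.removeprefix("ANSWER:").strip()
        if reply.startswith("EXPAND:"):
            node = reply.removeprefix("EXPAND:").strip()
            context.append(f"{node} -> {neighbors(node)}")
    # Step budget exhausted: force a final answer.
    return llm(f"Question: {question}\nGraph context:\n"
               + "\n".join(context) + "\nAnswer now.")
```

The point of the pattern is that retrieval is interleaved with reasoning: the model chooses which part of the graph to inspect next instead of receiving a one-shot serialized description of the whole neighborhood.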