Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining
- URL: http://arxiv.org/abs/2403.04780v3
- Date: Sun, 24 Aug 2025 14:30:52 GMT
- Title: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining
- Authors: Yanchao Tan, Hang Lv, Pengxiang Zhan, Shiping Wang, Carl Yang,
- Abstract summary: MuseGraph is a framework for graph mining across tasks and datasets.<n>It seamlessly integrates the strengths of Graph Neural Networks (GNNs) and Large Language Models (LLMs)<n>Our experimental results demonstrate significant improvements in five graph tasks and ten datasets.
- Score: 36.366468326267004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graphs with abundant attributes are essential in modeling interconnected entities and enhancing predictions across various real-world applications. Traditional Graph Neural Networks (GNNs) often require re-training for different graph tasks and datasets. Although the emergence of Large Language Models (LLMs) has introduced new paradigms in natural language processing, their potential for generic graph mining, training a single model to simultaneously handle diverse tasks and datasets, remains under-explored. To this end, our novel framework MuseGraph, seamlessly integrates the strengths of GNNs and LLMs into one foundation model for graph mining across tasks and datasets. This framework first features a compact graph description to encapsulate key graph information within language token limitations. Then, we propose a diverse instruction generation mechanism with Chain-of-Thought (CoT)-based instruction packages to distill the reasoning capabilities from advanced LLMs like GPT-4. Finally, we design a graph-aware instruction tuning strategy to facilitate mutual enhancement across multiple tasks and datasets while preventing catastrophic forgetting of LLMs' generative abilities. Our experimental results demonstrate significant improvements in five graph tasks and ten datasets, showcasing the potential of our MuseGraph in enhancing the accuracy of graph-oriented downstream tasks while improving the generation abilities of LLMs.
Related papers
- GILT: An LLM-Free, Tuning-Free Graph Foundational Model for In-Context Learning [50.40400074353263]
Graph Neural Networks (GNNs) are powerful tools for precessing relational data but often struggle to generalize to unseen graphs.<n>We introduce textbfGraph textbfIn-context textbfL textbfTransformer (GILT), a framework built on an LLM-free and tuning-free architecture.
arXiv Detail & Related papers (2025-10-06T08:09:15Z) - G1: Teaching LLMs to Reason on Graphs with Reinforcement Learning [58.73279333365234]
Reinforcement Learning (RL) on synthetic graph-theoretic tasks can significantly scale graph reasoning abilities.<n>With RL on Erdos, G1 obtains substantial improvements in graph reasoning, where our finetuned 3B model even outperforms Qwen2.5-72B-Instruct (24x size)<n>Our findings offer an efficient, scalable path for building strong graph reasoners by finetuning LLMs with RL on graph-theoretic tasks.
arXiv Detail & Related papers (2025-05-24T04:33:41Z) - Revisiting Graph Neural Networks on Graph-level Tasks: Comprehensive Experiments, Analysis, and Improvements [54.006506479865344]
We propose a unified evaluation framework for graph-level Graph Neural Networks (GNNs)
This framework provides a standardized setting to evaluate GNNs across diverse datasets.
We also propose a novel GNN model with enhanced expressivity and generalization capabilities.
arXiv Detail & Related papers (2025-01-01T08:48:53Z) - Towards Graph Foundation Models: Learning Generalities Across Graphs via Task-Trees [50.78679002846741]
We introduce a novel approach for learning cross-task generalities in graphs.
We propose task-trees as basic learning instances to align task spaces on graphs.
Our findings indicate that when a graph neural network is pretrained on diverse task-trees, it acquires transferable knowledge.
arXiv Detail & Related papers (2024-12-21T02:07:43Z) - Graph Learning in the Era of LLMs: A Survey from the Perspective of Data, Models, and Tasks [25.720233631885726]
integration of Graph Neural Networks (GNNs) and Large Language Models (LLMs) has emerged as a promising technological paradigm.<n>We leverage graph description texts with rich semantic context to fundamentally enhance Data quality.<n>This work serves as a foundational reference for researchers and practitioners looking to advance graph learning methodologies.
arXiv Detail & Related papers (2024-12-17T01:41:17Z) - A Hierarchical Language Model For Interpretable Graph Reasoning [47.460255447561906]
We introduce Hierarchical Language Model for Graph (HLM-G), which employs a two-block architecture to capture node-centric local information and interaction-centric global structure.
The proposed scheme allows LLMs to address various graph queries with high efficacy, efficiency, and robustness, while reducing computational costs on large-scale graph tasks.
Comprehensive evaluations across diverse graph reasoning and real-world tasks of node, link, and graph-levels highlight the superiority of our method.
arXiv Detail & Related papers (2024-10-29T00:28:02Z) - Dynamic and Textual Graph Generation Via Large-Scale LLM-based Agent Simulation [70.60461609393779]
GraphAgent-Generator (GAG) is a novel simulation-based framework for dynamic graph generation.
Our framework effectively replicates seven macro-level structural characteristics in established network science theories.
It supports generating graphs with up to nearly 100,000 nodes or 10 million edges, with a minimum speed-up of 90.4%.
arXiv Detail & Related papers (2024-10-13T12:57:08Z) - How Do Large Language Models Understand Graph Patterns? A Benchmark for Graph Pattern Comprehension [53.6373473053431]
This work introduces a benchmark to assess large language models' capabilities in graph pattern tasks.<n>We have developed a benchmark that evaluates whether LLMs can understand graph patterns based on either terminological or topological descriptions.<n>Our benchmark encompasses both synthetic and real datasets, and a variety of models, with a total of 11 tasks and 7 models.
arXiv Detail & Related papers (2024-10-04T04:48:33Z) - A Survey of Large Language Models for Graphs [21.54279919476072]
We conduct an in-depth review of the latest state-of-the-art Large Language Models applied in graph learning.
We introduce a novel taxonomy to categorize existing methods based on their framework design.
We explore the strengths and limitations of each framework, and emphasize potential avenues for future research.
arXiv Detail & Related papers (2024-05-10T18:05:37Z) - Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware.
Efficient Fine-Tuning - GPEFT, a novel approach for efficient graph representation learning.
We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt.
We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations.
arXiv Detail & Related papers (2024-04-28T18:36:59Z) - GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism [10.573861741540853]
We propose a graph-guided self-attention mechanism, GraSAME, for pretrained language models.
GraSAME seamlessly incorporates token-level structural information into PLMs without necessitating additional alignment or concatenation efforts.
Our experiments on the graph-to-text generation task demonstrate that GraSAME outperforms baseline models and achieves results comparable to state-of-the-art (SOTA) models on WebNLG datasets.
arXiv Detail & Related papers (2024-04-10T11:03:57Z) - UniGraph: Learning a Unified Cross-Domain Foundation Model for Text-Attributed Graphs [30.635472655668078]
Text-Attributed Graphs (TAGs) can generalize to unseen graphs and tasks across diverse domains.
We propose a novel cascaded architecture of Language Models (LMs) and Graph Neural Networks (GNNs) as backbone networks.
We demonstrate the model's effectiveness in self-supervised representation learning on unseen graphs, few-shot in-context transfer, and zero-shot transfer.
arXiv Detail & Related papers (2024-02-21T09:06:31Z) - InstructGraph: Boosting Large Language Models via Graph-centric
Instruction Tuning and Preference Alignment [30.136514352238795]
InstructGraph is a framework that empowers large language models with the abilities of graph reasoning and generation.
We show that InstructGraph can achieve the best performance and outperform GPT-4 and LLaMA2 by more than 13% and 38%, respectively.
arXiv Detail & Related papers (2024-02-13T20:47:17Z) - LLaGA: Large Language and Graph Assistant [73.71990472543027]
Large Language and Graph Assistant (LLaGA) is an innovative model to handle the complexities of graph-structured data.
LLaGA excels in versatility, generalizability and interpretability, allowing it to perform consistently well across different datasets and tasks.
Our experiments show that LLaGA delivers outstanding performance across four datasets and three tasks using one single model.
arXiv Detail & Related papers (2024-02-13T02:03:26Z) - GraphGPT: Graph Instruction Tuning for Large Language Models [27.036935149004726]
Graph Neural Networks (GNNs) have evolved to understand graph structures.
To enhance robustness, self-supervised learning (SSL) has become a vital tool for data augmentation.
Our research tackles this by advancing graph model generalization in zero-shot learning environments.
arXiv Detail & Related papers (2023-10-19T06:17:46Z) - SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of finetuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - GraphGLOW: Universal and Generalizable Structure Learning for Graph
Neural Networks [72.01829954658889]
This paper introduces the mathematical definition of this novel problem setting.
We devise a general framework that coordinates a single graph-shared structure learner and multiple graph-specific GNNs.
The well-trained structure learner can directly produce adaptive structures for unseen target graphs without any fine-tuning.
arXiv Detail & Related papers (2023-06-20T03:33:22Z) - GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training [62.73470368851127]
Graph representation learning has emerged as a powerful technique for addressing real-world problems.
We design Graph Contrastive Coding -- a self-supervised graph neural network pre-training framework.
We conduct experiments on three graph learning tasks and ten graph datasets.
arXiv Detail & Related papers (2020-06-17T16:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.