GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism
- URL: http://arxiv.org/abs/2404.06911v1
- Date: Wed, 10 Apr 2024 11:03:57 GMT
- Title: GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism
- Authors: Shuzhou Yuan, Michael Färber,
- Abstract summary: We propose a graph-guided self-attention mechanism, GraSAME, for pretrained language models.
GraSAME seamlessly incorporates token-level structural information into PLMs without necessitating additional alignment or concatenation efforts.
Our experiments on the graph-to-text generation task demonstrate that GraSAME outperforms baseline models and achieves results comparable to state-of-the-art (SOTA) models on WebNLG datasets.
- Score: 10.573861741540853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pretrained Language Models (PLMs) benefit from external knowledge stored in graph structures for various downstream tasks. However, bridging the modality gap between graph structures and text remains a significant challenge. Traditional methods like linearizing graphs for PLMs lose vital graph connectivity, whereas Graph Neural Networks (GNNs) require cumbersome processes for integration into PLMs. In this work, we propose a novel graph-guided self-attention mechanism, GraSAME. GraSAME seamlessly incorporates token-level structural information into PLMs without necessitating additional alignment or concatenation efforts. As an end-to-end, lightweight multimodal module, GraSAME follows a multi-task learning strategy and effectively bridges the gap between graph and textual modalities, facilitating dynamic interactions between GNNs and PLMs. Our experiments on the graph-to-text generation task demonstrate that GraSAME outperforms baseline models and achieves results comparable to state-of-the-art (SOTA) models on WebNLG datasets. Furthermore, compared to SOTA models, GraSAME eliminates the need for extra pre-training tasks to adjust graph inputs and reduces the number of trainable parameters by over 100 million.
Related papers
- Enhance Graph Alignment for Large Language Models [33.96082485852042]
Graph-to-token approaches are popular in enabling Large Language Models to process graph information.
Existing methods have a misalignment between self-supervised tasks and supervised downstream tasks.
We propose Graph Alignment Large Language Models (GALLM) to benefit from aligned task templates.
arXiv Detail & Related papers (2024-10-15T07:50:34Z) - Dynamic and Textual Graph Generation Via Large-Scale LLM-based Agent Simulation [70.60461609393779]
GraphAgent-Generator (GAG) is a novel simulation-based framework for dynamic graph generation.
Our framework effectively replicates seven macro-level structural characteristics in established network science theories.
It supports generating graphs with up to nearly 100,000 nodes or 10 million edges, with a minimum speed-up of 90.4%.
arXiv Detail & Related papers (2024-10-13T12:57:08Z) - Joint Embeddings for Graph Instruction Tuning [0.0]
This work explores the integration of the graph modality in Large Language Models (LLMs) for general graph instruction following tasks.
It aims at producing a deep learning model that enhances an underlying LLM with graph embeddings and trains it to understand them.
The approach performs significantly better than a graph to text approach and remains consistent even for larger graphs.
arXiv Detail & Related papers (2024-05-31T08:26:47Z) - Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs)
In this paper, we explore graph learning-based methods for task planning, a direction that is to the prevalent focus on prompt design.
Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z) - Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware.
Efficient Fine-Tuning - GPEFT, a novel approach for efficient graph representation learning.
We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt.
We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations.
arXiv Detail & Related papers (2024-04-28T18:36:59Z) - MuseGraph: Graph-oriented Instruction Tuning of Large Language Models
for Generic Graph Mining [41.19687587548107]
Graph Neural Networks (GNNs) need to be re-trained every time when applied to different graph tasks and datasets.
We propose a novel framework MuseGraph, which seamlessly integrates the strengths of GNNs and Large Language Models (LLMs)
Our experimental results demonstrate significant improvements in different graph tasks.
arXiv Detail & Related papers (2024-03-02T09:27:32Z) - Efficient End-to-end Language Model Fine-tuning on Graphs [21.23522552579571]
Learning from Text-Attributed Graphs (TAGs) has attracted significant attention due to its wide range of real-world applications.
We introduce LEADING, a novel and efficient approach for end-to-end fine-tuning of language models on TAGs.
Our proposed approach demonstrates superior performance, achieving state-of-the-art (SOTA) results on the ogbn-arxiv leaderboard.
arXiv Detail & Related papers (2023-12-07T22:35:16Z) - GraphGPT: Graph Instruction Tuning for Large Language Models [27.036935149004726]
Graph Neural Networks (GNNs) have evolved to understand graph structures.
To enhance robustness, self-supervised learning (SSL) has become a vital tool for data augmentation.
Our research tackles this by advancing graph model generalization in zero-shot learning environments.
arXiv Detail & Related papers (2023-10-19T06:17:46Z) - Integrating Graphs with Large Language Models: Methods and Prospects [68.37584693537555]
Large language models (LLMs) have emerged as frontrunners, showcasing unparalleled prowess in diverse applications.
Merging the capabilities of LLMs with graph-structured data has been a topic of keen interest.
This paper bifurcates such integrations into two predominant categories.
arXiv Detail & Related papers (2023-10-09T07:59:34Z) - SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of finetuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z) - Tensor Graph Convolutional Networks for Multi-relational and Robust
Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs, that are represented by a tensor.
The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.