Related papers: GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism

GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism

URL: http://arxiv.org/abs/2404.06911v1
Date: Wed, 10 Apr 2024 11:03:57 GMT
Title: GraSAME: Injecting Token-Level Structural Information to Pretrained Language Models via Graph-guided Self-Attention Mechanism
Authors: Shuzhou Yuan, Michael Färber,
Abstract summary: We propose a graph-guided self-attention mechanism, GraSAME, for pretrained language models. GraSAME seamlessly incorporates token-level structural information into PLMs without necessitating additional alignment or concatenation efforts. Our experiments on the graph-to-text generation task demonstrate that GraSAME outperforms baseline models and achieves results comparable to state-of-the-art (SOTA) models on WebNLG datasets.
Score: 10.573861741540853
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Pretrained Language Models (PLMs) benefit from external knowledge stored in graph structures for various downstream tasks. However, bridging the modality gap between graph structures and text remains a significant challenge. Traditional methods like linearizing graphs for PLMs lose vital graph connectivity, whereas Graph Neural Networks (GNNs) require cumbersome processes for integration into PLMs. In this work, we propose a novel graph-guided self-attention mechanism, GraSAME. GraSAME seamlessly incorporates token-level structural information into PLMs without necessitating additional alignment or concatenation efforts. As an end-to-end, lightweight multimodal module, GraSAME follows a multi-task learning strategy and effectively bridges the gap between graph and textual modalities, facilitating dynamic interactions between GNNs and PLMs. Our experiments on the graph-to-text generation task demonstrate that GraSAME outperforms baseline models and achieves results comparable to state-of-the-art (SOTA) models on WebNLG datasets. Furthermore, compared to SOTA models, GraSAME eliminates the need for extra pre-training tasks to adjust graph inputs and reduces the number of trainable parameters by over 100 million.

Related papers

Democratizing Large Language Model-Based Graph Data Augmentation via Latent Knowledge Graphs [22.218522445858344]
Data augmentation is necessary for graph representation learning due to the scarcity and noise present in graph data. We propose a black-box context-driven graph data augmentation approach, with the guidance of LLMs -- DemoGraph. Our approach excels in scenarios involving electronic health records (EHRs), which validates its maximal utilization of contextual knowledge.
arXiv Detail & Related papers (2025-02-19T09:00:32Z)
Graph Learning in the Era of LLMs: A Survey from the Perspective of Data, Models, and Tasks [25.720233631885726]
integration of Graph Neural Networks (GNNs) and Large Language Models (LLMs) has emerged as a promising technological paradigm. We leverage graph description texts with rich semantic context to fundamentally enhance Data quality. This work serves as a foundational reference for researchers and practitioners looking to advance graph learning methodologies.
arXiv Detail & Related papers (2024-12-17T01:41:17Z)
Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision? [62.12375949429938]
Building transferable Graph Neural Networks (GNNs) with CLIP pipeline is challenging because of three fundamental issues. We leverage multi-modal prompt learning to effectively adapt pre-trained GNN to downstream tasks and data. Our new paradigm embeds the graphs directly in the same space as the Large Language Models (LLMs) by learning both graph prompts and text prompts simultaneously.
arXiv Detail & Related papers (2024-12-11T08:03:35Z)
Instance-Aware Graph Prompt Learning [71.26108600288308]
We introduce Instance-Aware Graph Prompt Learning (IA-GPL) in this paper. The process involves generating intermediate prompts for each instance using a lightweight architecture. Experiments conducted on multiple datasets and settings showcase the superior performance of IA-GPL compared to state-of-the-art baselines.
arXiv Detail & Related papers (2024-11-26T18:38:38Z)
Enhance Graph Alignment for Large Language Models [33.96082485852042]
Graph-to-token approaches are popular in enabling Large Language Models to process graph information. Existing methods have a misalignment between self-supervised tasks and supervised downstream tasks. We propose Graph Alignment Large Language Models (GALLM) to benefit from aligned task templates.
arXiv Detail & Related papers (2024-10-15T07:50:34Z)
Dynamic and Textual Graph Generation Via Large-Scale LLM-based Agent Simulation [70.60461609393779]
GraphAgent-Generator (GAG) is a novel simulation-based framework for dynamic graph generation. Our framework effectively replicates seven macro-level structural characteristics in established network science theories. It supports generating graphs with up to nearly 100,000 nodes or 10 million edges, with a minimum speed-up of 90.4%.
arXiv Detail & Related papers (2024-10-13T12:57:08Z)
Joint Embeddings for Graph Instruction Tuning [0.0]
This work explores the integration of the graph modality in Large Language Models (LLMs) for general graph instruction following tasks. It aims at producing a deep learning model that enhances an underlying LLM with graph embeddings and trains it to understand them. The approach performs significantly better than a graph to text approach and remains consistent even for larger graphs.
arXiv Detail & Related papers (2024-05-31T08:26:47Z)
Can Graph Learning Improve Planning in LLM-based Agents? [61.47027387839096]
Task planning in language agents is emerging as an important research topic alongside the development of large language models (LLMs) In this paper, we explore graph learning-based methods for task planning, a direction that is to the prevalent focus on prompt design. Our interest in graph learning stems from a theoretical discovery: the biases of attention and auto-regressive loss impede LLMs' ability to effectively navigate decision-making on graphs.
arXiv Detail & Related papers (2024-05-29T14:26:24Z)
Parameter-Efficient Tuning Large Language Models for Graph Representation Learning [62.26278815157628]
We introduce Graph-aware. Efficient Fine-Tuning - GPEFT, a novel approach for efficient graph representation learning. We use a graph neural network (GNN) to encode structural information from neighboring nodes into a graph prompt. We validate our approach through comprehensive experiments conducted on 8 different text-rich graphs, observing an average improvement of 2% in hit@1 and Mean Reciprocal Rank (MRR) in link prediction evaluations.
arXiv Detail & Related papers (2024-04-28T18:36:59Z)
MuseGraph: Graph-oriented Instruction Tuning of Large Language Models for Generic Graph Mining [41.19687587548107]
Graph Neural Networks (GNNs) need to be re-trained every time when applied to different graph tasks and datasets. We propose a novel framework MuseGraph, which seamlessly integrates the strengths of GNNs and Large Language Models (LLMs) Our experimental results demonstrate significant improvements in different graph tasks.
arXiv Detail & Related papers (2024-03-02T09:27:32Z)
Efficient End-to-end Language Model Fine-tuning on Graphs [21.23522552579571]
Learning from Text-Attributed Graphs (TAGs) has attracted significant attention due to its wide range of real-world applications. We introduce LEADING, a novel and efficient approach for end-to-end fine-tuning of language models on TAGs. Our proposed approach demonstrates superior performance, achieving state-of-the-art (SOTA) results on the ogbn-arxiv leaderboard.
arXiv Detail & Related papers (2023-12-07T22:35:16Z)
GraphGPT: Graph Instruction Tuning for Large Language Models [27.036935149004726]
Graph Neural Networks (GNNs) have evolved to understand graph structures. To enhance robustness, self-supervised learning (SSL) has become a vital tool for data augmentation. Our research tackles this by advancing graph model generalization in zero-shot learning environments.
arXiv Detail & Related papers (2023-10-19T06:17:46Z)
SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning. We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task. We then generate node embeddings using the last hidden states of finetuned LM.
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
Tensor Graph Convolutional Networks for Multi-relational and Robust Learning [74.05478502080658]
This paper introduces a tensor-graph convolutional network (TGCN) for scalable semi-supervised learning (SSL) from data associated with a collection of graphs, that are represented by a tensor. The proposed architecture achieves markedly improved performance relative to standard GCNs, copes with state-of-the-art adversarial attacks, and leads to remarkable SSL performance over protein-to-protein interaction networks.
arXiv Detail & Related papers (2020-03-15T02:33:21Z)

This list is automatically generated from the titles and abstracts of the papers in this site.