Deep Prompt Tuning for Graph Transformers
- URL: http://arxiv.org/abs/2309.10131v1
- Date: Mon, 18 Sep 2023 20:12:17 GMT
- Title: Deep Prompt Tuning for Graph Transformers
- Authors: Reza Shirkavand, Heng Huang
- Abstract summary: Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
- Score: 55.2480439325792
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph transformers have gained popularity in various graph-based tasks by
addressing challenges faced by traditional Graph Neural Networks. However, the
quadratic complexity of self-attention operations and the extensive layering in
graph transformer architectures present challenges when applying them to
graph-based prediction tasks. Fine-tuning, a common approach, is resource-intensive
and requires storing multiple copies of large models. We propose a novel
approach called deep graph prompt tuning as an alternative to fine-tuning for
leveraging large graph transformer models in downstream graph-based prediction
tasks. Our method introduces trainable feature nodes to the graph and prepends
task-specific tokens to the graph transformer, enhancing the model's expressive
power. By freezing the pre-trained parameters and only updating the added
tokens, our approach reduces the number of free parameters and eliminates the
need for multiple model copies, making it suitable for small datasets and
scalable to large graphs. Through extensive experiments on datasets of various
sizes, we demonstrate that deep graph prompt tuning achieves comparable or
even superior performance to fine-tuning, despite utilizing significantly fewer
task-specific parameters. Our contributions include the introduction of prompt
tuning for graph transformers, its application to both graph transformers and
message passing graph neural networks, improved efficiency and resource
utilization, and compelling experimental results. This work brings attention to
a promising approach to leveraging pre-trained models in graph-based prediction
tasks and offers new opportunities for exploring and advancing graph
representation learning.
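To make the mechanism concrete, here is a minimal PyTorch sketch of the general prompt-tuning idea described in the abstract: trainable prompt tokens are prepended to the node-token sequence of a frozen pre-trained transformer encoder, and only the prompts plus a small task head are updated. The module names, dimensions, and the use of nn.TransformerEncoder as a stand-in backbone are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PromptTunedGraphTransformer(nn.Module):
    """Illustrative sketch: prepend trainable prompt tokens to the node
    sequence of a frozen (pre-trained) transformer encoder and train only
    the prompts plus a lightweight prediction head."""

    def __init__(self, backbone: nn.TransformerEncoder, d_model: int,
                 num_prompts: int = 10, num_classes: int = 2):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():      # freeze pre-trained weights
            p.requires_grad_(False)
        # task-specific trainable tokens (the only "prompt" parameters here)
        self.prompts = nn.Parameter(torch.randn(num_prompts, d_model) * 0.02)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, node_tokens: torch.Tensor) -> torch.Tensor:
        # node_tokens: (batch, num_nodes, d_model) node features already
        # projected into the transformer's embedding space
        b = node_tokens.size(0)
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)
        x = torch.cat([prompts, node_tokens], dim=1)    # prepend prompt tokens
        h = self.backbone(x)                            # frozen encoder
        graph_repr = h[:, : self.prompts.size(0)].mean(dim=1)  # pool prompt positions
        return self.head(graph_repr)

d_model = 64
layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=4)   # stand-in "pre-trained" model
model = PromptTunedGraphTransformer(backbone, d_model)
trainable = [p for p in model.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "trainable parameters")
```

Because the backbone is frozen, only the prompt tokens and the head need to be stored per downstream task, which is the source of the parameter and storage savings claimed above.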
Related papers
- Generalizing Graph Transformers Across Diverse Graphs and Tasks via Pre-Training on Industrial-Scale Data [34.21420029237621] (arXiv: 2024-07-04)
We introduce a scalable transformer-based graph pre-training framework called PGT (Pre-trained Graph Transformer).
Our framework achieves state-of-the-art performance on both industrial datasets and public datasets.
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308] (arXiv: 2023-08-03)
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of the fine-tuned LM.
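As a rough illustration of the second step only: each node's text is embedded with the (already fine-tuned) LM and the pooled last hidden states become node features for a downstream GNN. The model name and mean pooling are illustrative assumptions, and the actual pipeline first applies PEFT (e.g., LoRA) to the LM on the downstream task.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative model choice; any fine-tuned encoder LM could stand in here.
name = "sentence-transformers/all-MiniLM-L6-v2"
tok = AutoTokenizer.from_pretrained(name)
lm = AutoModel.from_pretrained(name).eval()

node_texts = ["paper about graph transformers", "paper about prompt tuning"]

with torch.no_grad():
    batch = tok(node_texts, padding=True, truncation=True, return_tensors="pt")
    hidden = lm(**batch).last_hidden_state           # (num_nodes, seq_len, d)
    mask = batch["attention_mask"].unsqueeze(-1)     # ignore padding tokens
    node_emb = (hidden * mask).sum(1) / mask.sum(1)  # mean-pooled node embeddings

print(node_emb.shape)  # these vectors would then be fed to a downstream GNN
```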
- SGFormer: Simplifying and Empowering Transformers for Large-Graph Representations [75.71298846760303] (arXiv: 2023-06-19)
We show that one-layer attention can deliver surprisingly competitive performance across node property prediction benchmarks.
We frame the proposed scheme as Simplified Graph Transformers (SGFormer), which is empowered by a simple attention model.
We believe the proposed methodology alone opens a new technical path of independent interest for building Transformers on large graphs.
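A minimal sketch of the "one global attention layer plus simple neighbor propagation" shape, using standard multi-head attention as a stand-in for SGFormer's simplified linear-complexity attention; the class and dimension names are illustrative.

```python
import torch
import torch.nn as nn

class OneLayerGraphAttention(nn.Module):
    """Sketch: a single global attention pass over all nodes combined with one
    step of neighbor averaging, followed by a node-level prediction head."""

    def __init__(self, in_dim: int, hidden: int, num_classes: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, hidden)
        self.attn = nn.MultiheadAttention(hidden, num_heads=1, batch_first=True)
        self.out = nn.Linear(hidden, num_classes)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, in_dim) node features; adj: (num_nodes, num_nodes) normalized adjacency
        h = self.proj(x).unsqueeze(0)          # treat the node set as one sequence
        global_h, _ = self.attn(h, h, h)       # one global attention pass
        local_h = adj @ h.squeeze(0)           # one step of neighbor averaging
        return self.out(global_h.squeeze(0) + local_h)

x = torch.randn(5, 16)                         # 5 nodes, 16-dim features
adj = torch.eye(5)                             # placeholder normalized adjacency
logits = OneLayerGraphAttention(16, 32, 3)(x, adj)
print(logits.shape)                            # (5, 3) per-node class logits
```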
- G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning for Graph Transformer Networks [0.7118812771905295] (arXiv: 2023-05-17)
We show that directly transferring existing PEFT methods to graph-based tasks is sub-optimal due to feature distribution shift.
We propose a novel structure-aware PEFT approach, named G-Adapter, which introduces graph structure as an inductive bias to guide the updating process.
Extensive experiments demonstrate that G-Adapter achieves state-of-the-art performance compared with its counterparts on nine graph benchmark datasets.
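A hedged sketch of what a structure-aware adapter can look like: a bottleneck adapter whose hidden states take one propagation step over the graph, trained while the backbone stays frozen. This is a generic illustration of the structure-aware PEFT idea, not G-Adapter's exact architecture.

```python
import torch
import torch.nn as nn

class StructureAwareAdapter(nn.Module):
    """Bottleneck adapter with one graph-propagation step in the bottleneck.
    Only the adapter's own parameters would be trained; the surrounding
    transformer layers stay frozen."""

    def __init__(self, d_model: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor, norm_adj: torch.Tensor) -> torch.Tensor:
        # h: (num_nodes, d_model) hidden states from a frozen transformer layer
        # norm_adj: (num_nodes, num_nodes) normalized adjacency of the input graph
        z = self.act(self.down(h))
        z = norm_adj @ z          # inject graph structure into the bottleneck
        return h + self.up(z)     # residual: frozen features + small update

h = torch.randn(6, 64)
norm_adj = torch.eye(6)           # placeholder adjacency
adapted = StructureAwareAdapter(64)(h, norm_adj)
print(adapted.shape)              # (6, 64)
```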
- GraphPrompt: Unifying Pre-Training and Downstream Tasks for Graph Neural Networks [16.455234748896157] (arXiv: 2023-02-16)
GraphPrompt is a novel pre-training and prompting framework on graphs.
It unifies pre-training and downstream tasks into a common task template.
It also employs a learnable prompt to assist a downstream task in locating the most relevant knowledge from the pre-trained model.
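A rough rendering of the learnable-prompt idea, assuming the prompt acts as a task-specific reweighting of frozen node embeddings at readout time; the dimensions and sum pooling are illustrative choices, not GraphPrompt's exact formulation.

```python
import torch
import torch.nn as nn

class PromptedReadout(nn.Module):
    """A task-specific learnable vector reweights frozen node embeddings
    before pooling, so each downstream task can emphasize different feature
    dimensions while the pre-trained GNN stays untouched."""

    def __init__(self, d_model: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.ones(d_model))   # the only new parameters

    def forward(self, node_emb: torch.Tensor) -> torch.Tensor:
        # node_emb: (num_nodes, d_model) embeddings from a frozen pre-trained GNN
        return (self.prompt * node_emb).sum(dim=0)        # prompted (sub)graph readout

node_emb = torch.randn(7, 32)          # frozen embeddings for a 7-node subgraph
graph_vec = PromptedReadout(32)(node_emb)
print(graph_vec.shape)                 # (32,) task-conditioned graph representation
```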
- Hierarchical Graph Transformer with Adaptive Node Sampling [19.45896788055167] (arXiv: 2022-10-08)
We identify the main deficiencies of current graph transformers.
Most sampling strategies focus only on local neighbors and neglect long-range dependencies in the graph.
We propose a hierarchical attention scheme with graph coarsening to capture the long-range interactions.
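A toy sketch of the coarsening intuition: nodes attend to a few cluster-level "super-nodes", so long-range information becomes reachable in one attention hop. The fixed cluster assignment and plain multi-head attention are illustrative assumptions, not the paper's adaptive sampling and hierarchical scheme.

```python
import torch
import torch.nn as nn

num_nodes, d, num_clusters = 8, 32, 2
x = torch.randn(num_nodes, d)                        # node features
assign = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1])      # toy cluster assignment

# coarse (super-)node = mean of its member nodes
coarse = torch.stack([x[assign == c].mean(dim=0) for c in range(num_clusters)])

attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
queries = x.unsqueeze(0)                             # every node queries...
keys = torch.cat([x, coarse]).unsqueeze(0)           # ...all nodes plus super-nodes
out, _ = attn(queries, keys, keys)
print(out.shape)                                     # (1, 8, 32) long-range-aware features
```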
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749] (arXiv: 2022-03-03)
Graph neural networks (GNNs) have shown a powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
- Dynamic Graph Representation Learning via Graph Transformer Networks [41.570839291138114] (arXiv: 2021-11-19)
We propose a Transformer-based dynamic graph learning method named Dynamic Graph Transformer (DGT).
DGT uses spatial-temporal encoding to effectively learn graph topology and capture implicit links.
We show that DGT achieves superior performance compared with several state-of-the-art baselines.
- Promoting Graph Awareness in Linearized Graph-to-Text Generation [72.83863719868364] (arXiv: 2020-12-31)
We study the ability of linearized models to encode local graph structures.
Our findings motivate graph-denoising scaffolds that enrich the quality of models' implicit graph encodings.
We find that these denoising scaffolds lead to substantial improvements in downstream generation in low-resource settings.
- Graph Ordering: Towards the Optimal by Learning [69.72656588714155] (arXiv: 2020-01-18)
Graph representation learning has achieved remarkable success in many graph-based applications, such as node classification, prediction, and community detection.
However, for some kinds of graph applications, such as graph compression and edge partition, it is very hard to reduce them to graph representation learning tasks.
In this paper, we propose to attack the graph ordering problem behind such applications with a novel learning approach.
This list is automatically generated from the titles and abstracts of the papers on this site.