G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning
for Graph Transformer Networks
- URL: http://arxiv.org/abs/2305.10329v1
- Date: Wed, 17 May 2023 16:10:36 GMT
- Title: G-Adapter: Towards Structure-Aware Parameter-Efficient Transfer Learning
for Graph Transformer Networks
- Authors: Anchun Gui, Jinqiang Ye and Han Xiao
- Abstract summary: We show that it is sub-optimal to directly transfer existing PEFTs to graph-based tasks due to the issue of feature distribution shift.
We propose a novel structure-aware PEFT approach, named G-Adapter, to guide the updating process.
Extensive experiments demonstrate that G-Adapter achieves state-of-the-art performance compared to its counterparts on nine graph benchmark datasets.
- Score: 0.7118812771905295
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It has become a popular paradigm to transfer the knowledge of large-scale
pre-trained models to various downstream tasks by fine-tuning all model parameters.
However, with the growth of model scale and the rising number of downstream tasks,
this paradigm inevitably runs into challenges in terms of computational cost and
memory footprint. Recently, Parameter-Efficient Fine-Tuning (PEFT) methods (e.g.,
Adapter, LoRA, BitFit) have emerged as a promising paradigm to alleviate these
concerns by updating only a small portion of the parameters. Although these PEFT
methods have demonstrated satisfactory performance in natural language processing,
it remains under-explored whether they can be transferred to graph-based tasks with
Graph Transformer Networks (GTNs). Therefore, in this paper, we fill this gap by
providing extensive benchmarks of traditional PEFT methods on a range of graph-based
downstream tasks. Our empirical study shows that directly transferring existing PEFT
methods to graph-based tasks is suboptimal due to the issue of feature distribution
shift. To address this issue, we propose a novel structure-aware PEFT approach,
named G-Adapter, which leverages a graph convolution operation to introduce graph
structure (e.g., the graph adjacency matrix) as an inductive bias to guide the
updating process. In addition, we propose Bregman proximal point optimization to
further alleviate feature distribution shift by preventing the model from making
aggressive updates. Extensive experiments demonstrate that G-Adapter achieves
state-of-the-art performance compared to its counterparts on nine graph benchmark
datasets based on two pre-trained GTNs, and delivers substantial memory-footprint
savings compared to the conventional full fine-tuning paradigm.
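
As a rough, unofficial sketch of the idea described in the abstract (not the authors' released code), the module below down-projects frozen GTN features, mixes the bottleneck states with a normalized adjacency matrix via a graph-convolution step, and adds the result back residually. The class and function names, the bottleneck size, and the particular form of the Bregman proximal term are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GAdapter(nn.Module):
    """Bottleneck adapter whose hidden states are propagated over the graph,
    so the adjacency matrix acts as an inductive bias (names/shapes assumed)."""

    def __init__(self, hidden_dim: int, bottleneck_dim: int = 32):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)  # trainable
        self.up = nn.Linear(bottleneck_dim, hidden_dim)    # trainable
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor, adj_norm: torch.Tensor) -> torch.Tensor:
        # h: [num_nodes, hidden_dim] features from the frozen GTN layer
        # adj_norm: [num_nodes, num_nodes] normalized adjacency matrix
        z = self.act(self.down(h))
        z = adj_norm @ z              # graph convolution: inject structure
        return h + self.up(z)         # residual keeps the frozen features

def normalize_adj(adj: torch.Tensor) -> torch.Tensor:
    """Symmetric GCN-style normalization D^{-1/2} (A + I) D^{-1/2}."""
    a_hat = adj + torch.eye(adj.size(0))
    d_inv_sqrt = a_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a_hat * d_inv_sqrt.unsqueeze(0)

def bregman_proximal_penalty(logits_new: torch.Tensor,
                             logits_prev: torch.Tensor) -> torch.Tensor:
    """One plausible instantiation of the proximal term (an assumption): a
    symmetric KL between current and previous-iterate predictions, which
    discourages aggressive updates."""
    p = F.log_softmax(logits_new, dim=-1)
    q = F.log_softmax(logits_prev.detach(), dim=-1)
    return 0.5 * (F.kl_div(p, q, log_target=True, reduction="batchmean")
                  + F.kl_div(q, p, log_target=True, reduction="batchmean"))
```

In use, only the adapter (and a task head) would receive gradients while the pre-trained GTN stays frozen; the penalty would be added to the task loss with a small weight.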
Related papers
- A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
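
A minimal sketch of sampling node contexts via random walks, as the GSPT summary describes; the function names, walk length, and number of walks per node are illustrative assumptions, not the paper's exact settings.

```python
import random
from typing import Dict, List

def random_walk(adj_list: Dict[int, List[int]], start: int, length: int) -> List[int]:
    """Sample a single random walk of up to `length` steps starting at `start`."""
    walk = [start]
    for _ in range(length):
        neighbors = adj_list.get(walk[-1], [])
        if not neighbors:            # dead end: stop early
            break
        walk.append(random.choice(neighbors))
    return walk

def sample_node_contexts(adj_list, nodes, walks_per_node=4, walk_length=8):
    """Each node's context is the set of walks rooted at it, which can then be
    fed to a Transformer as token sequences."""
    return {
        v: [random_walk(adj_list, v, walk_length) for _ in range(walks_per_node)]
        for v in nodes
    }

# Toy usage: a 4-node path graph 0-1-2-3
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
contexts = sample_node_contexts(adj, nodes=[0, 1, 2, 3])
```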
arXiv Detail & Related papers (2024-06-19T22:30:08Z)
- Endowing Pre-trained Graph Models with Provable Fairness [49.8431177748876]
We propose GraphPAR, a novel adapter-tuning framework that endows pre-trained graph models with provable fairness.
Specifically, we design a sensitive semantic augmenter that extends each node's representation with different sensitive-attribute semantics.
With GraphPAR, we quantify whether the fairness of each node is provable, i.e., predictions are always fair within a certain range of sensitive attribute semantics.
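
As a rough illustration of the augmenter idea (not GraphPAR's certification procedure), the sketch below shifts node embeddings along an assumed sensitive-attribute direction and checks, only empirically on sampled shifts, whether a classifier's predictions stay constant; all names are hypothetical.

```python
import torch

def augment_with_sensitive_semantics(h, sensitive_dir, epsilons):
    """Return copies of node embeddings shifted along a sensitive-attribute
    direction. h: [N, d]; sensitive_dir: [d]; epsilons: iterable of floats."""
    d = sensitive_dir / sensitive_dir.norm()
    return torch.stack([h + eps * d for eps in epsilons])     # [E, N, d]

def empirically_consistent(classifier, h, sensitive_dir, eps_range=(-1.0, 1.0), steps=11):
    """Check (by sampling, not by a formal certificate) whether predictions are
    unchanged across the sampled range of sensitive-attribute shifts."""
    eps = torch.linspace(*eps_range, steps)
    aug = augment_with_sensitive_semantics(h, sensitive_dir, eps)  # [E, N, d]
    preds = classifier(aug.reshape(-1, h.size(1))).argmax(dim=-1).reshape(steps, -1)
    return (preds == preds[0]).all(dim=0)   # [N] bool: prediction stable per node
```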
arXiv Detail & Related papers (2024-02-19T14:16:08Z)
- Sparse is Enough in Fine-tuning Pre-trained Large Language Models [98.46493578509039]
We propose a gradient-based sparse fine-tuning algorithm, named Sparse Increment Fine-Tuning (SIFT).
We validate its effectiveness on a range of tasks including the GLUE Benchmark and Instruction-tuning.
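
A generic sketch of gradient-based sparse fine-tuning: only the parameter entries with the largest gradient magnitudes on a calibration batch are allowed to change. The mask-selection rule and keep ratio are assumptions and not necessarily SIFT's exact procedure.

```python
import torch

def build_sparse_masks(model, loss_fn, batch, keep_ratio=0.01):
    """Select, per parameter tensor, the top `keep_ratio` entries by gradient
    magnitude on a calibration batch; only these entries will be updated."""
    model.zero_grad()
    loss_fn(model, batch).backward()
    masks = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        k = max(1, int(keep_ratio * p.numel()))
        thresh = p.grad.abs().flatten().topk(k).values.min()
        masks[name] = (p.grad.abs() >= thresh).float()
    model.zero_grad()
    return masks

def apply_masks_to_grads(model, masks):
    """Call after loss.backward() and before optimizer.step() so that only the
    selected sparse increments are applied."""
    for name, p in model.named_parameters():
        if name in masks and p.grad is not None:
            p.grad.mul_(masks[name])
```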
arXiv Detail & Related papers (2023-12-19T06:06:30Z)
- p-Laplacian Adaptation for Generative Pre-trained Vision-Language Models [10.713680139939354]
Vision-Language models (VLMs) pre-trained on large corpora have demonstrated notable success across a range of downstream tasks.
Parameter-efficient transfer learning (PETL) has garnered attention as a viable alternative to full fine-tuning.
We propose a new adapter architecture, $p$-adapter, which employs $p$-Laplacian message passing in Graph Neural Networks (GNNs).
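
A rough dense-adjacency sketch of $p$-Laplacian-style message passing, in which edge weights depend on the feature difference raised to the power $p-2$; this is a generic formulation of the operator, not necessarily the exact $p$-adapter design.

```python
import torch

def p_laplacian_propagate(h, adj, p=1.5, eps=1e-6):
    """One step of p-Laplacian-style propagation on a dense adjacency.

    Edge (i, j) is reweighted by ||h_i - h_j||^(p-2), so small feature
    differences are emphasized when p < 2 (edge-preserving smoothing).
    h: [N, d], adj: [N, N] with 0/1 entries.
    """
    diff = h.unsqueeze(1) - h.unsqueeze(0)                 # [N, N, d]
    grad_norm = diff.norm(dim=-1).clamp_min(eps)           # [N, N]
    w = adj * grad_norm.pow(p - 2)                         # reweighted edges
    w = w / w.sum(dim=1, keepdim=True).clamp_min(eps)      # row-normalize
    return w @ h                                           # aggregate neighbors
```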
arXiv Detail & Related papers (2023-12-17T05:30:35Z)
- HetGPT: Harnessing the Power of Prompt Tuning in Pre-Trained Heterogeneous Graph Neural Networks [24.435068514392487]
HetGPT is a post-training prompting framework for graph neural networks.
It improves the performance of state-of-the-art HGNNs on semi-supervised node classification.
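
The summary above does not detail HetGPT's mechanism; as a generic illustration of graph prompt tuning (explicitly not HetGPT's specific prompt design), one common formulation adds a learnable prompt vector to node features while the pre-trained model stays frozen.

```python
import torch
import torch.nn as nn

class FeaturePrompt(nn.Module):
    """Generic graph prompt: a learnable vector added to every node feature.
    (Illustrative only; HetGPT's actual prompt design is more elaborate.)"""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.zeros(feat_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: [N, feat_dim]
        return x + self.prompt

# Usage sketch: freeze the pre-trained GNN, train only the prompt (and a head).
# for p in pretrained_gnn.parameters():
#     p.requires_grad_(False)
# logits = head(pretrained_gnn(prompt(x), adj))
```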
arXiv Detail & Related papers (2023-10-23T19:35:57Z)
- Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
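
A minimal sketch of the freeze-backbone, train-added-tokens setup described above: learnable prompt tokens are prepended to the node sequence of a graph transformer whose pre-trained weights are frozen. Class and parameter names are assumptions.

```python
import torch
import torch.nn as nn

class PromptedGraphTransformer(nn.Module):
    """Prepend learnable prompt tokens to the node sequence and freeze the
    pre-trained graph transformer, so only the added tokens (and a task head)
    receive gradients."""

    def __init__(self, backbone: nn.Module, hidden_dim: int, num_prompts: int = 8):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():   # freeze pre-trained weights
            p.requires_grad_(False)
        self.prompts = nn.Parameter(torch.randn(num_prompts, hidden_dim) * 0.02)

    def forward(self, node_tokens: torch.Tensor) -> torch.Tensor:
        # node_tokens: [batch, num_nodes, hidden_dim]
        b = node_tokens.size(0)
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)
        return self.backbone(torch.cat([prompts, node_tokens], dim=1))
```

Because the backbone is shared and frozen, only the small prompt tensor needs to be stored per task, which is what removes the need for multiple model copies.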
arXiv Detail & Related papers (2023-09-18T20:12:17Z)
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of the fine-tuned LM.
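
A sketch of the second step only (node embeddings from the last hidden states), using the Hugging Face `transformers` API with masked mean pooling; the checkpoint name is a placeholder, and in SimTeG it would point at the PEFT-fine-tuned LM from the first step.

```python
import torch
from transformers import AutoModel, AutoTokenizer

@torch.no_grad()
def encode_node_texts(texts, model_name="bert-base-uncased", batch_size=32):
    """Mean-pool the last hidden states of a (fine-tuned) LM to get one
    embedding per text-attributed node."""
    tok = AutoTokenizer.from_pretrained(model_name)
    lm = AutoModel.from_pretrained(model_name).eval()
    embs = []
    for i in range(0, len(texts), batch_size):
        enc = tok(texts[i:i + batch_size], padding=True, truncation=True,
                  return_tensors="pt")
        hidden = lm(**enc).last_hidden_state              # [B, T, H]
        mask = enc["attention_mask"].unsqueeze(-1)        # [B, T, 1]
        embs.append((hidden * mask).sum(1) / mask.sum(1)) # masked mean pooling
    return torch.cat(embs)                                # [num_nodes, H]
```

The resulting embedding matrix would then be handed to a downstream GNN as node features.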
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
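
As a generic illustration of gradual pruning during training (a plain magnitude criterion and cubic schedule, not CGP's actual graph-aware criterion), a sparsity target can be increased over training steps and applied as a weight mask.

```python
import torch

def prune_fraction(step, total_steps, final_sparsity=0.9):
    """Cubic sparsity schedule: the pruned fraction grows gradually with training."""
    t = min(step / total_steps, 1.0)
    return final_sparsity * (1 - (1 - t) ** 3)

@torch.no_grad()
def apply_magnitude_pruning(model, sparsity):
    """Zero out the smallest-magnitude entries of each 2-D weight matrix."""
    for p in model.parameters():
        if p.dim() < 2:
            continue
        k = int(sparsity * p.numel())
        if k == 0:
            continue
        thresh = p.abs().flatten().kthvalue(k).values
        p.mul_((p.abs() > thresh).float())
```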
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Fine-Tuning Graph Neural Networks via Graph Topology induced Optimal Transport [28.679909084727594]
GTOT-Tuning utilizes the properties of graph data to enhance the preservation of the representations produced by the fine-tuned network.
By using the adjacency relationship amongst nodes, the GTOT regularizer achieves node-level optimal transport procedures.
We evaluate GTOT-Tuning on eight downstream tasks with various GNN backbones and demonstrate that it achieves state-of-the-art fine-tuning performance for GNNs.
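
A simplified sketch of the underlying idea (not GTOT's exact masked-OT algorithm): fine-tuned node representations are aligned with pre-trained ones through an entropic optimal-transport plan whose costs are restricted to adjacency (plus self-loops), using a few Sinkhorn iterations; the resulting penalty would be added to the fine-tuning loss.

```python
import torch

def adjacency_masked_ot_penalty(h_ft, h_pre, adj, n_iters=20, eps=0.1):
    """Transport cost between fine-tuned node i and pre-trained node j is their
    squared distance, but mass may only move along edges or stay in place.
    h_ft, h_pre: [N, d]; adj: [N, N] with 0/1 entries."""
    n = h_ft.size(0)
    mask = ((adj + torch.eye(n)) > 0).float()
    cost = torch.cdist(h_ft, h_pre).pow(2)                   # [N, N]
    K = torch.exp(-cost / eps) * mask                        # masked kernel
    a = torch.full((n,), 1.0 / n)                            # uniform marginals
    u = torch.ones(n) / n
    v = torch.ones(n) / n
    for _ in range(n_iters):                                 # Sinkhorn iterations
        u = a / (K @ v).clamp_min(1e-9)
        v = a / (K.t() @ u).clamp_min(1e-9)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)               # transport plan
    return (plan * cost).sum()                               # regularization term
```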
arXiv Detail & Related papers (2022-03-20T04:41:17Z)
- Gophormer: Ego-Graph Transformer for Node Classification [27.491500255498845]
In this paper, we propose a novel Gophormer model which applies transformers on ego-graphs instead of full-graphs.
Specifically, a Node2Seq module is proposed to sample ego-graphs as the input of transformers, which alleviates the challenge of scalability.
In order to handle the uncertainty introduced by the ego-graph sampling, we propose a consistency regularization and a multi-sample inference strategy.
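
A minimal sketch of the two ingredients named above: ego-graph sampling around a center node and a consistency term across multiple samples of the same node. The fixed neighbor count and the MSE form of the consistency loss are illustrative assumptions, not the paper's exact choices.

```python
import random
import torch
import torch.nn.functional as F

def node2seq_sample(adj_list, center, num_neighbors=8):
    """Sample a fixed-length ego-graph sequence for one center node: the center
    followed by neighbors sampled with replacement."""
    neighbors = adj_list.get(center, []) or [center]
    return [center] + [random.choice(neighbors) for _ in range(num_neighbors)]

def consistency_loss(model, features, adj_list, center, num_samples=2):
    """Encourage predictions for the same node to agree across different
    ego-graph samples (an illustrative MSE form of consistency regularization)."""
    logits = []
    for _ in range(num_samples):
        seq = node2seq_sample(adj_list, center)
        logits.append(model(features[seq].unsqueeze(0)))   # assumed output: [1, C]
    mean = torch.stack(logits).mean(dim=0)
    return sum(F.mse_loss(l, mean) for l in logits) / num_samples
```

At inference time, the same multi-sample idea can be used by averaging predictions over several ego-graph samples per node.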
arXiv Detail & Related papers (2021-10-25T16:43:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.