HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters
- URL: http://arxiv.org/abs/2411.01155v1
- Date: Sat, 02 Nov 2024 06:43:54 GMT
- Title: HG-Adapter: Improving Pre-Trained Heterogeneous Graph Neural Networks with Dual Adapters
- Authors: Yujie Mo, Runpeng Yu, Xiaofeng Zhu, Xinchao Wang
- Abstract summary: The "pre-train, prompt-tuning" paradigm has demonstrated impressive performance for tuning pre-trained heterogeneous graph neural networks (HGNNs).
We propose a unified framework that combines two new adapters with potential labeled data extension to improve the generalization of pre-trained HGNN models.
- Score: 53.97380482341493
- Abstract: The "pre-train, prompt-tuning" paradigm has demonstrated impressive performance for tuning pre-trained heterogeneous graph neural networks (HGNNs) by mitigating the gap between pre-trained models and downstream tasks. However, most prompt-tuning-based works may face at least two limitations: (i) the model may be insufficient to fit the graph structures well, as they are generally ignored in the prompt-tuning stage, increasing the training error and thus decreasing generalization ability; and (ii) the model may suffer from limited labeled data during the prompt-tuning stage, leading to a large generalization gap between the training error and the test error that further affects model generalization. To alleviate the above limitations, we first derive the generalization error bound for existing prompt-tuning-based methods, and then propose a unified framework that combines two new adapters with potential labeled data extension to improve the generalization of pre-trained HGNN models. Specifically, we design dual structure-aware adapters to adaptively fit task-related homogeneous and heterogeneous structural information. We further design a label-propagated contrastive loss and two self-supervised losses to optimize the dual adapters and incorporate unlabeled nodes as potential labeled data. Theoretical analysis indicates that the proposed method achieves a lower generalization error bound than existing methods, thus obtaining superior generalization ability. Comprehensive experiments demonstrate the effectiveness and generalization of the proposed method on different downstream tasks.
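As a rough illustration of the dual-adapter idea (not the authors' released code), the sketch below assumes a frozen pre-trained HGNN whose node embeddings are refined by two bottleneck adapters, one smoothed over a homogeneous adjacency and one over a heterogeneous adjacency; all class and variable names here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructureAwareAdapter(nn.Module):
    """Illustrative adapter: a low-rank bottleneck whose output is
    smoothed over a given (homogeneous or heterogeneous) adjacency."""
    def __init__(self, dim, bottleneck=32):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, h, adj):
        # adj: row-normalized (num_nodes x num_nodes) structure matrix
        z = self.up(F.relu(self.down(h)))
        z = adj @ z                  # inject structural information
        return h + z                 # residual keeps the frozen features intact

# Toy usage with embeddings `h` from a frozen pre-trained HGNN encoder
num_nodes, dim = 100, 64
h = torch.randn(num_nodes, dim)
homo_adj = torch.eye(num_nodes)      # placeholder homogeneous structure
hetero_adj = torch.eye(num_nodes)    # placeholder heterogeneous structure

homo_adapter = StructureAwareAdapter(dim)
hetero_adapter = StructureAwareAdapter(dim)
h_tuned = 0.5 * (homo_adapter(h, homo_adj) + hetero_adapter(h, hetero_adj))
```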
Related papers
- Towards Generalizable Trajectory Prediction Using Dual-Level Representation Learning And Adaptive Prompting [107.4034346788744]
Existing vehicle trajectory prediction models struggle with generalizability, prediction uncertainties, and handling complex interactions.
We propose Perceiver with Register queries (PerReg+), a novel trajectory prediction framework that introduces: (1) Dual-Level Representation Learning via Self-Distillation (SD) and Masked Reconstruction (MR), capturing global context and fine-grained details; (2) Enhanced Multimodality using register-based queries and pretraining, eliminating the need for clustering and suppression; and (3) Adaptive Prompt Tuning during fine-tuning, freezing the main architecture and optimizing a small number of prompts for efficient adaptation.
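The adaptive prompt tuning component in (3) can be pictured with a minimal sketch: the pre-trained backbone is frozen and only a handful of prompt vectors plus a light head receive gradients. This is an illustrative reading rather than the PerReg+ implementation, and the stand-in backbone below is hypothetical.

```python
import torch
import torch.nn as nn

class PromptTunedPredictor(nn.Module):
    """Sketch of prompt tuning: the backbone is frozen and only a few
    learnable prompt tokens (plus a small head) are optimized."""
    def __init__(self, backbone, dim, num_prompts=8, out_dim=2):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad_(False)              # freeze the pre-trained model
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)
        self.head = nn.Linear(dim, out_dim)

    def forward(self, tokens):                   # tokens: (batch, seq, dim)
        b = tokens.size(0)
        prompts = self.prompts.unsqueeze(0).expand(b, -1, -1)
        x = torch.cat([prompts, tokens], dim=1)  # prepend prompts to the sequence
        feats = self.backbone(x)                 # frozen forward pass
        return self.head(feats.mean(dim=1))      # pooled prediction

# Toy usage with a stand-in transformer backbone
backbone = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True), num_layers=2)
model = PromptTunedPredictor(backbone, dim=64)
out = model(torch.randn(16, 20, 64))
```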
arXiv Detail & Related papers (2025-01-08T20:11:09Z) - HERTA: A High-Efficiency and Rigorous Training Algorithm for Unfolded Graph Neural Networks [14.139047596566485]
HERTA is a high-efficiency and rigorous training algorithm for Unfolded GNNs.
HERTA converges to the optimum of the original model, thus preserving the interpretability of Unfolded GNNs.
As a byproduct of HERTA, we propose a new spectral sparsification method applicable to normalized and regularized graph Laplacians.
arXiv Detail & Related papers (2024-03-26T23:03:06Z) - On the Generalization Ability of Unsupervised Pretraining [53.06175754026037]
Recent advances in unsupervised learning have shown that unsupervised pre-training, followed by fine-tuning, can improve model generalization.
This paper introduces a novel theoretical framework that illuminates the critical factor influencing the transferability of knowledge acquired during unsupervised pre-training to the subsequent fine-tuning phase.
Our results contribute to a better understanding of the unsupervised pre-training and fine-tuning paradigm, and can shed light on the design of more effective pre-training algorithms.
arXiv Detail & Related papers (2024-03-11T16:23:42Z) - Enhancing Generalization in Medical Visual Question Answering Tasks via Gradient-Guided Model Perturbation [16.22199565010318]
We introduce a method that incorporates gradient-guided perturbations to the visual encoder of the multimodality model during both pre-training and fine-tuning phases.
The results show that, even with a significantly smaller pre-training image caption dataset, our approach achieves competitive outcomes.
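A minimal sketch of the gradient-guided perturbation idea, assuming the perturbation is applied to the visual features along the loss-gradient direction before a second forward pass; the module names and the epsilon step size are illustrative, not the paper's actual recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gradient_guided_step(visual_encoder, answer_head, images, question_emb, labels,
                         optimizer, epsilon=1e-2):
    """One training step: perturb the visual features in the direction of
    steepest loss increase, then train on the perturbed view."""
    optimizer.zero_grad()
    feats = visual_encoder(images)
    logits = F.cross_entropy(answer_head(torch.cat([feats, question_emb], dim=-1)), labels)
    # gradient of the loss with respect to the visual features
    grad = torch.autograd.grad(logits, feats, retain_graph=True)[0]
    perturbed = feats + epsilon * grad.sign()    # still differentiable w.r.t. the encoder
    loss_p = F.cross_entropy(answer_head(torch.cat([perturbed, question_emb], dim=-1)), labels)
    loss_p.backward()
    optimizer.step()
    return loss_p.item()

# Toy usage with stand-in modules
enc = nn.Linear(512, 256)          # placeholder visual encoder
head = nn.Linear(256 + 128, 10)    # placeholder answer classifier
opt = torch.optim.Adam(list(enc.parameters()) + list(head.parameters()), lr=1e-4)
loss = gradient_guided_step(enc, head, torch.randn(8, 512), torch.randn(8, 128),
                            torch.randint(0, 10, (8,)), opt)
```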
arXiv Detail & Related papers (2024-03-05T06:57:37Z) - CAP: Co-Adversarial Perturbation on Weights and Features for Improving Generalization of Graph Neural Networks [59.692017490560275]
Adversarial training has been widely demonstrated to improve a model's robustness against adversarial attacks.
It remains unclear how adversarial training can improve the generalization ability of GNNs in graph analytics problems.
We construct the co-adversarial perturbation (CAP) optimization problem in terms of weights and features, and design the alternating adversarial perturbation algorithm to flatten the weight and feature loss landscapes alternately.
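A rough sketch of the alternating perturbation idea, assuming a SAM-style adversarial step on the weights followed by an adversarial step on the input features; a plain MLP stands in for the GNN, and the hyperparameters are placeholders rather than the paper's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cap_step(model, features, labels, optimizer, rho=0.05, xi=0.01):
    """One illustrative alternating pass: adversarial weight perturbation,
    then adversarial feature perturbation."""
    # ---- weight perturbation: climb to a nearby high-loss point, then update ----
    optimizer.zero_grad()
    F.cross_entropy(model(features), labels).backward()
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            e = torch.zeros_like(p) if p.grad is None else rho * p.grad / (p.grad.norm() + 1e-12)
            p.add_(e)
            eps.append(e)
    optimizer.zero_grad()
    F.cross_entropy(model(features), labels).backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            p.sub_(e)                                 # restore the unperturbed weights
    optimizer.step()

    # ---- feature perturbation: train on adversarially shifted features ----
    optimizer.zero_grad()
    x = features.detach().requires_grad_(True)
    grad_x = torch.autograd.grad(F.cross_entropy(model(x), labels), x)[0]
    x_adv = features + xi * grad_x.sign()
    F.cross_entropy(model(x_adv), labels).backward()
    optimizer.step()

# Toy usage
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
cap_step(model, torch.randn(64, 16), torch.randint(0, 3, (64,)), opt)
```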
arXiv Detail & Related papers (2021-10-28T02:28:13Z) - Double Descent and Other Interpolation Phenomena in GANs [2.7007335372861974]
We study the generalization error as a function of latent space dimension in generative adversarial networks (GANs).
We develop a novel pseudo-supervised learning approach for GANs where the training utilizes pairs of fabricated (noise) inputs in conjunction with real output samples.
While our analysis focuses mostly on linear models, we also provide important insights for improving the generalization of nonlinear, multilayer GANs.
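A toy reading of the pseudo-supervised setup, assuming each fixed noise input is paired with a real output sample and the generator is fit by least squares; this pairing and objective are an assumption for illustration, not the paper's exact construction.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, latent_dim, data_dim = 256, 8, 32
z = torch.randn(n, latent_dim)   # fabricated (noise) inputs, fixed for training
x = torch.randn(n, data_dim)     # stand-in for real output samples

generator = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
opt = torch.optim.Adam(generator.parameters(), lr=1e-3)

for step in range(200):
    opt.zero_grad()
    loss = ((generator(z) - x) ** 2).mean()   # pseudo-supervised regression objective
    loss.backward()
    opt.step()
```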
arXiv Detail & Related papers (2021-06-07T23:07:57Z) - Bi-tuning of Pre-trained Representations [79.58542780707441]
Bi-tuning is a general learning framework for fine-tuning both supervised and unsupervised pre-trained representations for downstream tasks.
Bi-tuning generalizes the vanilla fine-tuning by integrating two heads upon the backbone of pre-trained representations.
Bi-tuning achieves state-of-the-art results for fine-tuning tasks of both supervised and unsupervised pre-trained models by large margins.
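A minimal sketch of the two-head idea, assuming one classifier head trained with cross-entropy and one projection head trained with a supervised-contrastive term on top of a shared backbone; the loss weighting and head designs are illustrative, not Bi-tuning's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiTunedModel(nn.Module):
    """Sketch of two heads on a shared pre-trained backbone."""
    def __init__(self, backbone, feat_dim, num_classes, proj_dim=64):
        super().__init__()
        self.backbone = backbone
        self.classifier = nn.Linear(feat_dim, num_classes)  # head 1: supervised
        self.projector = nn.Linear(feat_dim, proj_dim)       # head 2: contrastive

    def forward(self, x):
        h = self.backbone(x)
        return self.classifier(h), F.normalize(self.projector(h), dim=-1)

def bi_tuning_loss(logits, z, labels, temperature=0.1):
    ce = F.cross_entropy(logits, labels)
    # simplified supervised-contrastive term: pull same-class projections together
    sim = z @ z.t() / temperature
    mask = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()
    mask.fill_diagonal_(0)
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    contrast = -(mask * log_prob).sum(1) / mask.sum(1).clamp(min=1)
    return ce + contrast.mean()

# Toy usage with a stand-in backbone
model = BiTunedModel(nn.Linear(128, 256), feat_dim=256, num_classes=10)
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
logits, z = model(x)
loss = bi_tuning_loss(logits, z, y)
```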
arXiv Detail & Related papers (2020-11-12T03:32:25Z) - CASTLE: Regularization via Auxiliary Causal Graph Discovery [89.74800176981842]
We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables.
CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features.
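A small sketch in the spirit of reconstruction-as-regularization, assuming a learned feature-to-feature weight matrix with a NOTEARS-style acyclicity penalty so that each feature is reconstructed only from the other features; this simplification is not the CASTLE implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CastleStyleRegularizer(nn.Module):
    """Prediction loss plus a reconstruction regularizer driven by a learned,
    diagonal-masked feature-to-feature weight matrix."""
    def __init__(self, num_features, hidden=32):
        super().__init__()
        self.W = nn.Parameter(torch.randn(num_features, num_features) * 0.01)
        self.predictor = nn.Sequential(nn.Linear(num_features, hidden), nn.ReLU(),
                                       nn.Linear(hidden, 1))

    def forward(self, x, y):
        mask = 1.0 - torch.eye(x.size(1), device=x.device)
        recon = x @ (self.W * mask)               # each feature reconstructed from the rest
        recon_loss = F.mse_loss(recon, x)
        # NOTEARS-style acyclicity penalty on the learned weight matrix
        a = (self.W * mask) ** 2
        acyclicity = torch.trace(torch.linalg.matrix_exp(a)) - x.size(1)
        pred_loss = F.mse_loss(self.predictor(x).squeeze(-1), y)
        return pred_loss + 0.1 * recon_loss + 0.1 * acyclicity

# Toy usage
model = CastleStyleRegularizer(num_features=10)
loss = model(torch.randn(64, 10), torch.randn(64))
loss.backward()
```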
arXiv Detail & Related papers (2020-09-28T09:49:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.