When to Pre-Train Graph Neural Networks? From Data Generation Perspective!
- URL: http://arxiv.org/abs/2303.16458v4
- Date: Thu, 8 Jun 2023 17:48:57 GMT
- Title: When to Pre-Train Graph Neural Networks? From Data Generation Perspective!
- Authors: Yuxuan Cao, Jiarong Xu, Carl Yang, Jiaan Wang, Yunchao Zhang, Chunping
Wang, Lei Chen, Yang Yang
- Abstract summary: Graph pre-training aims to acquire transferable knowledge from unlabeled graph data to improve downstream performance.
This paper introduces a generic framework W2PGNN to answer the question of when to pre-train.
W2PGNN offers three broad applications: providing the application scope of graph pre-trained models, quantifying the feasibility of pre-training, and assisting in selecting pre-training data to enhance downstream performance.
- Score: 19.239863500722983
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In recent years, graph pre-training has gained significant attention,
focusing on acquiring transferable knowledge from unlabeled graph data to
improve downstream performance. Despite these recent endeavors, the problem of
negative transfer remains a major concern when applying graph pre-trained
models to downstream tasks. Previous studies have made great efforts on the
issues of what to pre-train and how to pre-train by designing a variety of graph
pre-training and fine-tuning strategies. However, there are cases where even
the most advanced "pre-train and fine-tune" paradigms fail to yield distinct
benefits. This paper introduces a generic framework W2PGNN to answer the
crucial question of when to pre-train (i.e., in what situations could we take
advantage of graph pre-training) before performing effortful pre-training or
fine-tuning. We start from a new perspective to explore the complex generative
mechanisms from the pre-training data to downstream data. In particular, W2PGNN
first fits the pre-training data into graphon bases, where each element of the
graphon basis (i.e., a graphon) identifies a fundamental transferable pattern
shared by a collection of pre-training graphs. All convex combinations of the
graphon bases give rise to a generator space, from which the generated graphs
form the solution space of downstream data that can benefit from pre-training.
In this
manner, the feasibility of pre-training can be quantified as the generation
probability of the downstream data from any generator in the generator space.
W2PGNN offers three broad applications: providing the application scope of
graph pre-trained models, quantifying the feasibility of pre-training, and
assisting in the selection of pre-training data to enhance downstream
performance. We
provide a theoretically sound solution for the first application and extensive
empirical justifications for the latter two applications.
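For intuition only, the sketch below is a hypothetical, simplified reading of the recipe described above, not the authors' released implementation: it estimates a crude step-function graphon for each group of pre-training graphs, treats those graphons as a basis, and scores feasibility by how well the best convex combination of the basis reproduces a graphon estimated from the downstream graphs. The estimator, the function names (estimate_graphon, feasibility), and the least-squares fit over the probability simplex are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def estimate_graphon(adjs, resolution=32):
    """Crude step-function graphon estimate: sort nodes by degree, then
    average-pool each adjacency matrix onto a resolution x resolution grid."""
    grids = []
    for A in adjs:
        order = np.argsort(-A.sum(axis=1))        # sort nodes by degree (desc.)
        A = A[np.ix_(order, order)]
        n = A.shape[0]
        idx = np.arange(n) * resolution // n      # map each node to a grid cell
        G = np.zeros((resolution, resolution))
        C = np.zeros((resolution, resolution))
        np.add.at(G, (idx[:, None], idx[None, :]), A)
        np.add.at(C, (idx[:, None], idx[None, :]), 1.0)
        grids.append(G / np.maximum(C, 1.0))      # cell-wise edge density
    return np.mean(grids, axis=0)

def feasibility(basis, downstream):
    """Fit the downstream graphon with a convex combination of basis graphons.
    Returns (mixture weights, squared fit error); a lower error suggests the
    downstream data is more likely to benefit from pre-training."""
    B = np.stack([g.ravel() for g in basis])      # (k, resolution**2)
    t = downstream.ravel()
    k = len(basis)

    def objective(w):
        return float(np.sum((w @ B - t) ** 2))

    res = minimize(objective, np.full(k, 1.0 / k), method="SLSQP",
                   bounds=[(0.0, 1.0)] * k,
                   constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}])
    return res.x, res.fun

def random_graph(n, p, rng):
    """Toy Erdos-Renyi adjacency matrix standing in for real graph data."""
    upper = np.triu((rng.random((n, n)) < p).astype(float), k=1)
    return upper + upper.T

rng = np.random.default_rng(0)
pretrain_groups = [[random_graph(40, p, rng) for _ in range(5)]
                   for p in (0.1, 0.3, 0.5)]
basis = [estimate_graphon(group) for group in pretrain_groups]
downstream = estimate_graphon([random_graph(40, 0.2, rng) for _ in range(5)])

weights, err = feasibility(basis, downstream)
print("mixture weights:", np.round(weights, 3), "fit error:", round(err, 4))
```

In this toy, a lower fit error means the downstream graphs look more like something the generator space (the convex hull of the graphon bases) could have produced, which is the intuition behind quantifying the feasibility of pre-training.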
Related papers
- Generalizing Graph Transformers Across Diverse Graphs and Tasks via Pre-Training on Industrial-Scale Data [34.21420029237621]
We introduce a scalable transformer-based graph pre-training framework called PGT (Pre-trained Graph Transformer).
Our framework achieves state-of-the-art performance on both industrial datasets and public datasets.
arXiv Detail & Related papers (2024-07-04T14:14:09Z) - Fine-tuning Graph Neural Networks by Preserving Graph Generative Patterns [13.378277755978258]
We show that the structural divergence between pre-training and downstream graphs significantly limits the transferability when using the vanilla fine-tuning strategy.
We propose G-Tuning to preserve the generative patterns of downstream graphs.
G-Tuning demonstrates an average improvement of 0.5% and 2.6% on in-domain and out-of-domain transfer learning experiments.
arXiv Detail & Related papers (2023-12-21T05:17:10Z) - Better with Less: A Data-Active Perspective on Pre-Training Graph Neural Networks [39.71761440499148]
Pre-training of graph neural networks (GNNs) aims to learn transferable knowledge for downstream tasks from unlabeled data.
We propose a better-with-less framework for graph pre-training: fewer, but carefully chosen data are fed into a GNN model.
Experimental results show that the proposed APT obtains an efficient pre-trained model with less training data and better downstream performance.
arXiv Detail & Related papers (2023-11-02T07:09:59Z) - Deep Prompt Tuning for Graph Transformers [55.2480439325792]
Fine-tuning is resource-intensive and requires storing multiple copies of large models.
We propose a novel approach called deep graph prompt tuning as an alternative to fine-tuning.
By freezing the pre-trained parameters and only updating the added tokens, our approach reduces the number of free parameters and eliminates the need for multiple model copies.
arXiv Detail & Related papers (2023-09-18T20:12:17Z) - DiP-GNN: Discriminative Pre-Training of Graph Neural Networks [49.19824331568713]
Graph neural network (GNN) pre-training methods have been proposed to enhance the power of GNNs.
One popular pre-training method is to mask out a proportion of the edges and train a GNN to recover them.
In our framework, the graph seen by the discriminator better matches the original graph because the generator can recover a proportion of the masked edges.
arXiv Detail & Related papers (2022-09-15T17:41:50Z) - MentorGNN: Deriving Curriculum for Pre-Training GNNs [61.97574489259085]
We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs.
We shed new light on the problem of domain adaptation on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs.
arXiv Detail & Related papers (2022-08-21T15:12:08Z) - Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z) - Bringing Your Own View: Graph Contrastive Learning without Prefabricated Data Augmentations [94.41860307845812]
Self-supervision has recently been surging at its new frontier of graph learning.
GraphCL uses a prefabricated prior reflected by the ad-hoc manual selection of graph data augmentations.
We have extended the prefabricated discrete prior in the augmentation set, to a learnable continuous prior in the parameter space of graph generators.
We have leveraged both principles of information minimization (InfoMin) and information bottleneck (InfoBN) to regularize the learned priors.
arXiv Detail & Related papers (2022-01-04T15:49:18Z) - An Adaptive Graph Pre-training Framework for Localized Collaborative Filtering [79.17319280791237]
We propose an adaptive graph pre-training framework for localized collaborative filtering (ADAPT).
ADAPT does not require transferring user/item embeddings, and captures both the common knowledge across different graphs and the uniqueness of each graph.
arXiv Detail & Related papers (2021-12-14T06:53:13Z)