Topology Only Pre-Training: Towards Generalised Multi-Domain Graph Models
- URL: http://arxiv.org/abs/2311.03976v4
- Date: Mon, 02 Dec 2024 12:44:25 GMT
- Title: Topology Only Pre-Training: Towards Generalised Multi-Domain Graph Models
- Authors: Alex O. Davies, Riku W. Green, Nirav S. Ajmeri, Telmo M. Silva Filho
- Abstract summary: We present Topology Only Pre-Training (ToP), a graph pre-training method based on node and edge feature exclusion.
We show positive transfer on evaluation datasets from multiple domains, including domains not present in pre-training data.
- Abstract: The principal benefit of unsupervised representation learning is that a pre-trained model can be fine-tuned where data or labels are scarce. Existing approaches for graph representation learning are domain specific, maintaining consistent node and edge features across the pre-training and target datasets. This has precluded transfer to multiple domains. We present Topology Only Pre-Training (ToP), a graph pre-training method based on node and edge feature exclusion. We show positive transfer on evaluation datasets from multiple domains, including domains not present in pre-training data, running directly contrary to assumptions made in contemporary works. On 75% of experiments, ToP models perform significantly ($p \leq 0.01$) better than a supervised baseline. Performance is significantly positive on 85.7% of tasks when node and edge features are used in fine-tuning. We further show that out-of-domain topologies can produce more useful pre-training than in-domain ones. Under ToP we show better transfer from non-molecule pre-training, compared to molecule pre-training, on 79% of molecular benchmarks. Against the limited set of other generalist graph models, ToP performs strongly, including against models many orders of magnitude larger. These findings show that ToP opens broad areas of research in both transfer learning on scarcely populated graph domains and in graph foundation models.
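The abstract describes pre-training on structure alone, with node and edge features excluded and only re-introduced at fine-tuning. The block below is a minimal sketch of that idea, assuming PyTorch Geometric; the constant node features stand in for feature exclusion, while the link-prediction objective is an illustrative self-supervised choice rather than the paper's actual pre-training loss, and all class and function names (TopologyEncoder, pretrain_step) are hypothetical.
```python
# Minimal sketch of topology-only pre-training (ToP-style), assuming PyTorch Geometric.
# Feature exclusion follows the abstract; the link-prediction objective below is an
# illustrative self-supervised choice, not necessarily the one used in the paper.
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.nn import GCNConv
from torch_geometric.utils import negative_sampling


class TopologyEncoder(torch.nn.Module):
    """GNN encoder that ignores input features: every node gets the same constant."""

    def __init__(self, hidden_dim: int = 64):
        super().__init__()
        self.conv1 = GCNConv(1, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)

    def forward(self, num_nodes: int, edge_index: torch.Tensor) -> torch.Tensor:
        x = torch.ones(num_nodes, 1)          # constant features: topology only
        h = F.relu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)


def pretrain_step(encoder, data: Data, optimizer) -> float:
    """One self-supervised step: score real edges against sampled non-edges."""
    optimizer.zero_grad()
    z = encoder(data.num_nodes, data.edge_index)
    pos = data.edge_index
    neg = negative_sampling(pos, num_nodes=data.num_nodes,
                            num_neg_samples=pos.size(1))
    pos_score = (z[pos[0]] * z[pos[1]]).sum(dim=-1)
    neg_score = (z[neg[0]] * z[neg[1]]).sum(dim=-1)
    loss = F.binary_cross_entropy_with_logits(
        torch.cat([pos_score, neg_score]),
        torch.cat([torch.ones_like(pos_score), torch.zeros_like(neg_score)]))
    loss.backward()
    optimizer.step()
    return float(loss)


if __name__ == "__main__":
    # Tiny toy graph (a 4-cycle); in practice the pre-training corpus mixes domains.
    edge_index = torch.tensor([[0, 1, 2, 3, 1, 2, 3, 0],
                               [1, 2, 3, 0, 0, 1, 2, 3]])
    graph = Data(edge_index=edge_index, num_nodes=4)
    enc = TopologyEncoder()
    opt = torch.optim.Adam(enc.parameters(), lr=1e-3)
    for _ in range(10):
        pretrain_step(enc, graph, opt)
```
Fine-tuning would then re-introduce the original node and edge features (for example via a learned projection into the encoder's input space) together with a task head on top of the pre-trained weights.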
Related papers
- Is Large-Scale Pretraining the Secret to Good Domain Generalization? [69.80606575323691]
Multi-Source Domain Generalization (DG) is the task of training on multiple source domains and achieving high classification performance on unseen target domains.
Recent methods combine robust features from web-scale pretrained backbones with new features learned from source data, and this has dramatically improved benchmark results.
We show that all evaluated DG methods struggle on DomainBed-OOP, while recent methods excel on DomainBed-IP.
arXiv Detail & Related papers (2024-12-03T21:43:11Z)
- One Model for One Graph: A New Perspective for Pretraining with Cross-domain Graphs [61.9759512646523]
Graph Neural Networks (GNNs) have emerged as a powerful tool to capture intricate network patterns.
Existing GNNs require careful domain-specific architecture designs and training from scratch on each dataset.
We propose a novel cross-domain pretraining framework, "one model for one graph".
arXiv Detail & Related papers (2024-11-30T01:49:45Z)
- Better with Less: A Data-Active Perspective on Pre-Training Graph Neural Networks [39.71761440499148]
Pre-training of graph neural networks (GNNs) aims to learn transferable knowledge from unlabeled data for downstream tasks.
We propose a better-with-less framework for graph pre-training: fewer, but carefully chosen data are fed into a GNN model.
Experimental results show that the proposed APT framework is able to obtain an efficient pre-training model with less training data and better downstream performance.
arXiv Detail & Related papers (2023-11-02T07:09:59Z)
- When to Pre-Train Graph Neural Networks? From Data Generation Perspective! [19.239863500722983]
Graph pre-training aims to acquire transferable knowledge from unlabeled graph data to improve downstream performance.
This paper introduces a generic framework W2PGNN to answer the question of when to pre-train.
W2PGNN offers three broad applications: providing the application scope of graph pre-trained models, the feasibility of pre-training, and assistance in selecting pre-training data to enhance downstream performance.
arXiv Detail & Related papers (2023-03-29T05:05:02Z)
- Improving Domain Generalization with Domain Relations [77.63345406973097]
This paper focuses on domain shifts, which occur when the model is applied to new domains that are different from the ones it was trained on.
We propose a new approach called D$3$G to learn domain-specific models.
Our results show that D$3$G consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-02-06T08:11:16Z)
- Similarity-aware Positive Instance Sampling for Graph Contrastive Pre-training [82.68805025636165]
We propose to select positive graph instances directly from existing graphs in the training set.
Our selection is based on certain domain-specific pair-wise similarity measurements.
Besides, we develop an adaptive node-level pre-training method to dynamically mask nodes to distribute them evenly in the graph.
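To make this sampling step concrete, here is a minimal sketch that selects each graph's most similar peers as contrastive positives. The degree-histogram cosine similarity is a stand-in assumption for the paper's domain-specific pair-wise measurements, and the helper names (degree_histogram, top_k_positives) are illustrative.
```python
# Minimal sketch of similarity-aware positive sampling for graph contrastive
# pre-training. The degree-histogram cosine similarity is a stand-in for the
# paper's domain-specific pair-wise similarity measurements.
import numpy as np
import networkx as nx


def degree_histogram(g: nx.Graph, max_degree: int = 10) -> np.ndarray:
    """Fixed-length, L2-normalised degree histogram used as a cheap graph signature."""
    hist = np.zeros(max_degree + 1)
    for _, d in g.degree():
        hist[min(d, max_degree)] += 1
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def top_k_positives(graphs: list, k: int = 2) -> dict:
    """For each anchor graph, return the indices of its k most similar peers."""
    sigs = np.stack([degree_histogram(g) for g in graphs])
    sim = sigs @ sigs.T                      # cosine similarity (rows are unit norm)
    np.fill_diagonal(sim, -np.inf)           # a graph is not its own positive
    return {i: list(np.argsort(-sim[i])[:k]) for i in range(len(graphs))}


if __name__ == "__main__":
    dataset = [nx.cycle_graph(6), nx.cycle_graph(8),
               nx.star_graph(6), nx.star_graph(8)]
    # Cycles pair with cycles, stars with stars under this signature.
    print(top_k_positives(dataset, k=1))
```
The returned index map would then feed a standard graph contrastive loss (e.g. InfoNCE), with the selected peers treated as positives in place of augmentations of the anchor.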
arXiv Detail & Related papers (2022-06-23T20:12:51Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Pre-training Graph Neural Network for Cross Domain Recommendation [58.057687253631826]
A recommender system predicts users' potential interests in items, and the core of this task is to learn user/item embeddings.
Inspired by contemporary advances in pre-training for graph representation learning, we propose a pre-training and fine-tuning paradigm for cross-domain recommendation.
We devise a novel Pre-training Graph Neural Network for Cross-Domain Recommendation (PCRec), which adopts the contrastive self-supervised pre-training of a graph encoder.
arXiv Detail & Related papers (2021-11-16T07:34:42Z)
- Efficient Variational Graph Autoencoders for Unsupervised Cross-domain Prerequisite Chains [3.358838755118655]
We introduce Domain-Adversarial Variational Graph Autoencoders (DAVGAE) to solve this cross-domain prerequisite chain learning task efficiently.
Our novel model consists of a variational graph autoencoder (VGAE) and a domain discriminator; a minimal sketch follows this entry.
Results show that our model outperforms recent graph-based models while using only 1/10 of the graph scale and 1/3 of the computation time.
arXiv Detail & Related papers (2021-09-17T19:07:27Z)
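As referenced in the DAVGAE entry above, the model pairs a variational graph autoencoder with a domain discriminator. The sketch below assumes PyTorch Geometric's VGAE and couples the discriminator adversarially through gradient reversal; that coupling, the toy data, and all names here are illustrative assumptions rather than the paper's exact training scheme.
```python
# Minimal sketch of a VGAE paired with a domain discriminator, assuming
# PyTorch Geometric's VGAE. The gradient-reversal coupling is an illustrative
# assumption, not necessarily the DAVGAE paper's exact scheme.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, VGAE


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass, flips gradient sign on the backward pass."""

    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


class VariationalEncoder(torch.nn.Module):
    """Two-layer GCN encoder producing the mean and log-std of node embeddings."""

    def __init__(self, in_dim: int, hidden: int = 32, latent: int = 16):
        super().__init__()
        self.conv = GCNConv(in_dim, hidden)
        self.mu = GCNConv(hidden, latent)
        self.logstd = GCNConv(hidden, latent)

    def forward(self, x, edge_index):
        h = F.relu(self.conv(x, edge_index))
        return self.mu(h, edge_index), self.logstd(h, edge_index)


class DomainDiscriminator(torch.nn.Module):
    """Predicts which domain a node embedding came from (e.g. source vs. target)."""

    def __init__(self, latent: int = 16, num_domains: int = 2):
        super().__init__()
        self.lin = torch.nn.Linear(latent, num_domains)

    def forward(self, z):
        return self.lin(GradReverse.apply(z))


def training_step(model, disc, x, edge_index, domain_labels, optimizer):
    optimizer.zero_grad()
    z = model.encode(x, edge_index)
    loss = model.recon_loss(z, edge_index)                  # reconstruct observed edges
    loss = loss + (1.0 / x.size(0)) * model.kl_loss()       # standard VGAE KL term
    loss = loss + F.cross_entropy(disc(z), domain_labels)   # adversarial domain term
    loss.backward()
    optimizer.step()
    return float(loss)


if __name__ == "__main__":
    x = torch.randn(6, 8)
    edge_index = torch.tensor([[0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 0]])
    domains = torch.tensor([0, 0, 0, 1, 1, 1])              # toy source/target split
    model = VGAE(VariationalEncoder(in_dim=8))
    disc = DomainDiscriminator()
    params = list(model.parameters()) + list(disc.parameters())
    opt = torch.optim.Adam(params, lr=1e-2)
    for _ in range(5):
        training_step(model, disc, x, edge_index, domains, opt)
```
The gradient reversal pushes the encoder towards embeddings the discriminator cannot separate by domain, which is one common way to realise an adversarial domain objective alongside the VGAE reconstruction loss.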