Enhanced Pre-training of Graph Neural Networks for Million-Scale Heterogeneous Graphs
- URL: http://arxiv.org/abs/2510.12401v1
- Date: Tue, 14 Oct 2025 11:31:04 GMT
- Title: Enhanced Pre-training of Graph Neural Networks for Million-Scale Heterogeneous Graphs
- Authors: Shengyin Sun, Chen Ma, Jiehao Chen,
- Abstract summary: We propose an effective framework to pre-train graph neural networks (GNNs) on a large-scale heterogeneous graph. We first design a structure-aware pre-training task, which aims to capture structural properties in heterogeneous graphs. Then, we design a semantic-aware pre-training task to tackle the semantic mismatch.
- Score: 6.35582056899733
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In recent years, graph neural networks (GNNs) have facilitated the development of graph data mining. However, training GNNs requires sufficient labeled task-specific data, which is expensive and sometimes unavailable. To be less dependent on labeled data, recent studies propose to pre-train GNNs in a self-supervised manner and then apply the pre-trained GNNs to downstream tasks with limited labeled data. However, most existing methods are designed solely for homogeneous graphs (real-world graphs are mostly heterogeneous) and do not consider semantic mismatch (the semantic difference between the original data and the ideal data containing more transferable semantic information). In this paper, we propose an effective framework to pre-train GNNs on large-scale heterogeneous graphs. We first design a structure-aware pre-training task, which aims to capture structural properties in heterogeneous graphs. Then, we design a semantic-aware pre-training task to tackle the mismatch. Specifically, we construct a perturbation subspace composed of semantic neighbors to help deal with the semantic mismatch. Semantic neighbors make the model focus more on the general knowledge in the semantic space, which in turn assists the model in learning knowledge with better transferability. Finally, extensive experiments are conducted on real-world large-scale heterogeneous graphs to demonstrate the superiority of the proposed method over state-of-the-art baselines. Code is available at https://github.com/sunshy-1/PHE.
Related papers
- GraphTOP: Graph Topology-Oriented Prompting for Graph Neural Networks [66.07512871031163]
The "pre-training, adaptation" scheme pre-trains powerful Graph Neural Networks (GNNs) over unlabeled graph data. In the adaptation phase, graph prompting modifies input graph data with learnable prompts while keeping pre-trained GNN models frozen. We propose the first Graph Topology-Oriented Prompting (GraphTOP) framework to effectively adapt pre-trained GNN models for downstream tasks.
arXiv Detail & Related papers (2025-10-25T22:50:12Z)
- GraphLoRA: Structure-Aware Contrastive Low-Rank Adaptation for Cross-Graph Transfer Learning [17.85404473268992]
Graph Neural Networks (GNNs) have demonstrated remarkable proficiency in handling a range of graph analytical tasks. Despite their versatility, GNNs face significant challenges in transferability, limiting their utility in real-world applications. We propose GraphLoRA, an effective and parameter-efficient method for transferring well-trained GNNs to diverse graph domains.
arXiv Detail & Related papers (2024-09-25T06:57:42Z)
- Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness [80.87683145376305]
Graph Neural Networks (GNNs) excel in various graph learning tasks but face computational challenges when applied to large-scale graphs.
We propose Graph Sparse Training (GST), which dynamically manipulates sparsity at the data level.
GST produces a sparse graph with maximum topological integrity and no performance degradation.
arXiv Detail & Related papers (2024-02-02T09:10:35Z)
- HGPROMPT: Bridging Homogeneous and Heterogeneous Graphs for Few-shot Prompt Learning [16.587427365950838]
We propose HGPROMPT, a novel pre-training and prompting framework to unify not only pre-training and downstream tasks but also homogeneous and heterogeneous graphs.
We thoroughly evaluate and analyze HGPROMPT through extensive experiments on three public datasets.
arXiv Detail & Related papers (2023-12-04T13:20:15Z)
- Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
- Semantic Graph Neural Network with Multi-measure Learning for Semi-supervised Classification [5.000404730573809]
Graph Neural Networks (GNNs) have attracted increasing attention in recent years.
Recent studies have shown that GNNs are vulnerable to the complex underlying structure of the graph.
We propose a novel framework for semi-supervised classification.
arXiv Detail & Related papers (2022-12-04T06:17:11Z)
- MentorGNN: Deriving Curriculum for Pre-Training GNNs [61.97574489259085]
We propose an end-to-end model named MentorGNN that aims to supervise the pre-training process of GNNs across graphs.
We shed new light on the problem of domain adaptation on relational data (i.e., graphs) by deriving a natural and interpretable upper bound on the generalization error of the pre-trained GNNs.
arXiv Detail & Related papers (2022-08-21T15:12:08Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- GPT-GNN: Generative Pre-Training of Graph Neural Networks [93.35945182085948]
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data.
We present the GPT-GNN framework to initialize GNNs by generative pre-training.
We show that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.
arXiv Detail & Related papers (2020-06-27T20:12:33Z)