GraphTheta: A Distributed Graph Neural Network Learning System With
Flexible Training Strategy
- URL: http://arxiv.org/abs/2104.10569v1
- Date: Wed, 21 Apr 2021 14:51:33 GMT
- Title: GraphTheta: A Distributed Graph Neural Network Learning System With
Flexible Training Strategy
- Authors: Houyi Li, Yongchao Liu, Yongyong Li, Bin Huang, Peng Zhang, Guowei
Zhang, Xintan Zeng, Kefeng Deng, Wenguang Chen, and Changhua He
- Abstract summary: We present a new distributed graph learning system GraphTheta.
It supports multiple training strategies and enables efficient and scalable learning on big graphs.
This work represents the largest edge-attributed GNN learning task conducted on a billion-scale network in the literature.
- Score: 5.466414428765544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) have been demonstrated as a powerful tool for
analysing non-Euclidean graph data. However, the lack of efficient distributed
graph learning systems severely hinders applications of GNNs, especially when
graphs are big, of high density or with highly skewed node degree
distributions. In this paper, we present a new distributed graph learning
system GraphTheta, which supports multiple training strategies and enables
efficient and scalable learning on big graphs. GraphTheta implements both
localized and globalized graph convolutions, where a new graph
learning abstraction NN-TGAR is designed to bridge the gap between graph
processing and graph learning frameworks. A distributed graph engine is
proposed to conduct the stochastic gradient descent optimization with
hybrid-parallel execution. Moreover, we add support for a new cluster-batched
training strategy in addition to the conventional global-batched and
mini-batched ones. We evaluate GraphTheta using a number of network data with
network size ranging from small-, modest- to large-scale. Experimental results
show that GraphTheta scales almost linearly to 1,024 workers and trains an
in-house developed GNN model within 26 hours on the Alipay dataset of 1.4 billion
nodes and 4.1 billion attributed edges. GraphTheta also obtains better
prediction results than state-of-the-art GNN methods. To the best of
our knowledge, this work represents the largest edge-attributed GNN learning
task conducted on a billion-scale network in the literature.
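The abstract names three batching strategies but does not spell out how they differ operationally. The following is a minimal, hypothetical sketch, not the GraphTheta API: the strategies differ only in which node set defines one gradient step. `partition_clusters` is a toy hash-based stand-in for a real graph partitioner such as METIS.

```python
import random
from collections import defaultdict

def partition_clusters(nodes, num_clusters):
    """Toy stand-in for a real partitioner (e.g. METIS): hash each node
    into one of `num_clusters` buckets."""
    clusters = defaultdict(list)
    for n in nodes:
        clusters[hash(n) % num_clusters].append(n)
    return list(clusters.values())

def batches(nodes, strategy, batch_size=512, num_clusters=64):
    """Yield the node set that defines one SGD step under each strategy."""
    nodes = list(nodes)
    if strategy == "global":
        # Global-batched: every step computes gradients on the full graph.
        yield nodes
    elif strategy == "mini":
        # Mini-batched: each step sees a random sample of target nodes
        # (plus, in a real system, their multi-hop receptive fields).
        random.shuffle(nodes)
        for i in range(0, len(nodes), batch_size):
            yield nodes[i:i + batch_size]
    elif strategy == "cluster":
        # Cluster-batched: each step trains on one dense subgraph, which
        # bounds the neighborhood explosion of plain mini-batching.
        yield from partition_clusters(nodes, num_clusters)

# Skeleton of one epoch of cluster-batched training on 10,000 toy nodes.
for step, node_set in enumerate(batches(range(10_000), "cluster")):
    pass  # forward/backward on the subgraph induced by `node_set`
```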
Related papers
- Graph Structure Prompt Learning: A Novel Methodology to Improve Performance of Graph Neural Networks [13.655670509818144]
We propose a novel Graph structure Prompt Learning method (GPL) to enhance the training of graph neural networks (GNNs).
GPL employs task-independent graph structure losses to encourage GNNs to learn intrinsic graph characteristics while simultaneously solving downstream tasks.
In experiments on eleven real-world datasets, GNNs trained with GPL significantly outperform their original performance on node classification, graph classification, and edge prediction tasks.
arXiv Detail & Related papers (2024-07-16T03:59:18Z)
- Sketch-GNN: Scalable Graph Neural Networks with Sublinear Training Complexity [30.2972965458946]
Graph Neural Networks (GNNs) are widely applied to graph learning problems such as node classification.
When scaling GNNs up to larger graphs, we are forced either to train on the complete graph, keeping the full graph adjacency and node embeddings in memory, or to fall back on sampling-based mini-batch training.
This paper proposes a sketch-based algorithm whose training time and memory grow sublinearly with respect to graph size (a toy count-sketch illustration appears after this list).
arXiv Detail & Related papers (2024-06-21T18:22:11Z)
- GraphAlign: Pretraining One Graph Neural Network on Multiple Graphs via Feature Alignment [30.56443056293688]
Graph self-supervised learning (SSL) holds considerable promise for mining and learning with graph-structured data.
In this work, we aim to pretrain one graph neural network (GNN) on a varied collection of graphs endowed with rich node features.
We present a general GraphAlign method that can be seamlessly integrated into the existing graph SSL framework.
arXiv Detail & Related papers (2024-06-05T05:22:32Z)
- Training Graph Neural Networks on Growing Stochastic Graphs [114.75710379125412]
Graph Neural Networks (GNNs) rely on graph convolutions to exploit meaningful patterns in networked data.
We propose to learn GNNs on very large graphs by leveraging the limit object of a sequence of growing graphs, the graphon.
arXiv Detail & Related papers (2022-10-27T16:00:45Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown a powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Increase and Conquer: Training Graph Neural Networks on Growing Graphs [116.03137405192356]
We consider the problem of learning a graphon neural network (WNN) by training GNNs on graphs Bernoulli-sampled from the graphon.
Inspired by these results, we propose an algorithm to learn GNNs on large-scale graphs that, starting from a moderate number of nodes, successively increases the size of the graph during training (see the growing-graph sketch after this list).
arXiv Detail & Related papers (2021-06-07T15:05:59Z)
- Co-embedding of Nodes and Edges with Graph Neural Networks [13.020745622327894]
Graph embedding is a way to transform and encode graph-structured data that lives in a high-dimensional, non-Euclidean feature space.
CensNet is a general graph embedding framework, which embeds both nodes and edges to a latent feature space.
Our approach achieves or matches the state-of-the-art performance in four graph learning tasks.
arXiv Detail & Related papers (2020-10-25T22:39:31Z)
- Graph Contrastive Learning with Augmentations [109.23158429991298]
We propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data.
We show that our framework can produce graph representations of similar or better generalizability, transferability, and robustness compared to state-of-the-art methods.
arXiv Detail & Related papers (2020-10-22T20:13:43Z)
- Multilevel Graph Matching Networks for Deep Graph Similarity Learning [79.3213351477689]
We propose a multi-level graph matching network (MGMN) framework for computing the graph similarity between any pair of graph-structured objects.
To compensate for the lack of standard benchmark datasets, we have created and collected a set of datasets for both the graph-graph classification and graph-graph regression tasks.
Comprehensive experiments demonstrate that MGMN consistently outperforms state-of-the-art baseline models on both the graph-graph classification and graph-graph regression tasks.
arXiv Detail & Related papers (2020-07-08T19:48:19Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model, which utilizes an efficient approximation of information diffusion in GNNs (a toy push-style sketch of this approximation appears after this list).
In addition to being faster, PPRGo is inherently scalable and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes of a large-scale graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
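For the Sketch-GNN entry above, here is a toy illustration of the general sketching idea; the authors' actual construction is more involved, and all names here are illustrative. A count sketch compresses an n-by-d node-feature matrix into c rows with c much smaller than n, so downstream memory depends on the sketch size c rather than the graph size n.

```python
import numpy as np

def count_sketch(X, c, seed=0):
    """Project rows of X (n x d) into c buckets with random signs."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    buckets = rng.integers(0, c, size=n)     # h: [n] -> [c]
    signs = rng.choice([-1.0, 1.0], size=n)  # s: [n] -> {-1, +1}
    S = np.zeros((c, X.shape[1]))
    np.add.at(S, buckets, signs[:, None] * X)  # S[h(i)] += s(i) * X[i]
    return S, buckets, signs

def unsketch_row(S, buckets, signs, i):
    """Unbiased (but noisy) estimate of X[i] recovered from the sketch."""
    return signs[i] * S[buckets[i]]

X = np.random.randn(100_000, 64)    # 100k nodes, 64-dim features
S, h, s = count_sketch(X, c=1_024)  # memory ~ 1k rows instead of 100k
x0_hat = unsketch_row(S, h, s, 0)   # approximate recovery of X[0]
```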
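The "Increase and Conquer" entry describes a concrete schedule: Bernoulli-sample graphs of growing size from a fixed graphon and keep training the same GNN weights on each. A minimal sketch under assumed names; `train_gnn` is a hypothetical training step, left as a comment.

```python
import numpy as np

def sample_from_graphon(W, n, rng):
    """Bernoulli-sample an n-node graph: latents u_i ~ U[0,1],
    edge (i, j) present with probability W(u_i, u_j)."""
    u = rng.random(n)
    P = W(u[:, None], u[None, :])
    A = (rng.random((n, n)) < P).astype(float)
    return np.triu(A, 1) + np.triu(A, 1).T  # symmetric, no self-loops

W = lambda x, y: 0.8 * np.exp(-3.0 * np.abs(x - y))  # example graphon
rng = np.random.default_rng(0)

weights = None  # shared GNN parameters across all graph sizes
for n in [100, 200, 400, 800, 1600]:  # successively grow the graph
    A = sample_from_graphon(W, n, rng)
    # weights = train_gnn(A, weights)  # hypothetical training step
```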
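Finally, for the PPRGo entry: the "efficient approximation of information diffusion" is personalized PageRank, which can be approximated with a local push procedure whose cost is independent of the total graph size. Below is a toy, non-lazy variant of the Andersen-Chung-Lang push, illustrative only and not the authors' implementation.

```python
from collections import defaultdict

def approx_ppr(adj, source, alpha=0.15, eps=1e-4):
    """Return a sparse approximate PPR vector for `source`: push mass
    until every node's residual falls below eps * degree(node)."""
    p = defaultdict(float)                 # settled PPR mass
    r = defaultdict(float, {source: 1.0})  # residual mass still to push
    queue = [source]
    while queue:
        u = queue.pop()
        deg = len(adj[u]) or 1
        if r[u] < eps * deg:
            continue                       # stale queue entry, skip
        p[u] += alpha * r[u]               # absorb a share at u
        share = (1 - alpha) * r[u] / deg   # spread the rest to neighbors
        r[u] = 0.0
        for v in adj[u]:
            r[v] += share
            if r[v] >= eps * (len(adj[v]) or 1):
                queue.append(v)
    return dict(p)

# Tiny example graph as an adjacency list.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(approx_ppr(adj, source=0))  # sparse diffusion weights around node 0
```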
This list is automatically generated from the titles and abstracts of the papers on this site.