Distributed Graph Neural Network Training: A Survey
- URL: http://arxiv.org/abs/2211.00216v2
- Date: Fri, 25 Aug 2023 07:26:35 GMT
- Title: Distributed Graph Neural Network Training: A Survey
- Authors: Yingxia Shao, Hongzheng Li, Xizhi Gu, Hongbo Yin, Yawen Li, Xupeng
Miao, Wentao Zhang, Bin Cui, Lei Chen
- Abstract summary: Graph neural networks (GNNs) are a class of deep learning models that are trained on graphs and have been successfully applied in various domains.
Despite the effectiveness of GNNs, it remains challenging for GNNs to efficiently scale to large graphs.
As a remedy, distributed computing has become a promising solution for training large-scale GNNs.
- Score: 51.77035975191926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) are a class of deep learning models that are
trained on graphs and have been successfully applied in various domains.
Despite the effectiveness of GNNs, it remains challenging for GNNs to
efficiently scale to large graphs. As a remedy, distributed computing has become a
promising solution for training large-scale GNNs, since it can provide
abundant computing resources. However, the dependencies imposed by the graph
structure make high-efficiency distributed GNN training difficult to achieve,
as it suffers from massive communication and workload imbalance. In recent
years, many efforts have been made on distributed GNN training, and an array of
training algorithms and systems have been proposed. Yet, there is a lack of a
systematic review of the optimization techniques for the distributed execution
of GNN training. In this survey, we analyze three major challenges in
distributed GNN training: massive feature communication, loss of
model accuracy, and workload imbalance. We then introduce a new taxonomy of the
optimization techniques in distributed GNN training that address these
challenges. The new taxonomy classifies existing techniques into four
categories: GNN data partition, GNN batch generation, GNN execution
model, and GNN communication protocol. We discuss the techniques in
each category in detail. Finally, we summarize existing distributed GNN systems for
multi-GPU, GPU-cluster, and CPU-cluster settings, respectively, and discuss
future directions for distributed GNN training.
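To make the "GNN data partition" category concrete, the hedged sketch below shows a toy edge-cut partition and how it determines the halo (ghost) nodes whose features must be fetched from remote workers, which is the "massive feature communication" the survey highlights. The hash-based placement and all helper names are illustrative assumptions, not techniques prescribed by the survey.

```python
# Hypothetical sketch (not from the survey): an edge-cut "GNN data partition"
# and the halo (ghost) nodes whose features must cross the network during
# distributed GNN training.
from collections import defaultdict

def edge_cut_partition(edges, num_parts):
    """Assign each node to a partition by hashing, then derive per-partition
    local nodes and halo nodes (remote neighbours whose features must be
    fetched from other workers)."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    owner = {node: hash(node) % num_parts for node in adj}  # toy placement policy

    parts = [{"nodes": set(), "halo": set()} for _ in range(num_parts)]
    for node, p in owner.items():
        parts[p]["nodes"].add(node)
        for nbr in adj[node]:
            if owner[nbr] != p:           # neighbour lives on another worker
                parts[p]["halo"].add(nbr)  # its features must be communicated
    return parts

if __name__ == "__main__":
    toy_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
    for i, part in enumerate(edge_cut_partition(toy_edges, num_parts=2)):
        print(f"partition {i}: local={sorted(part['nodes'])} "
              f"halo={sorted(part['halo'])}")
```

The size of each partition's halo set is a direct proxy for the feature traffic it generates, which is why partition quality is treated as a first-order concern in the survey's taxonomy.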
Related papers
- Stealing Training Graphs from Graph Neural Networks [54.52392250297907]
Graph Neural Networks (GNNs) have shown promising results in modeling graphs in various tasks.
As neural networks can memorize the training samples, the model parameters of GNNs have a high risk of leaking private training data.
We investigate a novel problem of stealing graphs from trained GNNs.
arXiv Detail & Related papers (2024-11-17T23:15:36Z) - CATGNN: Cost-Efficient and Scalable Distributed Training for Graph Neural Networks [7.321893519281194]
Existing distributed systems load the entire graph in memory for graph partitioning.
We propose CATGNN, a cost-efficient and scalable distributed GNN training system.
We also propose a novel streaming partitioning algorithm named SPRING for distributed GNN training.
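For intuition only, here is a hedged sketch of a one-pass greedy streaming partitioner in the style of linear deterministic greedy: it never holds the whole graph in memory, which is the motivation CATGNN cites. This is not the SPRING algorithm; the scoring rule and all names are assumptions made for illustration.

```python
# Hypothetical sketch of greedy streaming partitioning; NOT the SPRING
# algorithm from CATGNN.
from collections import defaultdict

def greedy_stream_partition(edge_stream, num_parts, capacity):
    """Assign nodes on the fly as edges stream in, so the full graph never
    resides in memory."""
    assignment = {}
    load = [0] * num_parts
    nbr_count = defaultdict(lambda: [0] * num_parts)

    def place(node):
        if node in assignment:
            return
        # Favour the partition already holding most of this node's seen
        # neighbours, discounted by how full that partition is.
        score = lambda p: nbr_count[node][p] * (1.0 - load[p] / capacity)
        p = max(range(num_parts), key=lambda i: (score(i), -load[i]))
        assignment[node] = p
        load[p] += 1

    for u, v in edge_stream:              # edges arrive one at a time
        place(u)
        place(v)
        nbr_count[u][assignment[v]] += 1  # remember where neighbours went
        nbr_count[v][assignment[u]] += 1
    return assignment
```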
arXiv Detail & Related papers (2024-04-02T20:55:39Z) - Graph Ladling: Shockingly Simple Parallel GNN Training without
Intermediate Communication [100.51884192970499]
GNNs are a powerful family of neural networks for learning over graphs.
Scaling GNNs by either deepening or widening suffers from prevalent issues such as unhealthy gradients, over-smoothing, and information squashing.
We propose not to deepen or widen current GNNs, but instead present a data-centric perspective of model soups tailored for GNNs.
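As a hedged illustration of the model-soup idea mentioned above (candidate GNNs trained independently with no intermediate communication, then merged by weight averaging), the sketch below shows uniform and greedy soups over NumPy parameter dictionaries; the `validate` callback and all names are assumptions, not the paper's exact recipe.

```python
# Hypothetical sketch of a "model soup" for GNNs: independently trained
# candidates are merged post hoc by averaging their parameters.
import numpy as np

def average_weights(state_dicts):
    """Uniform soup: element-wise mean of the candidates' parameters."""
    keys = state_dicts[0].keys()
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}

def greedy_soup(state_dicts, validate):
    """Greedy soup: keep a candidate only if adding it to the average
    does not hurt validation accuracy (validate maps weights -> score)."""
    ranked = sorted(state_dicts, key=validate, reverse=True)
    soup, best = [ranked[0]], validate(ranked[0])
    for cand in ranked[1:]:
        trial = average_weights(soup + [cand])
        score = validate(trial)
        if score >= best:
            soup, best = soup + [cand], score
    return average_weights(soup), best
```

Because merging happens only after training, the candidates can be trained on separate machines or graph shards with no synchronization during training.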
arXiv Detail & Related papers (2023-06-18T03:33:46Z) - GNN-Ensemble: Towards Random Decision Graph Neural Networks [3.7620848582312405]
Graph Neural Networks (GNNs) have enjoyed wide spread applications in graph-structured data.
GNNs are required to learn latent patterns from a limited amount of training data to perform inferences on a vast amount of test data.
In this paper, we push ensemble learning of GNNs one step forward, improving accuracy, robustness, and resistance to adversarial attacks.
arXiv Detail & Related papers (2023-03-20T18:24:01Z) - A Comprehensive Survey on Distributed Training of Graph Neural Networks [59.785830738482474]
Graph neural networks (GNNs) have been demonstrated to be a powerful algorithmic model in broad application fields.
To scale GNN training up for large-scale and ever-growing graphs, the most promising solution is distributed training.
The volume of related research on distributed GNN training is exceptionally vast, accompanied by an extraordinarily rapid pace of publication.
arXiv Detail & Related papers (2022-11-10T06:22:12Z) - Characterizing and Understanding Distributed GNN Training on GPUs [2.306379679349986]
Graph neural network (GNN) has been demonstrated to be a powerful model in many domains for its effectiveness in learning over graphs.
To scale GNN training for large graphs, a widely adopted approach is distributed training which accelerates training using multiple computing nodes.
arXiv Detail & Related papers (2022-04-18T03:47:28Z) - Shift-Robust GNNs: Overcoming the Limitations of Localized Graph
Training data [52.771780951404565]
Shift-Robust GNN (SR-GNN) is designed to account for distributional differences between biased training data and the graph's true inference distribution.
We show that SR-GNN outperforms other GNN baselines in accuracy, eliminating at least 40% of the negative effects introduced by biased training data.
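The sketch below illustrates the general shift-correction idea in a hedged way: estimate an importance weight per training node from a crude density ratio and use it in a weighted loss. It is a generic importance-weighting example, not the SR-GNN method itself; the binning scheme and function names are assumptions.

```python
# Hypothetical sketch of importance weighting against biased training data;
# NOT the SR-GNN algorithm itself.
import numpy as np

def importance_weights(train_feats, target_feats, eps=1e-6):
    """Crude per-bin density-ratio estimate p_target / p_train over a
    1-D feature summary of each node."""
    lo = min(train_feats.min(), target_feats.min())
    hi = max(train_feats.max(), target_feats.max())
    bins = np.linspace(lo, hi, 11)                       # 10 equal-width bins
    p_train, _ = np.histogram(train_feats, bins=bins, density=True)
    p_target, _ = np.histogram(target_feats, bins=bins, density=True)
    idx = np.clip(np.digitize(train_feats, bins) - 1, 0, 9)
    return (p_target[idx] + eps) / (p_train[idx] + eps)

def weighted_cross_entropy(probs, labels, weights):
    """Importance-weighted negative log-likelihood over training nodes."""
    nll = -np.log(probs[np.arange(len(labels)), labels] + 1e-12)
    return float(np.sum(weights * nll) / np.sum(weights))
```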
arXiv Detail & Related papers (2021-08-02T18:00:38Z) - Optimization of Graph Neural Networks: Implicit Acceleration by Skip
Connections and More Depth [57.10183643449905]
Graph Neural Networks (GNNs) have been studied from the lens of expressive power and generalization.
We study the optimization dynamics of GNNs, focusing on the effects of skip connections and depth.
Our results provide the first theoretical support for the success of GNNs.
arXiv Detail & Related papers (2021-05-10T17:59:01Z) - Computing Graph Neural Networks: A Survey from Algorithms to
Accelerators [2.491032752533246]
Graph Neural Networks (GNNs) have exploded onto the machine learning scene in recent years owing to their capability to model and learn from graph-structured data.
This paper makes two main contributions: it reviews the field of GNNs from the perspective of computing, and it provides an in-depth analysis of current software and hardware acceleration schemes.
arXiv Detail & Related papers (2020-09-30T22:29:27Z)