Scalable Neural Network Training over Distributed Graphs
- URL: http://arxiv.org/abs/2302.13053v3
- Date: Sun, 11 Feb 2024 08:52:32 GMT
- Title: Scalable Neural Network Training over Distributed Graphs
- Authors: Aashish Kolluri, Sarthak Choudhary, Bryan Hooi, Prateek Saxena
- Abstract summary: Real-world graph data must often be stored across many machines, not just because of capacity constraints.
Network communication is costly and becomes the main bottleneck to train GNNs.
RETEXO is the first framework that can be used to train GNNs at all network decentralization levels.
- Score: 45.151244961817454
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Graph neural networks (GNNs) fuel diverse machine learning tasks involving
graph-structured data, ranging from predicting protein structures to serving
personalized recommendations. Real-world graph data must often be stored
distributed across many machines not just because of capacity constraints, but
because of compliance with data residency or privacy laws. In such setups,
network communication is costly and becomes the main bottleneck to train GNNs.
Optimizations for distributed GNN training have targeted data-level
improvements so far -- via caching, network-aware partitioning, and
sub-sampling -- that work for data center-like setups where graph data is
accessible to a single entity and data transfer costs are ignored.
We present RETEXO, the first framework which eliminates the severe
communication bottleneck in distributed GNN training while respecting any given
data partitioning configuration. The key is a new training procedure, lazy
message passing, that reorders the sequence of training GNN elements. RETEXO
achieves 1-2 orders of magnitude reduction in network data costs compared to
standard GNN training, while retaining accuracy. RETEXO scales gracefully with
increasing decentralization and decreasing bandwidth. It is the first framework
that can be used to train GNNs at all network decentralization levels --
including centralized data-center networks, wide area networks, proximity
networks, and edge networks.
Related papers
- MassiveGNN: Efficient Training via Prefetching for Massively Connected Distributed Graphs [11.026326555186333]
This paper develops a parameterized continuous prefetch and eviction scheme on top of the state-of-the-art Amazon DistDGL distributed GNN framework.
It demonstrates about 15-40% improvement in end-to-end training performance on the National Energy Research Scientific Computing Center's (NERSC) Perlmutter supercomputer.
arXiv Detail & Related papers (2024-10-30T05:10:38Z)
- Distributed Training of Large Graph Neural Networks with Variable Communication Rates [71.7293735221656]
Training Graph Neural Networks (GNNs) on large graphs presents unique challenges due to the large memory and computing requirements.
Distributed GNN training, where the graph is partitioned across multiple machines, is a common approach to training GNNs on large graphs.
We introduce a variable compression scheme for reducing the communication volume in distributed GNN training without compromising the accuracy of the learned model.
arXiv Detail & Related papers (2024-06-25T14:57:38Z)
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- ABC: Aggregation before Communication, a Communication Reduction Framework for Distributed Graph Neural Network Training and Effective Partition [0.0]
Graph Neural Networks (GNNs) are neural models tailored to graph-structured data and have shown superior performance in learning representations for such data.
In this paper, we study the communication complexity of distributed GNN training.
We show that the new partition paradigm is particularly well suited to dynamic graphs, where controlling edge placement is infeasible because the graph-changing process is unknown.
arXiv Detail & Related papers (2022-12-11T04:54:01Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown a powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks [22.728439336309858]
We propose a communication-efficient distributed GNN training technique named Learn Locally, Correct Globally (LLCG).
LLCG first has each machine train a GNN on its local data, ignoring the dependency between nodes on different machines, and then sends the locally trained models to a server for periodic model averaging.
We rigorously analyze the convergence of distributed methods with periodic model averaging for training GNNs and show that naively applying periodic model averaging while ignoring the dependency between nodes incurs an irreducible residual error (a minimal sketch of this local-training-plus-averaging pattern appears after this list).
arXiv Detail & Related papers (2021-11-16T03:07:01Z)
- SpreadGNN: Serverless Multi-task Federated Learning for Graph Neural Networks [13.965982814292971]
Graph Neural Networks (GNNs) are the first-choice methods for graph machine learning problems.
Centralizing a massive amount of real-world graph data for GNN training is prohibitive due to user-side privacy concerns.
This work proposes SpreadGNN, a novel multi-task federated training framework.
arXiv Detail & Related papers (2021-06-04T22:20:47Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- An Uncoupled Training Architecture for Large Graph Learning [20.784230322205232]
We present Node2Grids, a flexible uncoupled training framework for embedding graph data into grid-like data.
By ranking each node's influence by degree, Node2Grids selects the most influential first-order and second-order neighbors and fuses their information with the central node.
To further improve the efficiency of downstream tasks, a simple CNN-based neural network is employed to capture the significant information from the mapped grid-like data.
arXiv Detail & Related papers (2020-03-21T11:49:16Z)
- Graphs, Convolutions, and Neural Networks: From Graph Filters to Graph Neural Networks [183.97265247061847]
We leverage graph signal processing to characterize the representation space of graph neural networks (GNNs).
We discuss the role of graph convolutional filters in GNNs and show that any architecture built with such filters has the fundamental properties of permutation equivariance and stability to changes in the topology (the standard polynomial filter form is written out after this list).
We also study the use of GNNs in recommender systems and learning decentralized controllers for robot swarms.
arXiv Detail & Related papers (2020-03-08T13:02:15Z)
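The "Learn Locally, Correct Globally" entry above describes its training loop concretely enough to sketch the general pattern. Below is a minimal, assumption-laden illustration of local training with periodic model averaging; it is not the paper's algorithm (which additionally performs a server-side global correction), and all names (local_step, average_models, the partitions dictionaries) are hypothetical.
```python
# Sketch of local training on partitioned graph data plus periodic model
# averaging, the pattern described in the LLCG summary above (global
# correction step omitted). Illustration only, not the paper's code.
import copy
import torch
import torch.nn as nn

def local_step(model, feats, adj_local, labels, lr=0.01):
    # One gradient step on a machine's local partition, using only edges
    # whose endpoints both live on this machine (cross-machine edges are
    # ignored, which is the source of the residual error).
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    deg = adj_local.sum(dim=1, keepdim=True).clamp(min=1)
    hidden = torch.relu(adj_local @ model[0](feats) / deg)
    loss = nn.functional.cross_entropy(model[1](hidden), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()

def average_models(models):
    # Server step: replace every worker's parameters with their mean.
    avg = copy.deepcopy(models[0])
    with torch.no_grad():
        for name, p in avg.named_parameters():
            p.copy_(torch.stack([dict(m.named_parameters())[name]
                                 for m in models]).mean(0))
        for m in models:
            m.load_state_dict(avg.state_dict())

def train(partitions, rounds=10, local_steps=5, hidden=32, num_classes=7):
    d = partitions[0]["feats"].shape[1]
    models = [nn.Sequential(nn.Linear(d, hidden), nn.Linear(hidden, num_classes))
              for _ in partitions]
    for _ in range(rounds):
        for m, part in zip(models, partitions):
            for _ in range(local_steps):
                local_step(m, part["feats"], part["adj"], part["labels"])
        average_models(models)   # periodic model averaging
    return models[0]
```
A caller would supply one dictionary per machine containing that machine's node features, intra-partition adjacency, and labels; the averaging round is the only point at which anything crosses the network.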
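The graph-filter entry above appeals to the standard polynomial graph filter from graph signal processing; writing it out makes the permutation-equivariance claim concrete. The notation below is the common textbook form, not anything specific to that paper's results.
```latex
% A polynomial graph filter: S is a graph shift operator (e.g. adjacency
% matrix or Laplacian), x a graph signal, and h_k the filter taps.
\[
  \mathbf{H}(\mathbf{S})\,\mathbf{x} \;=\; \sum_{k=0}^{K} h_k\,\mathbf{S}^{k}\,\mathbf{x}
\]
% Permutation equivariance follows because relabeling the nodes with a
% permutation matrix P merely conjugates the shift operator:
\[
  \mathbf{H}\!\left(\mathbf{P}^{\top}\mathbf{S}\mathbf{P}\right)\left(\mathbf{P}^{\top}\mathbf{x}\right)
  \;=\; \mathbf{P}^{\top}\,\mathbf{H}(\mathbf{S})\,\mathbf{x}.
\]
```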