Distributed Training of Graph Convolutional Networks using Subgraph
Approximation
- URL: http://arxiv.org/abs/2012.04930v1
- Date: Wed, 9 Dec 2020 09:23:49 GMT
- Title: Distributed Training of Graph Convolutional Networks using Subgraph
Approximation
- Authors: Alexandra Angerd, Keshav Balasubramanian, Murali Annavaram
- Abstract summary: We propose a training strategy that mitigates the information lost across multiple partitions of a graph through a subgraph approximation scheme.
The subgraph approximation approach helps the distributed training system converge to single-machine accuracy.
- Score: 72.89940126490715
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern machine learning techniques are successfully being adapted to data
modeled as graphs. However, many real-world graphs are very large and do not fit
in memory, often making the problem of training machine learning models on them
intractable. Distributed training has been successfully employed to alleviate
memory problems and speed up training in machine learning domains in which the
input data is assumed to be independent and identically distributed (i.i.d.).
However, distributing the training of non-i.i.d. data such as graphs that are
used as training inputs in Graph Convolutional Networks (GCNs) causes accuracy
problems since information is lost at the graph partitioning boundaries.
In this paper, we propose a training strategy that mitigates the information
lost across multiple partitions of a graph through a subgraph approximation
scheme. Our proposed approach augments each sub-graph with a small amount of
edge and vertex information that is approximated from all other sub-graphs. The
subgraph approximation approach helps the distributed training system converge
to single-machine accuracy, while keeping the memory footprint low and
minimizing synchronization overhead between the machines.
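The abstract only sketches the mechanism, so the following minimal numpy example illustrates the general idea under assumed details; it is not the authors' actual algorithm. Each partition aggregates over its local edges only, and a single averaged proxy vertex stands in for the remote neighbours that would otherwise be lost at the partition boundary. The toy graph, the two-way partition, and the single-proxy approximation are all hypothetical simplifications.

```python
# Illustrative sketch only (not the paper's implementation): compare full-graph
# neighbourhood aggregation, naive partitioned aggregation, and partitioned
# aggregation augmented with an approximated boundary proxy vertex.
import numpy as np

rng = np.random.default_rng(0)

# Toy undirected graph: adjacency matrix and node features (6 nodes, 4 features).
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
X = rng.random((6, 4))

parts = [np.array([0, 1, 2]), np.array([3, 4, 5])]   # two partitions

def aggregate(adj, feats):
    """One round of mean-neighbourhood aggregation (the core GCN operation)."""
    deg = adj.sum(axis=1, keepdims=True) + 1e-9
    return (adj @ feats) / deg

full = aggregate(A, X)            # single-machine reference

for p, own in enumerate(parts):
    other = np.setdiff1d(np.arange(A.shape[0]), own)

    # Naive partitioned training: cross-partition edges are simply dropped.
    local = aggregate(A[np.ix_(own, own)], X[own])

    # Hypothetical approximation step: represent all remote neighbours by one
    # averaged proxy vertex and connect every boundary vertex to it.  The real
    # scheme in the paper is more refined; this only shows the idea of adding a
    # small amount of approximated edge/vertex information to each sub-graph.
    proxy_feat = X[other].mean(axis=0, keepdims=True)
    boundary = (A[np.ix_(own, other)].sum(axis=1) > 0).astype(float)
    A_aug = np.block([
        [A[np.ix_(own, own)], boundary[:, None]],
        [boundary[None, :],   np.zeros((1, 1))],
    ])
    X_aug = np.vstack([X[own], proxy_feat])
    approx = aggregate(A_aug, X_aug)[: len(own)]

    print(f"partition {p}: |full-local| = {np.abs(full[own] - local).mean():.4f}, "
          f"|full-approx| = {np.abs(full[own] - approx).mean():.4f}")
```

Running the sketch typically shows the proxy-augmented aggregation landing closer to the full-graph result than the naive partition-only aggregation, which is the kind of gap the paper's subgraph approximation is designed to close at scale.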
Related papers
- GraphScale: A Framework to Enable Machine Learning over Billion-node Graphs [6.418397511692011]
We propose a unified framework for both supervised and unsupervised learning to store and process large graph data in a distributed manner.
The key insight in our design is the separation of workers who store data and those who perform the training.
Our experiments show that GraphScale outperforms state-of-the-art methods for distributed training of both GNNs and node embeddings.
arXiv Detail & Related papers (2024-07-22T08:09:36Z)
- Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding [51.75091298017941]
This paper proposes a novel Deep Manifold (Variational) Graph Auto-Encoder (DMVGAE/DMGAE) for attributed graph data.
The proposed method surpasses state-of-the-art baseline algorithms by a significant margin on different downstream tasks across popular datasets.
arXiv Detail & Related papers (2024-01-12T17:57:07Z)
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- Scalable Graph Convolutional Network Training on Distributed-Memory Systems [5.169989177779801]
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs.
Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges.
We propose a highly parallel training algorithm that scales to large processor counts.
arXiv Detail & Related papers (2022-12-09T17:51:13Z)
- Scaling R-GCN Training with Graph Summarization [71.06855946732296]
Training of Relational Graph Convolutional Networks (R-GCN) does not scale well with the size of the graph.
In this work, we experiment with the use of graph summarization techniques to compress the graph.
We obtain reasonable results on the AIFB, MUTAG and AM datasets.
arXiv Detail & Related papers (2022-03-05T00:28:43Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity at modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- Distributed Graph Learning with Smooth Data Priors [61.405131495287755]
We propose a novel distributed graph learning algorithm, which makes it possible to infer a graph from signal observations on the nodes.
Our results show that the distributed approach has a lower communication cost than a centralised algorithm without compromising the accuracy in the inferred graph.
arXiv Detail & Related papers (2021-12-11T00:52:02Z)
- Learning Massive Graph Embeddings on a Single Machine [11.949017733445624]
A graph embedding is a fixed length vector representation for each node (and/or edge-type) in a graph.
Current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement.
We propose Gaius, a system for efficient training of graph embeddings.
arXiv Detail & Related papers (2021-01-20T23:17:31Z)
- Sub-graph Contrast for Scalable Self-Supervised Graph Representation Learning [21.0019144298605]
Existing graph neural networks fed with the complete graph data are not scalable due to their computation and memory costs.
Subg-Con is proposed by utilizing the strong correlation between central nodes and their sampled subgraphs to capture regional structure information.
Compared with existing graph representation learning approaches, Subg-Con has prominent performance advantages in weaker supervision requirements, model learning scalability, and parallelization.
arXiv Detail & Related papers (2020-09-22T01:58:19Z)