Related papers: Slicing Input Features to Accelerate Deep Learning: A Case Study with Graph Neural Networks

Slicing Input Features to Accelerate Deep Learning: A Case Study with Graph Neural Networks

URL: http://arxiv.org/abs/2408.11500v1
Date: Wed, 21 Aug 2024 10:18:41 GMT
Title: Slicing Input Features to Accelerate Deep Learning: A Case Study with Graph Neural Networks
Authors: Zhengjia Xu, Dingyang Lyu, Jinghui Zhang,
Abstract summary: This paper introduces SliceGCN, a feature-sliced distributed large-scale graph learning method. It aims to avoid accuracy loss typically associated with mini-batch training and to reduce inter- GPU communication. Experiments were conducted on six node classification datasets, yielding some interesting analytical results.
Score: 0.24578723416255746
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As graphs grow larger, full-batch GNN training becomes hard for single GPU memory. Therefore, to enhance the scalability of GNN training, some studies have proposed sampling-based mini-batch training and distributed graph learning. However, these methods still have drawbacks, such as performance degradation and heavy communication. This paper introduces SliceGCN, a feature-sliced distributed large-scale graph learning method. SliceGCN slices the node features, with each computing device, i.e., GPU, handling partial features. After each GPU processes its share, partial representations are obtained and concatenated to form complete representations, enabling a single GPU's memory to handle the entire graph structure. This aims to avoid the accuracy loss typically associated with mini-batch training (due to incomplete graph structures) and to reduce inter-GPU communication during message passing (the forward propagation process of GNNs). To study and mitigate potential accuracy reductions due to slicing features, this paper proposes feature fusion and slice encoding. Experiments were conducted on six node classification datasets, yielding some interesting analytical results. These results indicate that while SliceGCN does not enhance efficiency on smaller datasets, it does improve efficiency on larger datasets. Additionally, we found that SliceGCN and its variants have better convergence, feature fusion and slice encoding can make training more stable, reduce accuracy fluctuations, and this study also discovered that the design of SliceGCN has a potentially parameter-efficient nature.

Related papers

Distributed Graph Neural Network Inference With Just-In-Time Compilation For Industry-Scale Graphs [6.924892368183222]
Graph neural networks (GNNs) have delivered remarkable results in various fields. The rapid increase in the scale of graph data has introduced significant performance bottlenecks for GNN inference. This paper introduces an innovative processing paradgim for distributed graph learning that abstracts GNNs with a new set of programming interfaces.
arXiv Detail & Related papers (2025-03-08T13:26:59Z)
Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs. Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors. We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN)
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training. We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
DistTGL: Distributed Memory-Based Temporal Graph Neural Network Training [18.52206409432894]
DistTGL is an efficient and scalable solution to train memory-based TGNNs on distributed GPU clusters. In experiments, DistTGL achieves near-linear convergence speedup, outperforming state-of-the-art single-machine method by 14.5% in accuracy and 10.17x in training throughput.
arXiv Detail & Related papers (2023-07-14T22:52:27Z)
NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes. The efficient computation is enabled by a kernerlized Gumbel-Softmax operator. Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
Scalable Graph Convolutional Network Training on Distributed-Memory Systems [5.169989177779801]
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs. Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges. We propose a highly parallel training algorithm that scales to large processor counts.
arXiv Detail & Related papers (2022-12-09T17:51:13Z)
A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs) We present a new ensembling training manner, named EnGCN, to address the existing issues. Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs. Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs. Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing [0.0]
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data. Existing systems are inefficient to train large graphs with billions of nodes and edges with GPUs. This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas.
arXiv Detail & Related papers (2021-12-16T00:37:37Z)
Towards Efficient Graph Convolutional Networks for Point Cloud Handling [181.59146413326056]
We aim at improving the computational efficiency of graph convolutional networks (GCNs) for learning on point clouds. A series of experiments show that optimized networks have reduced computational complexity, decreased memory consumption, and accelerated inference speed.
arXiv Detail & Related papers (2021-04-12T17:59:16Z)
GIST: Distributed Training for Large-Scale Graph Convolutional Networks [18.964079367668262]
GIST is a hybrid layer and graph sampling method, which disjointly partitions the global model into several, smaller sub-GCNs. This distributed framework improves model performance and significantly decreases wall-clock training time. GIST seeks to enable large-scale GCN experimentation with the goal of bridging the existing gap in scale between graph machine learning and deep learning.
arXiv Detail & Related papers (2021-02-20T19:25:38Z)

This list is automatically generated from the titles and abstracts of the papers in this site.