GCNear: A Hybrid Architecture for Efficient GCN Training with
Near-Memory Processing
- URL: http://arxiv.org/abs/2111.00680v1
- Date: Mon, 1 Nov 2021 03:47:07 GMT
- Title: GCNear: A Hybrid Architecture for Efficient GCN Training with
Near-Memory Processing
- Authors: Zhe Zhou and Cong Li and Xuechao Wei and Guangyu Sun
- Abstract summary: Graph Convolutional Networks (GCNs) have become state-of-the-art algorithms for analyzing non-Euclidean graph data.
It is challenging to realize efficient GCN training, especially on large graphs.
This paper presents GCNear, a hybrid architecture to tackle these challenges.
- Score: 8.130391367247793
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, Graph Convolutional Networks (GCNs) have become state-of-the-art
algorithms for analyzing non-Euclidean graph data. However, it is challenging
to realize efficient GCN training, especially on large graphs. The reasons are
threefold: 1) GCN training incurs a substantial memory footprint. Full-batch
training on large graphs even requires hundreds to thousands of gigabytes of
memory to buffer the intermediate data for back-propagation. 2) GCN training
involves both memory-intensive data reduction and computation-intensive
feature/gradient update operations. Such a heterogeneous nature challenges
current CPU/GPU platforms. 3) The irregularity of graphs and the complex
training dataflow jointly increase the difficulty of improving a GCN training
system's efficiency.
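To put the first challenge in perspective, a rough back-of-envelope estimate of the activation footprint follows. The graph size, hidden dimension, layer count, and precision are illustrative assumptions (loosely modeled on a papers100M-scale graph), not figures reported in the paper.
```python
# Back-of-envelope estimate of the activation memory needed to buffer
# intermediate data for back-propagation in full-batch GCN training.
# All sizes are illustrative assumptions, not numbers from the GCNear paper.

num_nodes  = 111_000_000   # e.g., a papers100M-scale graph
hidden_dim = 256           # hidden feature width per layer
num_layers = 3             # GCN layers
bytes_fp32 = 4             # fp32 storage per element

# Every layer's output must be kept for the backward pass.
activations_gb = num_nodes * hidden_dim * num_layers * bytes_fp32 / 1e9
print(f"activations alone: ~{activations_gb:.0f} GB")   # ~340 GB

# Gradients of the same tensors roughly double the requirement, before
# counting the input features and the adjacency structure itself.
print(f"with gradients:    ~{2 * activations_gb:.0f} GB")
```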
This paper presents GCNear, a hybrid architecture to tackle these challenges.
Specifically, GCNear adopts a DIMM-based memory system to provide easy-to-scale
memory capacity. To match the heterogeneous nature, we categorize GCN training
operations as memory-intensive Reduce and computation-intensive Update
operations. We then offload Reduce operations to on-DIMM near-memory engines
(NMEs), making full use of the high aggregated local bandwidth. We adopt a
centralized acceleration engine (CAE) with sufficient computation capacity to
process Update operations. We further propose several
optimization strategies to deal with the irregularity of GCN tasks and improve
GCNear's performance. We also propose a Multi-GCNear system to evaluate the
scalability of GCNear.
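The Reduce/Update split maps onto the two halves of a GCN layer: sparse, irregular neighbor aggregation (memory-bound) and a dense weight transformation (compute-bound). Below is a minimal NumPy/SciPy sketch of that decomposition; the shapes, names, and normalization are illustrative, and the offload comments only paraphrase the operator categorization above, not GCNear's actual hardware interface.
```python
import numpy as np
import scipy.sparse as sp

def gcn_layer(adj_norm: sp.csr_matrix, h: np.ndarray, w: np.ndarray) -> np.ndarray:
    """One GCN layer written as the two operation classes GCNear distinguishes."""
    # Reduce: sparse, irregular neighbor aggregation. Memory-intensive and
    # bandwidth-bound -- the class of work the abstract offloads to on-DIMM NMEs.
    h_agg = adj_norm @ h
    # Update: dense feature transformation (GEMM + nonlinearity).
    # Computation-intensive -- the class of work handled by the CAE.
    return np.maximum(h_agg @ w, 0.0)

# Tiny illustrative example: 4 nodes, 8-dim input features, 16-dim output.
rng = np.random.default_rng(0)
adj = sp.csr_matrix(np.array([[0, 1, 0, 1],
                              [1, 0, 1, 0],
                              [0, 1, 0, 1],
                              [1, 0, 1, 0]], dtype=np.float32))
deg = np.asarray(adj.sum(axis=1)).ravel()
d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg + 1e-9))
adj_norm = d_inv_sqrt @ adj @ d_inv_sqrt            # symmetric normalization
out = gcn_layer(adj_norm,
                rng.standard_normal((4, 8)).astype(np.float32),
                rng.standard_normal((8, 16)).astype(np.float32))
print(out.shape)  # (4, 16)
```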
Related papers
- Efficient Message Passing Architecture for GCN Training on HBM-based FPGAs with Orthogonal Topology On-Chip Networks [0.0]
Graph Convolutional Networks (GCNs) are state-of-the-art deep learning models for representation learning on graphs.
We propose a message-passing architecture that leverages NUMA-based memory access properties.
We also re-engineer the GCN-specific backpropagation algorithm within our proposed accelerator.
arXiv Detail & Related papers (2024-11-06T12:00:51Z)
- Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition [56.26113670151363]
Graph condensation is a data-centric solution to replace the large graph with a small yet informative condensed graph.
Existing GC methods suffer from intricate optimization processes, necessitating excessive computing resources.
We propose a training-free GC framework termed Class-partitioned Graph Condensation (CGC).
CGC achieves state-of-the-art performance with a more efficient condensation process.
arXiv Detail & Related papers (2024-05-22T14:57:09Z)
- Cached Operator Reordering: A Unified View for Fast GNN Training [24.917363701638607]
Graph Neural Networks (GNNs) are a powerful tool for handling structured graph data and addressing tasks such as node classification, graph classification, and clustering.
However, the sparse nature of GNN computation poses new challenges for performance optimization compared to traditional deep neural networks.
We address these challenges by providing a unified view of GNN computation, I/O, and memory.
arXiv Detail & Related papers (2023-08-23T12:27:55Z)
- Scalable Graph Convolutional Network Training on Distributed-Memory Systems [5.169989177779801]
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs.
Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges.
We propose a highly parallel training algorithm that scales to large processor counts.
arXiv Detail & Related papers (2022-12-09T17:51:13Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensemble-style training scheme, named EnGCN, to address these issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike methods based on the Lottery Ticket Hypothesis (LTH), the proposed CGP approach requires no re-training, which significantly reduces computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Boundary Node Sampling [25.32242812045678]
We propose a simple yet effective method dubbed BNS-GCN that adopts random Boundary-Node-Sampling to enable efficient and scalable distributed GCN training; a minimal sketch of the sampling idea appears after this list.
Experiments and ablation studies consistently validate the effectiveness of BNS-GCN, boosting throughput by up to 16.2x and reducing memory usage by up to 58%, while maintaining full-graph accuracy.
arXiv Detail & Related papers (2022-03-21T13:44:37Z)
- Towards Efficient Graph Convolutional Networks for Point Cloud Handling [181.59146413326056]
We aim at improving the computational efficiency of graph convolutional networks (GCNs) for learning on point clouds.
A series of experiments show that optimized networks have reduced computational complexity, decreased memory consumption, and accelerated inference speed.
arXiv Detail & Related papers (2021-04-12T17:59:16Z)
- DeeperGCN: All You Need to Train Deeper GCNs [66.64739331859226]
Graph Convolutional Networks (GCNs) have been drawing significant attention with the power of representation learning on graphs.
Unlike Convolutional Neural Networks (CNNs), which are able to take advantage of stacking very deep layers, GCNs suffer from vanishing gradient, over-smoothing and over-fitting issues when going deeper.
This paper proposes DeeperGCN that is capable of successfully and reliably training very deep GCNs.
arXiv Detail & Related papers (2020-06-13T23:00:22Z)
- L$^2$-GCN: Layer-Wise and Learned Efficient Training of Graph Convolutional Networks [118.37805042816784]
Graph convolution networks (GCN) are increasingly popular in many applications, yet remain notoriously hard to train over large graph datasets.
We propose a novel efficient layer-wise training framework for GCN (L-GCN) that disentangles feature aggregation and feature transformation during training.
Experiments show that L-GCN is faster than the state of the art by at least an order of magnitude, with consistent memory usage that does not depend on dataset size.
arXiv Detail & Related papers (2020-03-30T16:37:56Z)
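The boundary-node-sampling idea in the BNS-GCN entry above can be sketched in a few lines. This is a minimal illustration based only on the summary given here: it assumes a node partitioning is already available, and the function name, the keep-rate parameter p, and the NumPy layout are illustrative, not the paper's actual interface.
```python
import numpy as np

def sample_boundary_nodes(edges: np.ndarray, part: np.ndarray,
                          my_part: int, p: float,
                          rng: np.random.Generator) -> np.ndarray:
    """Randomly keep a fraction p of this partition's boundary nodes.

    Boundary nodes are remote nodes with at least one edge into the local
    partition; only their features would need to be communicated.
    """
    src, dst = edges[:, 0], edges[:, 1]
    # Edges that cross into the local partition from another partition.
    crossing = (part[dst] == my_part) & (part[src] != my_part)
    boundary = np.unique(src[crossing])
    # Per-iteration random sampling: fetch features only for the kept subset,
    # which shrinks both communication volume and activation memory.
    keep = rng.random(boundary.shape[0]) < p
    return boundary[keep]

# Tiny example: 6 nodes split into 2 partitions, keep ~50% of boundary nodes.
edges = np.array([[0, 3], [1, 3], [2, 4], [4, 5], [0, 1]])
part = np.array([0, 0, 0, 1, 1, 1])          # node -> partition id
kept = sample_boundary_nodes(edges, part, my_part=1, p=0.5,
                             rng=np.random.default_rng(0))
print(kept)   # some subset of the boundary nodes {0, 1, 2}
```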