Nimble GNN Embedding with Tensor-Train Decomposition
- URL: http://arxiv.org/abs/2206.10581v1
- Date: Tue, 21 Jun 2022 17:57:35 GMT
- Title: Nimble GNN Embedding with Tensor-Train Decomposition
- Authors: Chunxing Yin, Da Zheng, Israt Nisa, Christos Faloutsos, George Karypis,
Richard Vuduc
- Abstract summary: This paper describes a new method for representing embedding tables of graph neural networks (GNNs) more compactly via tensor-train (TT) decomposition.
In some cases, our model without explicit node features on input can even match the accuracy of models that use node features.
- Score: 10.726368002799765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes a new method for representing embedding tables of graph
neural networks (GNNs) more compactly via tensor-train (TT) decomposition. We
consider the scenario where (a) the graph data lack node features, thereby
requiring the learning of embeddings during training; and (b) we wish to
exploit GPU platforms, where smaller tables are needed to reduce host-to-GPU
communication even for large-memory GPUs. The use of TT enables a compact
parameterization of the embedding, rendering it small enough to fit entirely on
modern GPUs even for massive graphs. When combined with judicious schemes for
initialization and hierarchical graph partitioning, this approach can reduce
the size of node embedding vectors by 1,659 times to 81,362 times on large
publicly available benchmark datasets, achieving comparable or better accuracy
and significant speedups on multi-GPU systems. In some cases, our model without
explicit node features on input can even match the accuracy of models that use
node features.
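To make the TT parameterization concrete, here is a minimal PyTorch sketch of a TT-compressed embedding lookup. The class name `TTEmbedding`, the three-core factorization, the uniform TT-rank, and the random initialization are assumptions made for illustration; the paper's actual implementation additionally relies on its specialized initialization and graph-partitioning schemes.

```python
import torch
import torch.nn as nn


class TTEmbedding(nn.Module):
    """Embedding table factored into three TT cores; the dense N x D table is never stored."""

    def __init__(self, num_nodes_factors, emb_dim_factors, tt_rank):
        super().__init__()
        # num_nodes_factors = (n1, n2, n3) with n1 * n2 * n3 >= number of nodes
        # emb_dim_factors   = (d1, d2, d3) with d1 * d2 * d3 == embedding dimension
        self.n, self.d = num_nodes_factors, emb_dim_factors
        ranks = (1, tt_rank, tt_rank, 1)
        self.cores = nn.ParameterList([
            nn.Parameter(0.1 * torch.randn(ranks[k], self.n[k], self.d[k], ranks[k + 1]))
            for k in range(3)
        ])

    def forward(self, node_ids):
        # Mixed-radix digits of each node id: i -> (i1, i2, i3).
        n1, n2, n3 = self.n
        digits = (node_ids // (n2 * n3), (node_ids // n3) % n2, node_ids % n3)
        out = None  # running product: (batch, product of d_k seen so far, current rank)
        for core, idx in zip(self.cores, digits):
            g = core[:, idx].permute(1, 0, 2, 3)       # (batch, r_prev, d_k, r_next)
            if out is None:
                out = g.squeeze(1)                     # first core has r_prev == 1
            else:
                b, p, r_prev = out.shape
                d_k, r_next = core.shape[2], core.shape[3]
                out = torch.bmm(out, g.reshape(b, r_prev, d_k * r_next))
                out = out.reshape(b, p * d_k, r_next)  # contract the rank, stack d_k
        return out.reshape(node_ids.shape[0], -1)      # (batch, d1 * d2 * d3)


# Example: up to 10^6 nodes, 128-dimensional embeddings, TT-rank 16.
emb = TTEmbedding(num_nodes_factors=(100, 100, 100), emb_dim_factors=(8, 4, 4), tt_rank=16)
vectors = emb(torch.tensor([0, 42, 999_999]))          # shape (3, 128)
```

With these example shapes the three cores hold about 1.2e5 parameters versus 1.28e8 for the dense table, roughly a thousand-fold reduction; the much larger ratios reported in the abstract come from the paper's specific choices of factorization, rank, and partitioning on the benchmark graphs.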
Related papers
- Slicing Input Features to Accelerate Deep Learning: A Case Study with Graph Neural Networks [0.24578723416255746]
This paper introduces SliceGCN, a feature-sliced distributed large-scale graph learning method.
It aims to avoid accuracy loss typically associated with mini-batch training and to reduce inter-GPU communication.
Experiments were conducted on six node classification datasets, yielding some interesting analytical results.
arXiv Detail & Related papers (2024-08-21T10:18:41Z)
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- DistTGL: Distributed Memory-Based Temporal Graph Neural Network Training [18.52206409432894]
DistTGL is an efficient and scalable solution to train memory-based TGNNs on distributed GPU clusters.
In experiments, DistTGL achieves near-linear convergence speedup, outperforming the state-of-the-art single-machine method by 14.5% in accuracy and 10.17x in training throughput.
arXiv Detail & Related papers (2023-07-14T22:52:27Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- Scalable Graph Convolutional Network Training on Distributed-Memory Systems [5.169989177779801]
Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs.
Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges.
We propose a highly parallel training algorithm that scales to large processor counts.
arXiv Detail & Related papers (2022-12-09T17:51:13Z)
- Embedding Compression with Hashing for Efficient Representation Learning in Large-Scale Graph [21.564894767364397]
Graph neural networks (GNNs) are deep learning models designed specifically for graph data.
We develop a node embedding compression method where each node is compactly represented with a bit vector instead of a floating-point vector (a toy sketch of this idea follows this entry).
We show that the proposed node embedding compression method achieves superior performance compared to the alternatives.
arXiv Detail & Related papers (2022-08-11T05:43:39Z)
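As a rough illustration of the bit-vector idea in the entry above (not the paper's actual hashing or training scheme; the class `HashedBitEmbedding` and its parameters are invented for this sketch), each node id is hashed into a short ±1 code and a single small shared layer decodes that code into a dense embedding, so no per-node floating-point row is ever stored:

```python
import torch
import torch.nn as nn


class HashedBitEmbedding(nn.Module):
    """Toy bit-vector node representation with a shared learned decoder."""

    def __init__(self, num_bits, emb_dim, seed=0):
        super().__init__()
        gen = torch.Generator().manual_seed(seed)
        # Random odd multipliers for a crude multiplicative hash of the node id.
        self.register_buffer(
            "multipliers",
            torch.randint(1, 2**31 - 1, (num_bits,), generator=gen) * 2 + 1,
        )
        # One small decoder shared by all nodes turns a code into a dense vector.
        self.decode = nn.Linear(num_bits, emb_dim)

    def forward(self, node_ids):
        # Hash each id into num_bits pseudo-random sign bits (+1 / -1).
        mixed = node_ids.long().unsqueeze(-1) * self.multipliers   # (batch, num_bits)
        code = ((mixed >> 16) & 1).float() * 2.0 - 1.0
        return self.decode(code)                                   # (batch, emb_dim)
```

The per-node storage drops from emb_dim floats to nothing (the code is recomputed from the id); the price is hash collisions, which the shared decoder and the downstream GNN have to absorb.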
- DOTIN: Dropping Task-Irrelevant Nodes for GNNs [119.17997089267124]
Recent graph learning approaches have introduced the pooling strategy to reduce the size of graphs for learning.
We design a new approach called DOTIN (Dropping Task-Irrelevant Nodes) to reduce the size of graphs.
Our method speeds up GAT by about 50% on graph-level tasks including graph classification and graph edit distance.
arXiv Detail & Related papers (2022-04-28T12:00:39Z)
- Boosting Graph Embedding on a Single GPU [3.093890460224435]
We present GOSH, a GPU-based tool for embedding large-scale graphs with minimum hardware constraints.
It employs a novel graph coarsening algorithm to enhance the impact of updates and minimize the work for embedding.
It also incorporates a decomposition schema that enables any arbitrarily large graph to be embedded with a single GPU.
arXiv Detail & Related papers (2021-10-19T15:25:04Z)
- GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings [51.82434518719011]
GNNAutoScale (GAS) is a framework for scaling arbitrary message-passing GNNs to large graphs.
GAS prunes entire sub-trees of the computation graph by utilizing historical embeddings from prior training iterations (a rough sketch of this caching idea appears after this entry).
GAS reaches state-of-the-art performance on large-scale graphs.
arXiv Detail & Related papers (2021-06-10T09:26:56Z)
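A rough sketch of the historical-embedding caching described in the entry above (the cache layout, the `layer_forward` helper, and the two-argument layer signature are simplifying assumptions, not the GNNAutoScale code): out-of-batch neighbors are read from a cache filled in earlier iterations instead of being recomputed, which is what prunes their subtrees from the computation graph.

```python
import torch


class HistoricalCache:
    """Per-layer cache of node embeddings computed in earlier training iterations."""

    def __init__(self, num_nodes, hidden_dim):
        self.store = torch.zeros(num_nodes, hidden_dim)   # lives in host memory

    def pull(self, node_ids):
        # Stale but cheap: these nodes' subtrees are not recomputed.
        return self.store[node_ids.cpu()]

    def push(self, node_ids, fresh_embeddings):
        self.store[node_ids.cpu()] = fresh_embeddings.detach().cpu()


def layer_forward(layer, cache, batch_nodes, neighbor_nodes, h_batch):
    # Fresh embeddings only for in-batch nodes; historical ones for out-of-batch neighbors.
    h_neighbors = cache.pull(neighbor_nodes).to(h_batch.device)
    h_out = layer(h_batch, h_neighbors)    # any message-passing layer with this signature
    cache.push(batch_nodes, h_out)         # refresh the cache for later iterations
    return h_out
```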
- VersaGNN: a Versatile accelerator for Graph neural networks [81.1667080640009]
We propose VersaGNN, an ultra-efficient, systolic-array-based versatile hardware accelerator.
VersaGNN achieves on average 3712x speedup with 1301.25x energy reduction on CPU, and 35.4x speedup with 17.66x energy reduction on GPU.
arXiv Detail & Related papers (2021-05-04T04:10:48Z)
- DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks [58.48833325238537]
Full-batch training on Graph Neural Networks (GNN) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we present DistGNN, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.