TGL: A General Framework for Temporal GNN Training on Billion-Scale
Graphs
- URL: http://arxiv.org/abs/2203.14883v1
- Date: Mon, 28 Mar 2022 16:41:18 GMT
- Title: TGL: A General Framework for Temporal GNN Training on Billion-Scale
Graphs
- Authors: Hongkuan Zhou, Da Zheng, Israt Nisa, Vasileios Ioannidis, Xiang Song,
George Karypis
- Abstract summary: We propose TGL, a unified framework for large-scale offline Temporal Graph Neural Network training.
TGL comprises five main components: a temporal sampler, a mailbox, a node memory module, a memory updater, and a message passing engine.
We introduce two large-scale real-world datasets with 0.2 and 1.3 billion temporal edges.
- Score: 17.264420882897017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many real-world graphs contain time-domain information. Temporal Graph Neural
Networks capture temporal information as well as structural and contextual
information in the generated dynamic node embeddings. Researchers have shown
that these embeddings achieve state-of-the-art performance in many different
tasks. In this work, we propose TGL, a unified framework for large-scale
offline Temporal Graph Neural Network training where users can compose various
Temporal Graph Neural Networks with simple configuration files. TGL comprises
five main components: a temporal sampler, a mailbox, a node memory module, a
memory updater, and a message passing engine. We design a Temporal-CSR data
structure and a parallel sampler to efficiently sample temporal neighbors to
form training mini-batches. We propose a novel random chunk scheduling technique
that mitigates the problem of obsolete node memory when training with a large
batch size. To address the limitations of current TGNNs only being evaluated on
small-scale datasets, we introduce two large-scale real-world datasets with 0.2
and 1.3 billion temporal edges. We evaluate the performance of TGL on four
small-scale datasets with a single GPU and the two large datasets with multiple
GPUs for both link prediction and node classification tasks. We compare TGL
with the open-sourced code of five methods and show that TGL achieves similar
or better accuracy with an average of 13x speedup. Our temporal parallel
sampler achieves an average of 173x speedup on a multi-core CPU compared with
the baselines. On a 4-GPU machine, TGL can train one epoch of more than one
billion temporal edges within 1-10 hours. To the best of our knowledge, this is
the first work that proposes a general framework for large-scale Temporal Graph
Neural Network training on multiple GPUs.
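The abstract names a Temporal-CSR data structure, a parallel temporal sampler, and a random chunk scheduling technique. The following minimal Python sketch illustrates how such pieces could fit together; the names (`TemporalCSR`, `sample_recent_neighbors`, `random_chunk_order`) and the details are our own assumptions based only on the abstract, not TGL's actual API or the paper's exact algorithms.

```python
import bisect
import random
from dataclasses import dataclass
from typing import List

@dataclass
class TemporalCSR:
    """Illustrative Temporal-CSR layout (assumed, not TGL's implementation):
    standard CSR arrays, with each node's edge slice additionally sorted by
    timestamp so temporal neighbor lookups reduce to a binary search."""
    indptr: List[int]        # indptr[v]..indptr[v+1] delimits node v's edges
    indices: List[int]       # destination node of each edge
    timestamps: List[float]  # edge timestamps, sorted within each node's slice
    edge_ids: List[int]      # original edge ids, for fetching edge features

def sample_recent_neighbors(g: TemporalCSR, v: int, t: float, k: int):
    """Return up to k most recent neighbors of v with timestamp strictly before t."""
    start, end = g.indptr[v], g.indptr[v + 1]
    # Binary search inside v's slice for the first edge at or after time t.
    cut = start + bisect.bisect_left(g.timestamps[start:end], t)
    lo = max(start, cut - k)
    return [(g.indices[i], g.timestamps[i], g.edge_ids[i]) for i in range(lo, cut)]

def random_chunk_order(num_edges: int, chunk_size: int, seed: int = 0):
    """Loose sketch of random chunk scheduling: split the chronologically
    ordered edge stream into contiguous chunks and shuffle the chunk order,
    so that node memory is refreshed more frequently than it would be with
    one large strictly sequential batch."""
    chunks = [range(s, min(s + chunk_size, num_edges))
              for s in range(0, num_edges, chunk_size)]
    random.Random(seed).shuffle(chunks)
    return chunks
```

As a usage example, `sample_recent_neighbors(g, v=5, t=1000.0, k=10)` would return the ten most recent edges of node 5 with timestamps before 1000.0. Sorting each node's edges by timestamp is what makes this lookup a binary search plus a slice, which is the property a parallel temporal sampler can exploit.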
Related papers
- FastGL: A GPU-Efficient Framework for Accelerating Sampling-Based GNN Training at Large Scale [29.272368697268433]
Graph Neural Networks (GNNs) have shown clear advantages on non-Euclidean graph data.
We propose FastGL, a GPU-efficient framework for accelerating sampling-based GNN training at large scale.
FastGL can achieve an average speedup of 11.8x, 2.2x and 1.5x over the state-of-the-art frameworks PyG, DGL, and GNNLab, respectively.
arXiv Detail & Related papers (2024-09-23T11:45:47Z)
- Graph Transformers for Large Graphs [57.19338459218758]
This work advances representation learning on single large-scale graphs with a focus on identifying model characteristics and critical design constraints.
A key innovation of this work lies in the creation of a fast neighborhood sampling technique coupled with a local attention mechanism.
We report a 3x speedup and 16.8% performance gain on ogbn-products and snap-patents, while we also scale LargeGT on ogbn-100M with a 5.9% performance improvement.
arXiv Detail & Related papers (2023-12-18T11:19:23Z)
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- DistTGL: Distributed Memory-Based Temporal Graph Neural Network Training [18.52206409432894]
DistTGL is an efficient and scalable solution to train memory-based TGNNs on distributed GPU clusters.
In experiments, DistTGL achieves near-linear convergence speedup, outperforming the state-of-the-art single-machine method by 14.5% in accuracy and 10.17x in training throughput.
arXiv Detail & Related papers (2023-07-14T22:52:27Z)
- Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown a powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing [0.0]
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data.
Existing systems are inefficient at training large graphs with billions of nodes and edges on GPUs.
This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas.
arXiv Detail & Related papers (2021-12-16T00:37:37Z)
- Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining [58.10436813430554]
Mini-batch training of graph neural networks (GNNs) requires a lot of computation and data movement.
We argue in favor of performing mini-batch training with neighborhood sampling in a distributed multi-GPU environment.
We present a sequence of improvements to mitigate these bottlenecks, including a performance-engineered neighborhood sampler.
We also conduct an empirical analysis that supports the use of sampling for inference, showing that test accuracies are not materially compromised.
arXiv Detail & Related papers (2021-10-16T02:41:35Z)
- DistGNN: Scalable Distributed Training for Large-Scale Graph Neural Networks [58.48833325238537]
Full-batch training of Graph Neural Networks (GNNs) to learn the structure of large graphs is a critical problem that needs to scale to hundreds of compute nodes to be feasible.
In this paper, we present DistGNN, which optimizes the well-known Deep Graph Library (DGL) for full-batch training on CPU clusters.
Our results on four common GNN benchmark datasets show up to 3.7x speed-up using a single CPU socket and up to 97x speed-up using 128 CPU sockets.
arXiv Detail & Related papers (2021-04-14T08:46:35Z)
- DistDGL: Distributed Graph Neural Network Training for Billion-Scale Graphs [22.63888380481248]
DistDGL is a system for training GNNs in a mini-batch fashion on a cluster of machines.
It is based on the Deep Graph Library (DGL), a popular GNN development framework.
Our results show that DistDGL achieves linear speedup without compromising model accuracy.
arXiv Detail & Related papers (2020-10-11T20:22:26Z)