Accelerating Graph Sampling for Graph Machine Learning using GPUs
- URL: http://arxiv.org/abs/2009.06693v4
- Date: Tue, 11 May 2021 00:57:02 GMT
- Title: Accelerating Graph Sampling for Graph Machine Learning using GPUs
- Authors: Abhinav Jangda, Sandeep Polisetty, Arjun Guha, Marco Serafini
- Abstract summary: NextDoor is a system designed to perform graph sampling on GPU resources.
NextDoor employs a new approach to graph sampling that we call transit-parallelism.
We implement several graph sampling applications, and show that NextDoor runs them orders of magnitude faster than existing systems.
- Score: 2.9383911860380127
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Representation learning algorithms automatically learn the features of data.
Several representation learning algorithms for graph data, such as DeepWalk,
node2vec, and GraphSAGE, sample the graph to produce mini-batches that are
suitable for training a DNN. However, sampling time can be a significant
fraction of training time, and existing systems do not efficiently parallelize
sampling.
Sampling is an embarrassingly parallel problem and may appear to lend itself
to GPU acceleration, but the irregularity of graphs makes it hard to use GPU
resources effectively. This paper presents NextDoor, a system designed to
effectively perform graph sampling on GPUs. NextDoor employs a new approach to
graph sampling that we call transit-parallelism, which allows load balancing
and caching of edges. NextDoor provides end-users with a high-level abstraction
for writing a variety of graph sampling algorithms. We implement several graph
sampling applications, and show that NextDoor runs them orders of magnitude
faster than existing systems.
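For context, the neighborhood sampling these systems accelerate is easy to state on the CPU; the sketch below shows uniform GraphSAGE-style fanout sampling over an adjacency-list graph. It is a plain Python illustration of the workload only, not NextDoor's transit-parallel GPU kernels or its user-facing abstraction.

```python
import numpy as np

def sample_neighborhoods(adj, seeds, fanout, rng):
    """Uniformly sample up to `fanout` neighbors of each seed node
    (GraphSAGE-style one-hop sampling for one mini-batch)."""
    batch = {}
    for v in seeds:
        nbrs = adj[v]
        if len(nbrs) > fanout:
            nbrs = rng.choice(nbrs, size=fanout, replace=False)
        batch[int(v)] = [int(u) for u in nbrs]
    return batch

# Toy adjacency-list graph; draw mini-batches of two seed nodes each.
adj = {0: [1, 2, 3, 4], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2], 4: [0]}
rng = np.random.default_rng(0)
seeds = rng.permutation(list(adj))
for i in range(0, len(seeds), 2):
    print(sample_neighborhoods(adj, seeds[i:i + 2], fanout=3, rng=rng))
```

Repeated for every mini-batch and every hop on a large graph, this loop is the part of training that the abstract identifies as a significant fraction of total time, which is what motivates moving it to the GPU.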
Related papers
- GNNFlow: A Distributed Framework for Continuous Temporal GNN Learning on
Dynamic Graphs [11.302970701867844]
We introduce GNNFlow, a distributed framework for efficient continuous temporal graph representation learning.
GNNFlow supports distributed training across multiple machines with static scheduling to ensure load balance.
Our experimental results show that GNNFlow provides up to 21.1x faster continuous learning than existing systems.
arXiv Detail & Related papers (2023-11-29T07:30:32Z)
- Distributed Matrix-Based Sampling for Graph Neural Network Training [0.0]
We propose a matrix-based bulk sampling approach that expresses sampling as a sparse matrix multiplication (SpGEMM) and samples multiple minibatches at once.
When the input graph topology does not fit on a single device, our method distributes the graph and uses communication-avoiding SpGEMM algorithms to scale GNN minibatch sampling.
In addition to new methods for sampling, we introduce a pipeline that uses our matrix-based bulk sampling approach to provide end-to-end training results.
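To make the "sampling as SpGEMM" idea concrete, the hedged sketch below encodes the seed nodes of several mini-batches as a sparse selection matrix S and multiplies it with the adjacency matrix A, so one sparse product expands all batches at once. This is a simplified illustration under stated assumptions (uniform fanout restriction and the distributed, communication-avoiding pieces are omitted), not the paper's implementation.

```python
import numpy as np
import scipy.sparse as sp

def bulk_sample(A, seed_batches):
    """One round of neighbor expansion for several mini-batches, expressed as
    a single sparse product S @ A, where S selects each batch's seed nodes."""
    rows, cols = [], []
    for b, seeds in enumerate(seed_batches):
        for v in seeds:
            rows.append(b)
            cols.append(v)
    S = sp.csr_matrix((np.ones(len(rows)), (rows, cols)),
                      shape=(len(seed_batches), A.shape[0]))
    # Row b of the product marks the union of neighbors of batch b's seeds;
    # a random fanout restriction could be imposed by sparsifying A first.
    return S @ A

# Toy 6-node adjacency in CSR form.
A = sp.random(6, 6, density=0.4, format="csr", random_state=0)
A.data[:] = 1.0
frontier = bulk_sample(A, seed_batches=[[0, 1], [2, 3], [4, 5]])
print(frontier.toarray())
```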
arXiv Detail & Related papers (2023-11-06T06:40:43Z)
- SimTeG: A Frustratingly Simple Approach Improves Textual Graph Learning [131.04781590452308]
We present SimTeG, a frustratingly Simple approach for Textual Graph learning.
We first perform supervised parameter-efficient fine-tuning (PEFT) on a pre-trained LM on the downstream task.
We then generate node embeddings using the last hidden states of finetuned LM.
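Assuming the PEFT step has already produced a fine-tuned LM, the embedding-extraction step can be sketched with Hugging Face transformers as below. The checkpoint name is a small placeholder chosen only to keep the example runnable, and mean pooling of the last hidden states is one reasonable reading of the summary, not necessarily SimTeG's exact pooling.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Small public checkpoint used purely for illustration; SimTeG would first run
# parameter-efficient fine-tuning (e.g. LoRA) on the downstream task.
name = "prajjwal1/bert-tiny"
tok = AutoTokenizer.from_pretrained(name)
lm = AutoModel.from_pretrained(name).eval()

node_texts = ["Paper about graph sampling on GPUs.",
              "Paper about distributed GNN training."]

with torch.no_grad():
    batch = tok(node_texts, padding=True, truncation=True, return_tensors="pt")
    out = lm(**batch)
    # Mean-pool the last hidden states (masking padding) to get one vector per
    # node; these vectors then serve as node features for a downstream GNN.
    mask = batch["attention_mask"].unsqueeze(-1).float()
    node_emb = (out.last_hidden_state * mask).sum(1) / mask.sum(1)

print(node_emb.shape)  # (num_nodes, hidden_dim)
```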
arXiv Detail & Related papers (2023-08-03T07:00:04Z)
- Quiver: Supporting GPUs for Low-Latency, High-Throughput GNN Serving with Workload Awareness [4.8412870364335925]
Quiver is a distributed GPU-based GNN serving system with low latency and high throughput.
We show that Quiver achieves up to 35 times lower latency and 8 times higher throughput compared to state-of-the-art GNN serving approaches.
arXiv Detail & Related papers (2023-05-18T10:34:23Z)
- Scaling R-GCN Training with Graph Summarization [71.06855946732296]
Training of Relational Graph Convolutional Networks (R-GCN) does not scale well with the size of the graph.
In this work, we experiment with the use of graph summarization techniques to compress the graph.
We obtain reasonable results on the AIFB, MUTAG and AM datasets.
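As a rough illustration of what graph summarization can mean here, the sketch below merges nodes with identical neighbor sets into super-nodes and rebuilds the adjacency between them. This is one simple summarization scheme, shown only to convey the idea of compressing the graph before training; it is not necessarily the technique used in the paper.

```python
from collections import defaultdict

def summarize_by_neighborhood(adj):
    # Group nodes that share an identical neighbor set into super-nodes.
    groups = defaultdict(list)
    for v, nbrs in adj.items():
        groups[frozenset(nbrs)].append(v)
    node2super = {v: sid for sid, members in enumerate(groups.values())
                  for v in members}
    # Rebuild adjacency between super-nodes on the compressed graph.
    summary = defaultdict(set)
    for v, nbrs in adj.items():
        for u in nbrs:
            summary[node2super[v]].add(node2super[u])
    return node2super, dict(summary)

# Nodes 0 and 3 share neighbors {1, 2}; nodes 1, 2, 4 share neighbor {0}.
adj = {0: {1, 2}, 1: {0}, 2: {0}, 3: {1, 2}, 4: {0}}
print(summarize_by_neighborhood(adj))  # 5 nodes collapse into 2 super-nodes
```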
arXiv Detail & Related papers (2022-03-05T00:28:43Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing [0.0]
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data.
Existing systems are inefficient at training large graphs with billions of nodes and edges on GPUs.
This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas.
arXiv Detail & Related papers (2021-12-16T00:37:37Z)
- Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining [58.10436813430554]
Mini-batch training of graph neural networks (GNNs) requires a lot of computation and data movement.
We argue in favor of performing mini-batch training with neighborhood sampling in a distributed multi-GPU environment.
We present a sequence of improvements to mitigate these bottlenecks, including a performance-engineered neighborhood sampler.
We also conduct an empirical analysis that supports the use of sampling for inference, showing that test accuracies are not materially compromised.
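One common way to hide sampling cost, in the spirit of the pipelining described above, is to overlap a CPU-side sampler with training through a bounded prefetch queue. The sketch below is schematic (sleeps stand in for real sampling and GPU work) and is not the paper's implementation.

```python
import queue
import threading
import time

def sampler(work_q, num_batches):
    # Producer: sample mini-batches on the CPU while training runs elsewhere.
    for i in range(num_batches):
        time.sleep(0.01)            # stand-in for neighborhood sampling
        work_q.put(f"batch-{i}")
    work_q.put(None)                # sentinel: no more batches

def trainer(work_q):
    # Consumer: run one training step per pre-sampled batch.
    while (batch := work_q.get()) is not None:
        time.sleep(0.02)            # stand-in for the GPU forward/backward pass
        print("trained on", batch)

q = queue.Queue(maxsize=4)          # bounded queue size = prefetch depth
threading.Thread(target=sampler, args=(q, 8), daemon=True).start()
trainer(q)
```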
arXiv Detail & Related papers (2021-10-16T02:41:35Z)
- Accurate, Efficient and Scalable Training of Graph Neural Networks [9.569918335816963]
Graph Neural Networks (GNNs) are powerful deep learning models to generate node embeddings on graphs.
It is still challenging to perform training in an efficient and scalable way.
We propose a novel parallel training framework that reduces training workload by orders of magnitude compared with state-of-the-art minibatch methods.
arXiv Detail & Related papers (2020-10-05T22:06:23Z)
- Graph Pooling with Node Proximity for Hierarchical Representation Learning [80.62181998314547]
We propose a novel graph pooling strategy that leverages node proximity to improve the hierarchical representation learning of graph data with their multi-hop topology.
Results show that the proposed graph pooling strategy is able to achieve state-of-the-art performance on a collection of public graph classification benchmark datasets.
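A generic way to read "pooling by node proximity" is to score nodes by multi-hop proximity (powers of the normalized adjacency) and keep the top-scoring fraction together with the induced subgraph, as in the hedged sketch below. The paper's actual method may define and use proximity differently; this is only an illustration of the general idea.

```python
import numpy as np

def proximity_topk_pool(A, X, ratio=0.5, hops=2):
    # Multi-hop proximity: sum of powers of the degree-normalized adjacency.
    n = A.shape[0]
    A_hat = A + np.eye(n)                        # add self-loops
    P = A_hat / A_hat.sum(axis=1, keepdims=True)
    prox = sum(np.linalg.matrix_power(P, h) for h in range(1, hops + 1))
    # Score each node by its aggregate proximity to all other nodes,
    # then keep the top `ratio` fraction of nodes and the induced subgraph.
    scores = prox.sum(axis=0)
    keep = np.argsort(-scores)[: max(1, int(ratio * n))]
    return A[np.ix_(keep, keep)], X[keep], keep

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.eye(4)                                    # toy node features
A_pool, X_pool, kept = proximity_topk_pool(A, X)
print(kept, A_pool.shape, X_pool.shape)
```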
arXiv Detail & Related papers (2020-06-19T13:09:44Z)
- Graph Ordering: Towards the Optimal by Learning [69.72656588714155]
Graph representation learning has achieved remarkable success in many graph-based applications, such as node classification, link prediction, and community detection.
However, some kinds of graph applications, such as graph compression and edge partitioning, are very hard to reduce to graph representation learning tasks.
In this paper, we propose to attack the graph ordering problem behind such applications by a novel learning approach.
arXiv Detail & Related papers (2020-01-18T09:14:16Z)