Graph Neural Network Training with Data Tiering
- URL: http://arxiv.org/abs/2111.05894v1
- Date: Wed, 10 Nov 2021 19:35:10 GMT
- Title: Graph Neural Network Training with Data Tiering
- Authors: Seung Won Min, Kun Wu, Mert Hidayetoğlu, Jinjun Xiong, Xiang Song, Wen-mei Hwu
- Abstract summary: Graph Neural Networks (GNNs) have shown success in learning from graph-structured data, with applications to fraud detection, recommendation, and knowledge graph reasoning.
However, training GNNs efficiently is challenging because 1) GPU memory capacity is limited and can be insufficient for large datasets, and 2) the graph-based data structure causes irregular data access patterns.
In this work, we provide a method to statistically analyze and identify the more frequently accessed data ahead of GNN training.
- Score: 16.02267628659034
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Neural Networks (GNNs) have shown success in learning from
graph-structured data, with applications to fraud detection, recommendation,
and knowledge graph reasoning. However, training GNNs efficiently is
challenging because: 1) GPU memory capacity is limited and can be insufficient
for large datasets, and 2) the graph-based data structure causes irregular
data access patterns. In this work, we provide a method to statistically
analyze and identify the more frequently accessed data ahead of GNN training.
Our data tiering method utilizes not only the structure of the input graph but
also insight gained from the actual GNN training process to achieve higher
prediction accuracy. On top of our data tiering method, we provide a new data
placement and access strategy to further minimize the CPU-GPU communication
overhead. We also take multi-GPU GNN training into account and demonstrate the
effectiveness of our strategy in a multi-GPU system. The evaluation results
show that our work reduces CPU-GPU traffic by 87-95% and improves GNN training
speed over existing solutions by 1.6-2.1x on graphs with hundreds of millions
of nodes and billions of edges.
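To make the placement idea concrete, here is a minimal PyTorch sketch of two-tier feature storage driven by a precomputed per-node access score. The function names, the `hot_ratio` parameter, and the use of a simple score tensor are illustrative assumptions, not the paper's exact method, which derives better scores from the graph structure and the actual training process:

```python
import torch

def tier_features(features, scores, hot_ratio=0.1, device="cuda"):
    """Split node features into a GPU-resident 'hot' tier and a pinned
    host-memory 'cold' tier, ranked by a per-node access score
    (e.g. out-degree as a crude proxy for sampling frequency)."""
    num_nodes = features.size(0)
    num_hot = int(num_nodes * hot_ratio)
    order = torch.argsort(scores, descending=True)
    hot_ids, cold_ids = order[:num_hot], order[num_hot:]
    hot = features[hot_ids].to(device)       # stays in GPU memory
    cold = features[cold_ids].pin_memory()   # fast DMA when fetched
    # Map original node id -> row inside its tier.
    loc = torch.empty(num_nodes, dtype=torch.long)
    loc[hot_ids] = torch.arange(num_hot)
    loc[cold_ids] = torch.arange(num_nodes - num_hot)
    is_hot = torch.zeros(num_nodes, dtype=torch.bool)
    is_hot[hot_ids] = True
    return hot, cold, loc, is_hot

def gather_batch(node_ids, hot, cold, loc, is_hot, device="cuda"):
    """Fetch features for a mini-batch; only the cold rows cross the
    CPU-GPU link (one gather plus one host-to-device copy per batch)."""
    sel = is_hot[node_ids]
    out = torch.empty(node_ids.numel(), hot.size(1), device=device)
    out[sel.to(device)] = hot[loc[node_ids[sel]].to(device)]
    out[(~sel).to(device)] = cold[loc[node_ids[~sel]]].to(device)
    return out
```

If the score concentrates accesses well, most mini-batch lookups hit the GPU tier, which is exactly where the reported CPU-GPU traffic reduction would come from.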
Related papers
- Graph Unlearning with Efficient Partial Retraining [28.433619085748447]
Graph Neural Networks (GNNs) have achieved remarkable success in various real-world applications.
GNNs may be trained on undesirable graph data, which can degrade their performance and reliability.
We propose GraphRevoker, a novel graph unlearning framework that better maintains the model utility of unlearned GNNs.
arXiv Detail & Related papers (2024-03-12T06:22:10Z)
- Unlearning Graph Classifiers with Limited Data Resources [39.29148804411811]
Controlled data removal is becoming an important feature of machine learning models for data-sensitive Web applications.
It is still largely unknown how to perform efficient machine unlearning of graph neural networks (GNNs).
Our main contribution is the first known nonlinear approximate graph unlearning method based on GSTs.
Our second contribution is a theoretical analysis of the computational complexity of the proposed unlearning mechanism.
Our third contribution is a set of extensive simulation results showing that, compared to complete retraining of GNNs after each removal request, the new GST-based approach offers, on average, a 10.38x speed-up.
arXiv Detail & Related papers (2022-11-06T20:46:50Z)
- Training Graph Neural Networks on Growing Stochastic Graphs [114.75710379125412]
Graph Neural Networks (GNNs) rely on graph convolutions to exploit meaningful patterns in networked data.
We propose to learn GNNs on very large graphs by leveraging the limit object of a sequence of growing graphs, the graphon.
arXiv Detail & Related papers (2022-10-27T16:00:45Z)
- SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization [23.609017952951454]
We propose SCARA, a scalable Graph Neural Network (GNN) with feature-oriented optimization for graph computation.
SCARA efficiently computes graph embedding from node features, and further selects and reuses feature results to reduce overhead.
SCARA completes precomputation on the largest available billion-scale GNN dataset, Papers100M (111M nodes, 1.6B edges), in 100 seconds.
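For intuition, here is a minimal sketch of feature-oriented precomputation: PPR-style propagation applied directly to the feature matrix, so no per-node random walks are needed at training time. The `alpha` and `hops` parameters and the SciPy sparse-matrix formulation are illustrative assumptions; SCARA's actual Feature-Push algorithm with feature reuse is more involved:

```python
import numpy as np
import scipy.sparse as sp

def precompute_embeddings(adj, X, alpha=0.15, hops=8):
    """Precompute smoothed embeddings
        Z = alpha * sum_{k=0..hops} (1-alpha)^k * P^k @ X,
    where P = D^{-1} A is the row-stochastic transition matrix and
    X is the (num_nodes, dim) float feature matrix."""
    deg = np.asarray(adj.sum(axis=1)).ravel()
    P = sp.diags(1.0 / np.maximum(deg, 1.0)) @ adj
    Z = alpha * X
    R = X.copy()                  # residual signal still being diffused
    for _ in range(hops):
        R = (1.0 - alpha) * (P @ R)
        Z += alpha * R
    return Z                      # feed Z to a simple downstream classifier
```

Because the propagation runs over feature columns rather than over nodes, its cost scales with the feature dimension instead of the number of training queries, which is what makes this style of precomputation attractive at billion-edge scale.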
arXiv Detail & Related papers (2022-07-19T10:32:11Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown a powerful capacity for modeling structured data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- BGL: GPU-Efficient GNN Training by Optimizing Graph Data I/O and Preprocessing [0.0]
Graph neural networks (GNNs) have extended the success of deep neural networks (DNNs) to non-Euclidean graph data.
Existing systems are inefficient at training large graphs with billions of nodes and edges on GPUs.
This paper proposes BGL, a distributed GNN training system designed to address the bottlenecks with a few key ideas.
arXiv Detail & Related papers (2021-12-16T00:37:37Z)
- Increase and Conquer: Training Graph Neural Networks on Growing Graphs [116.03137405192356]
We consider the problem of learning a graphon neural network (WNN) by training GNNs on graphs Bernoulli-sampled from the graphon.
Inspired by these results, we propose an algorithm to learn GNNs on large-scale graphs that, starting from a moderate number of nodes, successively increases the size of the graph during training.
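To illustrate the sampling step, here is a minimal sketch of drawing graphs of growing size from a graphon; the specific graphon `W(x, y) = exp(-3|x - y|)` and the function name are illustrative choices, not taken from the paper:

```python
import numpy as np

def sample_graph_from_graphon(W, n, rng=None):
    """Draw an n-node undirected graph from a graphon
    W: [0,1]^2 -> [0,1]. Each node i gets a latent coordinate
    u_i ~ Uniform[0,1]; edge (i, j) is included independently
    with probability W(u_i, u_j)."""
    rng = np.random.default_rng(rng)
    u = rng.uniform(size=n)                    # latent node positions
    probs = W(u[:, None], u[None, :])          # pairwise edge probabilities
    trials = rng.uniform(size=(n, n)) < probs  # Bernoulli trials
    adj = np.triu(trials, k=1)                 # keep one trial per pair
    return adj | adj.T                         # symmetrize

# A smooth "distance" graphon, sampled at two sizes to mimic growth.
W = lambda x, y: np.exp(-3.0 * np.abs(x - y))
A_small = sample_graph_from_graphon(W, 100, rng=0)
A_large = sample_graph_from_graphon(W, 1000, rng=0)
```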
arXiv Detail & Related papers (2021-06-07T15:05:59Z)
- GraphTheta: A Distributed Graph Neural Network Learning System With Flexible Training Strategy [5.466414428765544]
We present a new distributed graph learning system GraphTheta.
It supports multiple training strategies and enables efficient and scalable learning on big graphs.
This work represents the largest edge-attributed GNN learning task conducted on a billion-scale network in the literature.
arXiv Detail & Related papers (2021-04-21T14:51:33Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
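A minimal sketch of the joint-pruning idea follows, assuming simple magnitude-based masks over a dense adjacency matrix; UGS itself learns both masks during training, and the prune fractions here are illustrative:

```python
import torch

def magnitude_mask(t, prune_fraction):
    """Binary mask that zeros the smallest-magnitude entries of t."""
    k = int(t.numel() * prune_fraction)
    if k == 0:
        return torch.ones_like(t)
    threshold = t.abs().flatten().kthvalue(k).values
    return (t.abs() > threshold).float()

def sparsify_gnn(adj, weights, graph_frac=0.05, weight_frac=0.2):
    """One pruning round in the UGS spirit: mask the graph adjacency
    and the model weights simultaneously, yielding a (sub-dataset,
    sub-network) pair -- a candidate graph lottery ticket."""
    adj_mask = magnitude_mask(adj, graph_frac)
    masked_weights = [w * magnitude_mask(w, weight_frac) for w in weights]
    return adj * adj_mask, masked_weights
```

Iterating such rounds, with retraining and rewinding in between, is how lottery-ticket pipelines typically shrink both the graph and the network.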
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in the evaluated graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
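For flavor, here is a minimal sketch of the push-based approximate personalized PageRank computation that this family of methods builds on (Andersen-Chung-Lang style local push); the `alpha` and `eps` defaults are illustrative, and the code assumes every node has at least one neighbor:

```python
from collections import defaultdict

def approx_ppr(neighbors, source, alpha=0.15, eps=1e-4):
    """Approximate the personalized PageRank vector for `source`.
    `neighbors` maps node -> list of adjacent nodes."""
    p = defaultdict(float)                   # PPR estimates
    r = defaultdict(float, {source: 1.0})    # residual probability mass
    queue = [source]
    while queue:
        u = queue.pop()
        deg = len(neighbors[u])
        if r[u] < eps * deg:
            continue                         # already below tolerance
        mass, r[u] = r[u], 0.0
        p[u] += alpha * mass                 # settle a share at u ...
        share = (1.0 - alpha) * mass / deg
        for v in neighbors[u]:               # ... and push the rest out
            r[v] += share
            if r[v] >= eps * len(neighbors[v]):
                queue.append(v)
    return dict(p)

# Toy usage on a 4-node graph.
g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(approx_ppr(g, source=0))
```

The push only touches nodes near the source, so the cost is independent of the total graph size, which is what makes the approach trivially parallelizable across nodes.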
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
- GPT-GNN: Generative Pre-Training of Graph Neural Networks [93.35945182085948]
Graph neural networks (GNNs) have been demonstrated to be powerful in modeling graph-structured data.
We present the GPT-GNN framework to initialize GNNs by generative pre-training.
We show that GPT-GNN significantly outperforms state-of-the-art GNN models without pre-training by up to 9.1% across various downstream tasks.
arXiv Detail & Related papers (2020-06-27T20:12:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.