Scaling Graph Neural Networks with Approximate PageRank
- URL: http://arxiv.org/abs/2007.01570v2
- Date: Tue, 5 Apr 2022 12:41:35 GMT
- Title: Scaling Graph Neural Networks with Approximate PageRank
- Authors: Aleksandar Bojchevski, Johannes Gasteiger, Bryan Perozzi, Amol Kapoor,
Martin Blais, Benedek Rózemberczki, Michal Lukasik, Stephan Günnemann
- Abstract summary: We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes in our new benchmark graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
- Score: 64.92311737049054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph neural networks (GNNs) have emerged as a powerful approach for solving
many network mining tasks. However, learning on large graphs remains a
challenge - many recently proposed scalable GNN approaches rely on an expensive
message-passing procedure to propagate information through the graph. We
present the PPRGo model which utilizes an efficient approximation of
information diffusion in GNNs resulting in significant speed gains while
maintaining state-of-the-art prediction performance. In addition to being
faster, PPRGo is inherently scalable, and can be trivially parallelized for
large datasets like those found in industry settings. We demonstrate that PPRGo
outperforms baselines in both distributed and single-machine training
environments on a number of commonly used academic graphs. To better analyze
the scalability of large-scale graph learning methods, we introduce a novel
benchmark graph with 12.4 million nodes, 173 million edges, and 2.8 million
node features. We show that training PPRGo from scratch and predicting labels
for all nodes in this graph takes under 2 minutes on a single machine, far
outpacing other baselines on the same graph. We discuss the practical
application of PPRGo to solve large-scale node classification problems at
Google.
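To make the mechanism concrete, here is a minimal, illustrative sketch of the two ingredients the abstract describes: a push-based approximation of personalized PageRank, and prediction by combining per-node logits with truncated PPR weights. The function names, the `alpha`/`eps`/`topk` values, and the stand-in `H` (per-node logits from any local model, e.g. an MLP over node features) are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def approx_ppr_push(indptr, indices, deg, seed, alpha=0.25, eps=1e-4):
    """Forward-push approximation of the personalized PageRank vector of
    `seed` on a CSR graph (indptr/indices), kept sparse as a dict.
    Assumes no isolated nodes (deg[u] > 0)."""
    p, r, queue = {}, {seed: 1.0}, [seed]
    while queue:
        u = queue.pop()
        res = r.get(u, 0.0)
        if res < eps * deg[u]:
            continue                      # residual too small to push
        p[u] = p.get(u, 0.0) + alpha * res
        r[u] = 0.0
        spread = (1.0 - alpha) * res / deg[u]
        for v in indices[indptr[u]:indptr[u + 1]]:
            r[v] = r.get(v, 0.0) + spread
            if r[v] >= eps * deg[v]:
                queue.append(v)
    return p

def pprgo_style_predict(ppr, H, topk=32):
    """Combine per-node logits H of shape (n, c) using the top-k PPR
    neighbors, replacing explicit message passing with one weighted sum."""
    top = sorted(ppr.items(), key=lambda kv: -kv[1])[:topk]
    idx = np.array([i for i, _ in top])
    w = np.array([s for _, s in top])
    return (w / w.sum()) @ H[idx]
```

Because each node's approximate PPR vector is computed independently, the precomputation parallelizes trivially across nodes, which is the property the abstract credits for PPRGo's scalability.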
Related papers
- Learning on Large Graphs using Intersecting Communities [13.053266613831447]
MPNNs iteratively update each node's representation in an input graph by aggregating messages from the node's neighbors.
MPNNs can quickly become prohibitively expensive on large graphs unless those graphs are very sparse.
We propose approximating the input graph as an intersecting community graph (ICG) -- a combination of intersecting cliques (an aggregation sketch follows this entry).
arXiv Detail & Related papers (2024-05-31T09:26:26Z)
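As a hedged illustration of the ICG idea above: if the adjacency matrix is approximated by K weighted intersecting communities, A ≈ C diag(r) Cᵀ with a membership matrix C, then one aggregation step can be computed in time linear in the number of nodes rather than edges. This factorized form is an illustrative reading of "a combination of intersecting cliques", not the paper's exact estimator.

```python
import numpy as np

def icg_aggregate(C, r, X):
    """One aggregation step on an intersecting community graph:
    (C diag(r) C^T) X evaluated right-to-left in O(n*K*d) time,
    never materializing the dense n x n adjacency approximation.
    C: (n, K) community membership, r: (K,) weights, X: (n, d) features."""
    return C @ (r[:, None] * (C.T @ X))
```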
- Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs).
This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings.
Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
arXiv Detail & Related papers (2024-05-27T17:52:12Z)
- Improving Graph Neural Networks at Scale: Combining Approximate PageRank and CoreRank [20.948992161528466]
We propose a scalable solution to propagate information in Graph Neural Networks (GNNs).
The CorePPR model uses a learnable convex combination of the approximate personalised PageRank and the CoreRank to diffuse multi-hop neighbourhood information in GNNs.
We demonstrate that CorePPR outperforms PPRGo on large graphs, where selecting the most influential nodes is particularly relevant for scalability (a short sketch follows this entry).
arXiv Detail & Related papers (2022-11-08T13:51:49Z)
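A minimal sketch of the convex combination the CorePPR summary describes, assuming precomputed sparse diffusion matrices `P_ppr` (approximate personalized PageRank) and `P_core` (a CoreRank-based alternative); the sigmoid parameterization of the mixing weight is an assumption here, chosen to keep the combination convex while remaining learnable.

```python
import numpy as np

def coreppr_diffuse(P_ppr, P_core, H, theta=0.0):
    """Diffuse per-node logits H with a convex combination of two
    propagation matrices; `theta` would be learned with the model."""
    lam = 1.0 / (1.0 + np.exp(-theta))   # sigmoid -> lam in (0, 1)
    return lam * (P_ppr @ H) + (1.0 - lam) * (P_core @ H)
```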
- Training Graph Neural Networks on Growing Stochastic Graphs [114.75710379125412]
Graph Neural Networks (GNNs) rely on graph convolutions to exploit meaningful patterns in networked data.
We propose to learn GNNs on very large graphs by leveraging the limit object of a sequence of growing graphs, the graphon.
arXiv Detail & Related papers (2022-10-27T16:00:45Z)
- SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization [23.609017952951454]
We propose SCARA, a scalable Graph Neural Network (GNN) with feature-oriented optimization for graph computation.
SCARA efficiently computes graph embedding from node features, and further selects and reuses feature results to reduce overhead.
SCARA completes precomputation on Papers100M (111M nodes, 1.6B edges), the largest available billion-scale GNN dataset, within 100 seconds.
arXiv Detail & Related papers (2022-07-19T10:32:11Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown a powerful capacity for modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- RaWaNet: Enriching Graph Neural Network Input via Random Walks on Graphs [0.0]
Graph neural networks (GNNs) have gained increasing popularity and have shown very promising results for data that are represented by graphs.
We propose a random-walk data processing of the graphs based on three selected lengths: (regular) walks of length 1 and 2, and a fractional walk of length $\gamma \in (0,1)$, to capture the different local and global dynamics on the graphs (a sketch follows this entry).
We test our method on various molecular datasets by passing the processed node features to the network in order to perform several classification and regression tasks.
arXiv Detail & Related papers (2021-09-15T20:04:01Z)
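One plausible reading of the three walk lengths above, sketched with dense linear algebra for clarity: 1-step and 2-step aggregations of the normalized adjacency, plus a fractional power taken on the positive semi-definite normalized Laplacian so that a real $\gamma$-th power exists. The paper's exact fractional-walk operator may differ; this is an assumption for illustration.

```python
import numpy as np

def rawanet_style_features(A, X, gamma=0.5):
    """Augment features X with 1-step, 2-step, and fractional-step walk
    aggregations (dense, illustration only; assumes no isolated nodes)."""
    d = A.sum(axis=1)
    A_hat = A / np.sqrt(np.outer(d, d))   # symmetric normalization
    L = np.eye(A.shape[0]) - A_hat        # normalized Laplacian, PSD
    w, V = np.linalg.eigh(L)
    L_gamma = V @ np.diag(np.clip(w, 0.0, None) ** gamma) @ V.T
    return np.hstack([A_hat @ X, A_hat @ (A_hat @ X), L_gamma @ X])
```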
- GraphTheta: A Distributed Graph Neural Network Learning System With Flexible Training Strategy [5.466414428765544]
We present a new distributed graph learning system GraphTheta.
It supports multiple training strategies and enables efficient and scalable learning on big graphs.
This work represents the largest edge-attributed GNN learning task conducted on a billion-scale network in the literature.
arXiv Detail & Related papers (2021-04-21T14:51:33Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network (a pruning sketch follows this entry).
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
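A hedged sketch of the joint pruning the UGS summary describes: magnitude-based binary masks applied to both a (dense, for illustration) adjacency matrix and a weight matrix. The paper learns its masks end-to-end, so simple threshold pruning here is a simplification.

```python
import numpy as np

def magnitude_mask(M, sparsity):
    """Binary mask zeroing out the `sparsity` fraction of entries of M
    with the smallest magnitude (one simple pruning criterion)."""
    k = int(M.size * sparsity)
    if k == 0:
        return np.ones_like(M)
    thresh = np.partition(np.abs(M).ravel(), k - 1)[k - 1]
    return (np.abs(M) > thresh).astype(M.dtype)

# A "graph lottery ticket" pairs the two pruned objects, e.g.:
# A_sparse = magnitude_mask(A, 0.2) * A   # pruned graph adjacency
# W_sparse = magnitude_mask(W, 0.5) * W   # pruned model weights
```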
- Distance Encoding: Design Provably More Powerful Neural Networks for Graph Representation Learning [63.97983530843762]
Graph Neural Networks (GNNs) have achieved great success in graph representation learning.
GNNs generate identical representations for graph substructures that may in fact be very different.
More powerful GNNs, proposed recently by mimicking higher-order Weisfeiler-Lehman tests, are inefficient as they cannot exploit the sparsity of the underlying graph structure.
We propose Distance Encoding (DE), a new class of structural features for graph representation learning (a minimal sketch follows this entry).
arXiv Detail & Related papers (2020-08-31T23:15:40Z)
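To illustrate the DE idea in the last entry, here is a minimal sketch of one common instantiation: shortest-path distances from a target node set, capped and one-hot encoded as extra node features. DE in the paper covers a broader family of distance-based features (e.g., landing probabilities), so this is one assumed special case.

```python
import numpy as np
from collections import deque

def distance_encoding(indptr, indices, targets, n, max_dist=4):
    """BFS distances from the target node set on a CSR graph, capped at
    max_dist and one-hot encoded; rows are per-node extra features."""
    dist = np.full(n, max_dist, dtype=int)   # far/unreachable stay capped
    dq = deque(targets)
    for t in targets:
        dist[t] = 0
    while dq:
        u = dq.popleft()
        for v in indices[indptr[u]:indptr[u + 1]]:
            if dist[u] + 1 < max_dist and dist[v] > dist[u] + 1:
                dist[v] = dist[u] + 1
                dq.append(v)
    return np.eye(max_dist + 1)[dist]        # one-hot distance features
```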
This list is automatically generated from the titles and abstracts of the papers on this site.