SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization
- URL: http://arxiv.org/abs/2207.09179v1
- Date: Tue, 19 Jul 2022 10:32:11 GMT
- Title: SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization
- Authors: Ningyi Liao, Dingheng Mo, Siqiang Luo, Xiang Li, Pengcheng Yin
- Abstract summary: We propose SCARA, a scalable Graph Neural Network (GNN) with feature-oriented optimization for graph computation.
SCARA efficiently computes graph embeddings from node features, and further selects and reuses feature computation results to reduce overhead.
SCARA completes precomputation on the largest available billion-scale GNN dataset, Papers100M (111M nodes, 1.6B edges), in 100 seconds.
- Score: 23.609017952951454
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advances in data processing have stimulated the demand for learning
graphs of very large scales. Graph Neural Networks (GNNs), being an emerging
and powerful approach in solving graph learning tasks, are known to be
difficult to scale up. Most scalable models apply node-based techniques to
simplify the expensive message-passing propagation procedure of GNNs.
However, we find such acceleration insufficient when applied to million- or
even billion-scale graphs. In this work, we propose SCARA, a scalable GNN with
feature-oriented optimization for graph computation. SCARA efficiently
computes graph embeddings from node features, and further selects and reuses
feature computation results to reduce overhead. Theoretical analysis
indicates that our model achieves sub-linear time complexity with guaranteed
precision in the propagation process as well as in GNN training and
inference. We conduct extensive
experiments on various datasets to evaluate the efficacy and efficiency of
SCARA. Performance comparison with baselines shows that SCARA achieves up to
100x faster graph propagation than current state-of-the-art methods, with
fast convergence and comparable accuracy. Most notably, SCARA completes
precomputation on the largest available billion-scale GNN dataset,
Papers100M (111M nodes, 1.6B edges), in 100 seconds.
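As a rough illustration of the feature-oriented view, the sketch below pushes each feature column through the normalized adjacency with PPR-style decay weights. The column-wise loop is the orientation the abstract describes; the specific weights and the scipy-based implementation are assumptions, not the authors' code, and the feature-reuse step is omitted.

```python
# A minimal sketch, not the authors' implementation: propagate each feature
# column through the symmetrically normalized adjacency with assumed
# PPR-style decay weights alpha * (1 - alpha)**k.
import numpy as np
import scipy.sparse as sp

def feature_push(adj: sp.csr_matrix, X: np.ndarray,
                 alpha: float = 0.15, hops: int = 10) -> np.ndarray:
    deg = np.asarray(adj.sum(axis=1)).ravel()
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1.0))
    A_hat = sp.diags(d_inv_sqrt) @ adj @ sp.diags(d_inv_sqrt)
    Z = np.zeros(X.shape)
    for f in range(X.shape[1]):           # one push per feature column
        r = X[:, f].astype(float)         # residual starts as the raw feature
        for k in range(hops):
            Z[:, f] += alpha * (1 - alpha) ** k * r
            r = A_hat @ r                 # push the residual one hop further
    return Z
```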
Related papers
- Faster Inference Time for GNNs using coarsening [1.323700980948722]
Coarsening-based methods reduce the graph to a smaller one, resulting in faster computation.
No previous research has tackled the computation cost during inference.
This paper presents a novel approach to improve the scalability of GNNs through subgraph-based techniques.
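As a generic illustration of the coarsening idea (not this paper's specific construction), the sketch below collapses nodes into super-nodes given some cluster assignment, e.g. from METIS or k-means; the mean-pooled features and unit edge weights are assumptions.

```python
# A minimal coarsening sketch: build the smaller graph a GNN would then
# train and infer on. Illustrative only.
import numpy as np
import scipy.sparse as sp

def coarsen(adj: sp.csr_matrix, X: np.ndarray, clusters: np.ndarray):
    n, c = adj.shape[0], int(clusters.max()) + 1
    # P[i, clusters[i]] = 1: maps each node to its super-node
    P = sp.csr_matrix((np.ones(n), (np.arange(n), clusters)), shape=(n, c))
    adj_c = P.T @ adj @ P        # coarse adjacency (c x c)
    X_c = P.T @ X / np.maximum(  # mean of member features per super-node
        np.bincount(clusters, minlength=c)[:, None], 1)
    return adj_c, X_c            # run the GNN on the smaller graph
```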
arXiv Detail & Related papers (2024-10-19T06:27:24Z)
- Diffusing to the Top: Boost Graph Neural Networks with Minimal Hyperparameter Tuning [33.948899558876604]
This work introduces a graph-conditioned latent diffusion framework (GNN-Diff) to generate high-performing GNNs.
We validate our method through 166 experiments across four graph tasks: node classification on small, large, and long-range graphs, as well as link prediction.
arXiv Detail & Related papers (2024-10-08T05:27:34Z)
- Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs).
This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings.
Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
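The abstract does not spell out the greedy criterion; as a stand-in, the sketch below picks ego-graph centers by farthest-point sampling in a low-frequency spectral embedding, which conveys the flavor of spectral selection without claiming to match SGGC.

```python
# Illustrative only: farthest-point selection in a spectral embedding,
# standing in for SGGC's actual greedy objective.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def spectral_centers(adj: sp.csr_matrix, k_eigs: int = 8, m: int = 100):
    deg = np.asarray(adj.sum(axis=1)).ravel()
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1.0))
    A_hat = sp.diags(d_inv_sqrt) @ adj @ sp.diags(d_inv_sqrt)
    _, vecs = eigsh(A_hat, k=k_eigs, which='LA')  # low-pass spectral basis
    chosen = [0]
    dist = np.linalg.norm(vecs - vecs[0], axis=1)
    for _ in range(m - 1):                 # farthest-point sampling
        nxt = int(dist.argmax())
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(vecs - vecs[nxt], axis=1))
    return chosen                          # centers of the selected ego-graphs
```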
arXiv Detail & Related papers (2024-05-27T17:52:12Z)
- Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
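A minimal sketch of the random-projection idea, assuming mean aggregation per edge type and a Gaussian projection matrix; RpHGNN's actual operators and its hybrid update are more involved.

```python
# One-time, graph-side precomputation compressed by random projection,
# in the spirit of RpHGNN but not its exact method.
import numpy as np
import scipy.sparse as sp

def rp_precompute(rel_adjs: list[sp.csr_matrix], X: np.ndarray,
                  out_dim: int = 128, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    parts = []
    for A in rel_adjs:                     # one pass per edge type
        deg = np.asarray(A.sum(axis=1)).ravel()
        msg = sp.diags(1.0 / np.maximum(deg, 1.0)) @ A @ X  # mean aggregation
        R = rng.normal(0, 1 / np.sqrt(out_dim), (X.shape[1], out_dim))
        parts.append(msg @ R)              # project to a fixed small width
    return np.concatenate(parts, axis=1)   # regular-shaped tensor for the MLP
```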
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
- GraphTheta: A Distributed Graph Neural Network Learning System With Flexible Training Strategy [5.466414428765544]
We present a new distributed graph learning system GraphTheta.
It supports multiple training strategies and enables efficient and scalable learning on big graphs.
This work represents the largest edge-attributed GNN learning task conducted on a billion-scale network in the literature.
arXiv Detail & Related papers (2021-04-21T14:51:33Z)
- Scalable Graph Neural Networks for Heterogeneous Graphs [12.44278942365518]
Graph neural networks (GNNs) are a popular class of parametric model for learning over graph-structured data.
Recent work has argued that GNNs primarily use the graph for feature smoothing, and has shown competitive results on benchmark tasks.
In this work, we ask whether these results can be extended to heterogeneous graphs, which encode multiple types of relationship between different entities.
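In the feature-smoothing view, propagation can be precomputed once and the learned model kept graph-free. The sketch below does repeated neighbor averaging and concatenates the hops; this is one plausible reading of the abstract, not the paper's exact method.

```python
# A plausible reading of the feature-smoothing view: precompute multi-hop
# averaged features, then train any graph-free classifier on them.
import numpy as np
import scipy.sparse as sp

def precompute_smoothed(adj: sp.csr_matrix, X: np.ndarray, hops: int = 3):
    deg = np.asarray(adj.sum(axis=1)).ravel()
    A_mean = sp.diags(1.0 / np.maximum(deg, 1.0)) @ adj  # row-normalized
    feats, H = [X], X
    for _ in range(hops):
        H = A_mean @ H                    # one more round of smoothing
        feats.append(H)
    return np.concatenate(feats, axis=1)  # fed to an MLP; the graph is done
```

For heterogeneous graphs, the same precomputation can be repeated on each relation-induced subgraph, which is the direction this paper explores.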
arXiv Detail & Related papers (2020-11-19T06:03:35Z)
- Combining Label Propagation and Simple Models Out-performs Graph Neural Networks [52.121819834353865]
We show that for many standard transductive node classification benchmarks, simple models combined with label propagation can exceed or match the performance of state-of-the-art GNNs.
We call this overall procedure Correct and Smooth (C&S).
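A minimal sketch of the two post-processing steps, assuming a symmetrically normalized adjacency A_hat, base-model soft predictions Z, and one-hot training labels Y; the alpha values, iteration counts, and the omitted residual autoscaling are illustrative simplifications.

```python
# A simplified C&S pass; the paper's residual scaling and tuned
# hyperparameters are omitted.
import numpy as np

def correct_and_smooth(A_hat, Z, Y, train_mask, a1=0.8, a2=0.8, iters=20):
    # Correct: propagate the residual error observed on training nodes.
    E0 = np.zeros_like(Z)
    E0[train_mask] = Y[train_mask] - Z[train_mask]
    E = E0.copy()
    for _ in range(iters):
        E = a1 * (A_hat @ E) + (1 - a1) * E0
    Z = Z + E
    # Smooth: label-propagate, repeatedly clamping the known labels.
    H0 = Z.copy()
    H0[train_mask] = Y[train_mask]
    H = H0.copy()
    for _ in range(iters):
        H = a2 * (A_hat @ H) + (1 - a2) * H0
        H[train_mask] = Y[train_mask]
    return H
```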
arXiv Detail & Related papers (2020-10-27T02:10:52Z)
- Robust Optimization as Data Augmentation for Large-scale Graphs [117.2376815614148]
We propose FLAG (Free Large-scale Adversarial Augmentation on Graphs), which iteratively augments node features with gradient-based adversarial perturbations during training.
FLAG is a general-purpose approach for graph data, which universally works in node classification, link prediction, and graph classification tasks.
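A minimal PyTorch sketch of the FLAG inner loop, close to the procedure the abstract describes; `model`, `X`, `y`, and the step sizes are placeholders, and the graph itself is assumed to be captured inside `model`.

```python
# Ascend the loss in feature space while accumulating gradients for one
# optimizer step; step counts and sizes are illustrative.
import torch
import torch.nn.functional as F

def flag_step(model, X, y, optimizer, steps=3, step_size=1e-3):
    optimizer.zero_grad()
    perturb = torch.empty_like(X).uniform_(-step_size, step_size)
    perturb.requires_grad_()
    loss = F.cross_entropy(model(X + perturb), y) / steps
    for _ in range(steps - 1):
        loss.backward()
        # Ascend: move the perturbation along the gradient sign.
        perturb.data += step_size * perturb.grad.sign()
        perturb.grad.zero_()
        loss = F.cross_entropy(model(X + perturb), y) / steps
    loss.backward()
    optimizer.step()
```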
arXiv Detail & Related papers (2020-10-19T21:51:47Z)
- Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs.
In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings.
We show that training PPRGo and predicting labels for all nodes of a large graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
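A minimal sketch of push-based approximate personalized PageRank, the standard primitive behind PPRGo-style propagation; the constants are illustrative, and PPRGo itself keeps only each node's top-k entries and mixes features with those weights.

```python
# Forward push from a source node s; returns a sparse PPR vector whose
# top entries define the node's important neighbors.
import numpy as np
import scipy.sparse as sp

def approx_ppr(adj: sp.csr_matrix, s: int, alpha=0.15, eps=1e-4):
    deg = np.asarray(adj.sum(axis=1)).ravel()
    p, r = {}, {s: 1.0}                    # estimate and residual, both sparse
    queue = [s]
    while queue:
        u = queue.pop()
        ru = r.get(u, 0.0)
        if deg[u] == 0 or ru < eps * deg[u]:
            continue                       # below threshold (or isolated)
        r[u] = 0.0
        p[u] = p.get(u, 0.0) + alpha * ru  # keep alpha of the mass at u
        spread = (1 - alpha) * ru / deg[u]
        for v in adj.indices[adj.indptr[u]:adj.indptr[u + 1]]:
            r[v] = r.get(v, 0.0) + spread  # push the rest to neighbors
            if r[v] >= eps * deg[v]:
                queue.append(v)
    return p
```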
arXiv Detail & Related papers (2020-07-03T09:30:07Z)
- Fast Graph Attention Networks Using Effective Resistance Based Graph Sparsification [70.50751397870972]
FastGAT is a method to make attention-based GNNs lightweight by using spectral sparsification to generate an optimal pruning of the input graph.
We experimentally evaluate FastGAT on several large real world graph datasets for node classification tasks.
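As a small-scale illustration of effective-resistance sparsification (the primitive FastGAT builds on), the sketch below computes exact resistances via the Laplacian pseudoinverse and samples edges proportionally; FastGAT instead relies on fast approximations, and the sampling here is simplified.

```python
# Exact effective resistances via pseudoinverse: viable only for small
# graphs, and a simplification of the sampling used in practice.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import laplacian

def resistance_sparsify(adj: sp.csr_matrix, keep: int, seed: int = 0):
    rng = np.random.default_rng(seed)
    rows, cols = sp.triu(adj, k=1).nonzero()  # undirected edge list
    Lp = np.linalg.pinv(laplacian(adj).toarray())
    R = Lp[rows, rows] + Lp[cols, cols] - 2 * Lp[rows, cols]
    idx = rng.choice(len(rows), size=keep, replace=False, p=R / R.sum())
    S = sp.csr_matrix((np.ones(len(idx)), (rows[idx], cols[idx])),
                      shape=adj.shape)
    return S + S.T                            # sparsified symmetric graph
```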
arXiv Detail & Related papers (2020-06-15T22:07:54Z)