Rethinking Efficiency and Redundancy in Training Large-scale Graphs
- URL: http://arxiv.org/abs/2209.00800v1
- Date: Fri, 2 Sep 2022 03:25:32 GMT
- Title: Rethinking Efficiency and Redundancy in Training Large-scale Graphs
- Authors: Xin Liu, Xunbin Xiong, Mingyu Yan, Runzhen Xue, Shirui Pan, Xiaochun
Ye, Dongrui Fan
- Abstract summary: We argue that redundancy exists in large-scale graphs and will degrade the training efficiency.
Despite recent advances in sampling-based training methods, sampling-based GNNs generally overlook the redundancy issue.
We propose DropReef to detect and drop the redundancy in large-scale graphs once and for all.
- Score: 26.982614602436655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale graphs are ubiquitous in real-world scenarios and can be trained
by Graph Neural Networks (GNNs) to generate representations for downstream
tasks. Given the abundant information and complex topology of a large-scale
graph, we argue that redundancy exists in such graphs and degrades training
efficiency. Unfortunately, limited model scalability severely restricts the
efficiency of training large-scale graphs with vanilla GNNs. Despite recent
advances in sampling-based training methods, sampling-based GNNs generally
overlook the redundancy issue, and training these models on large-scale graphs
still takes an intolerable amount of time. We therefore propose to drop
redundancy and improve the efficiency of training large-scale graphs with
GNNs by rethinking the inherent characteristics of a graph.
In this paper, we propose a once-for-all method, termed DropReef, to drop the
redundancy in large-scale graphs. Specifically, we first conduct preliminary
experiments to explore potential redundancy in large-scale graphs. Next, we
present a metric to quantify the neighbor heterophily of every node in a
graph. Based on both experimental and theoretical analysis, we identify the
redundancy in a large-scale graph as nodes with high neighbor heterophily and
a large number of neighbors. We then propose DropReef to detect and drop this
redundancy once and for all, reducing training time without sacrificing model
accuracy. To demonstrate the effectiveness of DropReef, we apply it to recent
state-of-the-art sampling-based GNNs for training large-scale graphs, owing to
the high accuracy of such models. With DropReef, the training efficiency of
these models can be greatly improved. DropReef is highly compatible and is
performed offline, benefiting both current and future state-of-the-art
sampling-based GNNs to a significant extent.
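As a rough illustration of the idea described above, the sketch below computes a simple
neighbor-heterophily score for every node (the fraction of its neighbors whose label
differs from its own, a common proxy; the paper's actual metric is not given in this
abstract and may differ) and then drops, once and offline, the nodes that combine high
heterophily with a large neighborhood. Function names and thresholds are hypothetical
and used for illustration only.

```python
import numpy as np
import scipy.sparse as sp

def neighbor_heterophily(adj: sp.csr_matrix, labels: np.ndarray) -> np.ndarray:
    """Fraction of each node's neighbors whose label differs from its own.
    Illustrative proxy only; the paper's metric may be defined differently."""
    scores = np.zeros(adj.shape[0])
    for v in range(adj.shape[0]):
        neighbors = adj.indices[adj.indptr[v]:adj.indptr[v + 1]]
        if neighbors.size > 0:
            scores[v] = np.mean(labels[neighbors] != labels[v])
    return scores

def drop_redundant_nodes(adj, labels, het_thresh=0.8, deg_thresh=100):
    """Offline, once-for-all filtering in the spirit of DropReef: remove nodes
    with both high neighbor heterophily and many neighbors, then hand the
    reduced graph to any sampling-based GNN. Thresholds are illustrative."""
    degrees = np.asarray(adj.sum(axis=1)).ravel()
    het = neighbor_heterophily(adj, labels)
    keep = np.where(~((het > het_thresh) & (degrees > deg_thresh)))[0]
    return adj[keep][:, keep], labels[keep], keep
```

Because the filtering runs once before training, any existing or future sampling-based
GNN can consume the reduced graph without modification, which is the compatibility
property the abstract emphasizes.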
Related papers
- Faster Inference Time for GNNs using coarsening [1.323700980948722]
Coarsening-based methods reduce the graph to a smaller one, resulting in faster computation.
No previous research has tackled the cost during inference.
This paper presents a novel approach to improving the scalability of GNNs through subgraph-based techniques.
arXiv Detail & Related papers (2024-10-19T06:27:24Z)
- Do We Really Need Graph Convolution During Training? Light Post-Training Graph-ODE for Efficient Recommendation [34.93725892725111]
The use of graph convolution networks (GCNs) in training recommender systems (RecSys) has raised persistent concerns.
This paper presents a critical examination of the necessity of graph convolutions during the training phase.
We introduce an innovative alternative: the Light Post-Training Graph Ordinary-Differential-Equation (LightGODE).
arXiv Detail & Related papers (2024-07-26T17:59:32Z)
- Enhancing Size Generalization in Graph Neural Networks through Disentangled Representation Learning [7.448831299106425]
DISGEN is a model-agnostic framework designed to disentangle size factors from graph representations.
Our empirical results show that DISGEN outperforms the state-of-the-art models by up to 6% on real-world datasets.
arXiv Detail & Related papers (2024-06-07T03:19:24Z)
- Graph Unlearning with Efficient Partial Retraining [28.433619085748447]
Graph Neural Networks (GNNs) have achieved remarkable success in various real-world applications.
GNNs may be trained on undesirable graph data, which can degrade their performance and reliability.
We propose GraphRevoker, a novel graph unlearning framework that better maintains the model utility of unlearnable GNNs.
arXiv Detail & Related papers (2024-03-12T06:22:10Z)
- Learning to Reweight for Graph Neural Network [63.978102332612906]
Graph Neural Networks (GNNs) show promising results for graph tasks.
The generalization ability of existing GNNs degrades when there are distribution shifts between testing and training graph data.
We propose a novel nonlinear graph decorrelation method, which can substantially improve the out-of-distribution generalization ability.
arXiv Detail & Related papers (2023-12-19T12:25:10Z)
- Fast and Effective GNN Training with Linearized Random Spanning Trees [20.73637495151938]
We present a new effective and scalable framework for training GNNs in node classification tasks.
Our approach progressively refines the GNN weights on an extensive sequence of random spanning trees.
The sparse nature of these path graphs substantially lightens the computational burden of GNN training (a minimal tree-sampling sketch appears after this list).
arXiv Detail & Related papers (2023-06-07T23:12:42Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensembling training manner, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Scaling R-GCN Training with Graph Summarization [71.06855946732296]
Training of Relational Graph Convolutional Networks (R-GCN) does not scale well with the size of the graph.
In this work, we experiment with the use of graph summarization techniques to compress the graph.
We obtain reasonable results on the AIFB, MUTAG and AM datasets.
arXiv Detail & Related papers (2022-03-05T00:28:43Z)
- Neural Graph Matching for Pre-training Graph Neural Networks [72.32801428070749]
Graph neural networks (GNNs) have shown powerful capacity at modeling structural data.
We present a novel Graph Matching based GNN Pre-Training framework, called GMPT.
The proposed method can be applied to fully self-supervised pre-training and coarse-grained supervised pre-training.
arXiv Detail & Related papers (2022-03-03T09:53:53Z)
- OOD-GNN: Out-of-Distribution Generalized Graph Neural Network [73.67049248445277]
Graph neural networks (GNNs) have achieved impressive performance when testing and training graph data come from identical distribution.
Existing GNNs lack out-of-distribution generalization abilities, so their performance substantially degrades when there are distribution shifts between testing and training graph data.
We propose an out-of-distribution generalized graph neural network (OOD-GNN) for achieving satisfactory performance on unseen testing graphs whose distributions differ from those of the training graphs.
arXiv Detail & Related papers (2021-12-07T16:29:10Z)
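The "Fast and Effective GNN Training with Linearized Random Spanning Trees" entry above
refines GNN weights on a sequence of random spanning trees. As a minimal sketch of how
such a tree can be drawn, assuming the graph is given as an undirected edge list, the
snippet below runs Kruskal's algorithm over a randomly shuffled edge list (randomized
Kruskal); the paper's actual sampler and its linearization of trees into path graphs are
not reproduced here.

```python
import random

def random_spanning_tree(num_nodes, edges, seed=None):
    """Sample a spanning tree by running Kruskal's algorithm over a shuffled
    edge list. Assumes a connected, undirected graph given as (u, v) pairs.
    Illustrative only; the referenced paper may sample trees differently."""
    rng = random.Random(seed)
    parent = list(range(num_nodes))  # union-find forest

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    shuffled = list(edges)
    rng.shuffle(shuffled)
    tree = []
    for u, v in shuffled:
        ru, rv = find(u), find(v)
        if ru != rv:  # adding (u, v) does not close a cycle
            parent[ru] = rv
            tree.append((u, v))
            if len(tree) == num_nodes - 1:
                break
    return tree
```

Repeating the sampling yields the sequence of sparse subgraphs on which the GNN weights
are progressively refined.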