VISAGNN: Versatile Staleness-Aware Efficient Training on Large-Scale Graphs
- URL: http://arxiv.org/abs/2511.12434v1
- Date: Sun, 16 Nov 2025 03:25:45 GMT
- Title: VISAGNN: Versatile Staleness-Aware Efficient Training on Large-Scale Graphs
- Authors: Rui Xue,
- Abstract summary: Graph Neural Networks (GNNs) have shown exceptional success in graph representation learning and a wide range of real-world applications. However, scaling deeper GNNs poses challenges due to the neighbor explosion problem when training on large-scale graphs. We propose a novel VersatIle Staleness-Aware GNN, named VISAGNN, which dynamically and adaptively incorporates staleness criteria into the large-scale GNN training process.
- Score: 1.6210884160768364
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Neural Networks (GNNs) have shown exceptional success in graph representation learning and a wide range of real-world applications. However, scaling deeper GNNs poses challenges due to the neighbor explosion problem when training on large-scale graphs. To mitigate this, a promising class of GNN training algorithms utilizes historical embeddings to reduce computation and memory costs while preserving the expressiveness of the model. These methods leverage historical embeddings for out-of-batch nodes, effectively approximating full-batch training without losing any neighbor information (a limitation found in traditional sampling methods). However, the staleness of these historical embeddings often introduces significant bias, acting as a bottleneck that can adversely affect model performance. In this paper, we propose a novel VersatIle Staleness-Aware GNN, named VISAGNN, which dynamically and adaptively incorporates staleness criteria into the large-scale GNN training process. By embedding staleness into the message passing mechanism, loss function, and historical embeddings during training, our approach enables the model to adaptively mitigate the negative effects of stale embeddings, thereby reducing estimation errors and enhancing downstream accuracy. Comprehensive experiments demonstrate the effectiveness of our method in overcoming the staleness issue of existing historical embedding techniques, showcasing its superior performance and efficiency on large-scale benchmarks, along with significantly faster convergence.
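To make the mechanism concrete, below is a minimal sketch (in PyTorch, not the authors' released code) of historical-embedding training with a staleness-aware weighting in the aggregation step. The names (HistoryBuffer, StalenessAwareLayer) and the exponential-decay weighting are illustrative assumptions; the abstract only states that VISAGNN embeds staleness criteria into the message passing mechanism, loss function, and historical embeddings.

```python
# Hedged sketch of staleness-aware historical-embedding aggregation.
# All class names and the decay rule are assumptions, not VISAGNN's actual design.
import torch
import torch.nn as nn


class HistoryBuffer:
    """Stores each node's last computed embedding and the step it was written at."""

    def __init__(self, num_nodes: int, dim: int):
        self.emb = torch.zeros(num_nodes, dim)      # historical embeddings
        self.last_update = torch.zeros(num_nodes)   # training step of last refresh

    def push(self, node_ids: torch.Tensor, values: torch.Tensor, step: int):
        # Cache fresh in-batch embeddings (detached, so no gradient flows through them later).
        self.emb[node_ids] = values.detach()
        self.last_update[node_ids] = float(step)

    def pull(self, node_ids: torch.Tensor, step: int):
        # Return cached embeddings plus their staleness (steps since last refresh).
        staleness = float(step) - self.last_update[node_ids]
        return self.emb[node_ids], staleness


class StalenessAwareLayer(nn.Module):
    """Aggregation that down-weights stale historical embeddings (assumed decay rule)."""

    def __init__(self, dim: int, decay: float = 0.1):
        super().__init__()
        self.lin = nn.Linear(2 * dim, dim)
        self.decay = decay  # assumed exponential decay rate, not taken from the paper

    def forward(self, h_batch: torch.Tensor, h_hist: torch.Tensor, staleness: torch.Tensor):
        # h_batch: [B, d] fresh embeddings; h_hist: [B, d] historical neighbor embeddings;
        # staleness: [B] steps since each historical entry was refreshed.
        w = torch.exp(-self.decay * staleness).unsqueeze(-1)  # weight in (0, 1]
        agg = w * h_hist + (1.0 - w) * h_batch                # discount stale signal
        return torch.relu(self.lin(torch.cat([h_batch, agg], dim=-1)))


if __name__ == "__main__":
    buf = HistoryBuffer(num_nodes=1000, dim=64)
    layer = StalenessAwareLayer(dim=64)
    ids = torch.arange(32)
    buf.push(ids, torch.randn(32, 64), step=0)
    h_hist, stale = buf.pull(ids, step=5)
    out = layer(torch.randn(32, 64), h_hist, stale)  # [32, 64]
```

Under this hedged reading, a neighbor whose historical embedding was refreshed recently contributes almost as if it were computed fresh, while a long-stale entry is progressively discounted, which is one plausible way to adaptively mitigate the negative effects of stale embeddings.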
Related papers
- Towards Scalable and Deep Graph Neural Networks via Noise Masking [59.058558158296265]
Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks.
Scaling them to large graphs is challenging due to the high computational and storage costs.
We present random walk with noise masking (RMask), a plug-and-play module compatible with the existing model-simplification works.
arXiv Detail & Related papers (2024-12-19T07:48:14Z) - FIT-GNN: Faster Inference Time for GNNs that 'FIT' in Memory Using Coarsening [1.1345413192078595]
This paper presents a novel approach to improve the scalability of Graph Neural Networks (GNNs) by reducing computational burden during the inference phase using graph coarsening.
Our study extends the application of graph coarsening to graph-level tasks, including graph classification and graph regression.
Results show that the proposed method achieves orders of magnitude improvements in single-node inference time compared to traditional approaches.
arXiv Detail & Related papers (2024-10-19T06:27:24Z) - Haste Makes Waste: A Simple Approach for Scaling Graph Neural Networks [37.41604955004456]
Graph neural networks (GNNs) have demonstrated remarkable success in graph representation learning.
Various sampling approaches have been proposed to scale GNNs to applications with large-scale graphs.
arXiv Detail & Related papers (2024-10-07T18:29:02Z) - Label Deconvolution for Node Representation Learning on Large-scale Attributed Graphs against Learning Bias [72.33336385797944]
We propose an efficient label regularization technique, namely Label Deconvolution (LD), to alleviate the learning bias.
We show that LD significantly outperforms state-of-the-art methods on Open Graph Benchmark datasets.
arXiv Detail & Related papers (2023-09-26T13:09:43Z) - Staleness-Alleviated Distributed GNN Training via Online Dynamic-Embedding Prediction [13.575053193557697]
This paper proposes SAT (Staleness-Alleviated Training), a novel and scalable distributed GNN training framework.
The key idea of SAT is to model the GNN's embedding evolution as a temporal graph and build a model upon it to predict future embeddings.
Empirically, we demonstrate that SAT can effectively reduce embedding staleness and thus achieve better performance and convergence speed.
arXiv Detail & Related papers (2023-08-25T16:10:44Z) - Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z) - Distributed Graph Neural Network Training with Periodic Historical Embedding Synchronization [9.503080586294406]
Graph Neural Networks (GNNs) are prevalent in various applications such as social networks, recommender systems, and knowledge graphs.
Traditional sampling-based methods accelerate GNN training by dropping edges and nodes, which impairs graph integrity and model performance.
This paper proposes DIstributed Graph Embedding SynchronizaTion (DIGEST), a novel distributed GNN training framework.
arXiv Detail & Related papers (2022-05-31T18:44:53Z) - CAP: Co-Adversarial Perturbation on Weights and Features for Improving Generalization of Graph Neural Networks [59.692017490560275]
Adversarial training has been widely demonstrated to improve a model's robustness against adversarial attacks.
It remains unclear how adversarial training could improve the generalization abilities of GNNs on graph analytics problems.
We construct the co-adversarial perturbation (CAP) optimization problem in terms of weights and features, and design the alternating adversarial perturbation algorithm to flatten the weight and feature loss landscapes alternately.
arXiv Detail & Related papers (2021-10-28T02:28:13Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z) - Towards an Efficient and General Framework of Robust Training for Graph Neural Networks [96.93500886136532]
Graph Neural Networks (GNNs) have made significant advances on several fundamental inference tasks.
Despite GNNs' impressive performance, it has been observed that carefully crafted perturbations on graph structures lead them to make wrong predictions.
We propose a general framework which leverages greedy search algorithms and zeroth-order methods to obtain robust GNNs.
arXiv Detail & Related papers (2020-02-25T15:17:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.