Neighbor-Sampling Based Momentum Stochastic Methods for Training Graph Neural Networks
- URL: http://arxiv.org/abs/2508.00267v1
- Date: Fri, 01 Aug 2025 02:22:03 GMT
- Title: Neighbor-Sampling Based Momentum Stochastic Methods for Training Graph Neural Networks
- Authors: Molly Noel, Gabriel Mancino-Ball, Yangyang Xu
- Abstract summary: Graph convolutional networks (GCNs) are a powerful tool for graph representation learning. In this paper, we present several neighbor-sampling (NS) based Adam-type stochastic methods that incorporate practical elements from modern deep learning algorithms. Under standard assumptions for Adam-type methods, we show that our methods enjoy the optimal convergence rate.
- Score: 12.017727800531556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph convolutional networks (GCNs) are a powerful tool for graph representation learning. Due to the recursive neighborhood aggregations employed by GCNs, efficient training methods suffer from a lack of theoretical guarantees or are missing important practical elements from modern deep learning algorithms, such as adaptivity and momentum. In this paper, we present several neighbor-sampling (NS) based Adam-type stochastic methods for solving a nonconvex GCN training problem. We utilize the control variate technique proposed by [1] to reduce the stochastic error caused by neighbor sampling. Under standard assumptions for Adam-type methods, we show that our methods enjoy the optimal convergence rate. In addition, we conduct extensive numerical experiments on node classification tasks with several benchmark datasets. The results demonstrate superior performance of our methods over classic NS-based SGD that also uses the control-variate technique, especially for large-scale graph datasets. Our code is available at https://github.com/RPI-OPT/CV-ADAM-GNN .
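To make the abstract concrete, below is a minimal sketch of the two ingredients it combines: an Adam-type update driven by a control-variate gradient estimate in the spirit of [1]. The toy quadratic objective, the caching schedule, and all names are illustrative stand-ins, not the authors' implementation (see the linked repository for that).

```python
import numpy as np

def adam_step(theta, g, m, v, t, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update using the (possibly variance-reduced) gradient estimate g."""
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def sampled_grad(x, noise):
    """Toy stand-in for a gradient computed on sampled neighbors:
    the exact gradient of ||x||^2 plus sample-dependent noise."""
    return 2.0 * x + noise

rng = np.random.default_rng(0)
theta = rng.normal(size=4)
theta_cached = theta.copy()            # "historical" parameters, refreshed occasionally
m, v = np.zeros(4), np.zeros(4)

for t in range(1, 201):
    noise = rng.normal(scale=1.0, size=4)
    # Control variate: the same sample evaluated at the current and cached parameters,
    # plus the exact (full-neighborhood) gradient at the cached point.
    g = sampled_grad(theta, noise) - sampled_grad(theta_cached, noise) + 2.0 * theta_cached
    theta, m, v = adam_step(theta, g, m, v, t)
    if t % 10 == 0:
        theta_cached = theta.copy()

print(theta)   # close to the minimizer at 0
```

The variance of the estimate shrinks as the cached parameters track the current ones, which is what lets the sampled updates behave almost like full-neighborhood updates.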
Related papers
- Sparse Decomposition of Graph Neural Networks [20.768412002413843]
We propose an approach to reduce the number of nodes that are included during aggregation.
We achieve this through a sparse decomposition, learning to approximate node representations using a weighted sum of linearly transformed features.
We demonstrate via extensive experiments that our method outperforms other baselines designed for inference speedup.
arXiv Detail & Related papers (2024-10-25T17:52:16Z)
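The "weighted sum of linearly transformed features" idea from the entry above can be pictured in a few lines of numpy; the mixing matrix, dimensions, and sparsity pattern here are made up for illustration and are not the paper's learned decomposition.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 6, 8, 4                       # nodes, input features, hidden size
X = rng.normal(size=(n, d))             # raw node features
W = rng.normal(size=(d, k))             # linear transform (learned in the paper; random here)

# Sparse mixing coefficients: each node's representation is built from only a few
# pre-selected nodes, so no recursive neighborhood expansion is needed at inference.
A = np.zeros((n, n))
A[0, [0, 2]] = [0.7, 0.3]
A[1, [1, 4, 5]] = [0.5, 0.25, 0.25]

H = A @ (X @ W)                         # weighted sums of linearly transformed features
```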
- Ensemble Quadratic Assignment Network for Graph Matching [52.20001802006391]
Graph matching is a commonly used technique in computer vision and pattern recognition.
Recent data-driven approaches have remarkably improved graph matching accuracy.
We propose a graph neural network (GNN) based approach to combine the advantages of data-driven and traditional methods.
arXiv Detail & Related papers (2024-03-11T06:34:05Z)
- Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
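A generic random-projection step, sketched below, shows how per-relation neighbor aggregates can be compressed into a fixed-size tensor before training; this is the textbook Johnson-Lindenstrauss trick rather than RpHGNN's actual hybrid pipeline, and all shapes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, d_proj = 1000, 256, 64
num_relations = 3

# Stand-ins for one-time pre-computed neighbor aggregates, one per relation type.
aggregates = [rng.normal(size=(n, d)) for _ in range(num_relations)]

# Random projection keeps the pre-computed input at a fixed, regular shape
# no matter how many relation-specific aggregates are concatenated.
R = rng.normal(size=(num_relations * d, d_proj)) / np.sqrt(d_proj)
Z = np.concatenate(aggregates, axis=1) @ R     # (n, d_proj), computed once before training
```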
- Deep Graph Neural Networks via Posteriori-Sampling-based Node-Adaptive Residual Module [65.81781176362848]
Graph Neural Networks (GNNs) can learn from graph-structured data through neighborhood information aggregation.
As the number of layers increases, node representations become indistinguishable, which is known as over-smoothing.
We propose a Posterior-Sampling-based, Node-distinguishing Residual module (PSNR).
arXiv Detail & Related papers (2023-05-09T12:03:42Z)
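The over-smoothing problem and the residual remedy in the entry above can be illustrated with a plain GCN-style layer; the code below shows only the generic residual idea (mixing the layer input back in), not the PSNR module itself, and the mixing weight is arbitrary.

```python
import numpy as np

def normalize(A):
    """Symmetrically normalized adjacency with self-loops, as in a standard GCN."""
    A = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def layer_with_residual(H, A_hat, W, alpha=0.5):
    propagated = np.maximum(A_hat @ H @ W, 0.0)    # neighborhood aggregation + ReLU
    return alpha * propagated + (1 - alpha) * H     # residual path carries the input forward

rng = np.random.default_rng(0)
A = (rng.random((8, 8)) < 0.3).astype(float)
A = np.maximum(A, A.T)                              # toy undirected adjacency
A_hat = normalize(A)
H = rng.normal(size=(8, 16))
W = rng.normal(size=(16, 16)) * 0.1
for _ in range(16):                                 # stack several layers on the toy graph
    H = layer_with_residual(H, A_hat, W)
```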
- Variational Sparse Coding with Learned Thresholding [6.737133300781134]
We propose a new approach to variational sparse coding that allows us to learn sparse distributions by thresholding samples.
We first evaluate and analyze our method by training a linear generator, showing that it has superior performance, statistical efficiency, and gradient estimation.
arXiv Detail & Related papers (2022-05-07T14:49:50Z)
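Thresholding a reparameterized sample is the simplest way to picture "learning sparse distributions by thresholding samples"; the sketch below applies a fixed soft-threshold to a Gaussian sample, whereas in the paper's setting the distributional parameters and the threshold are learned.

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink toward zero and zero out anything with magnitude below lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

rng = np.random.default_rng(0)
mu, log_sigma = np.zeros(16), np.zeros(16)     # variational parameters (stand-ins)
lam = 0.8                                      # threshold (learned in the paper; fixed here)

eps = rng.normal(size=16)
z = mu + np.exp(log_sigma) * eps               # reparameterized Gaussian sample
z_sparse = soft_threshold(z, lam)              # exactly sparse code
print(np.count_nonzero(z_sparse), "of 16 entries survive the threshold")
```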
- Scalable Consistency Training for Graph Neural Networks via Self-Ensemble Self-Distillation [13.815063206114713]
We introduce a novel consistency training method to improve the accuracy of graph neural networks (GNNs).
For a target node, we generate different neighborhood expansions and distill the average of the resulting predictions back into the GNN.
Our method approximates the expected prediction over the possible neighborhood samples while requiring only a few samples in practice.
arXiv Detail & Related papers (2021-10-12T19:24:42Z)
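The consistency idea above (average the predictions from several neighborhood samples and pull each one toward that average) can be written down directly; the logits below are random stand-ins for a GNN's outputs under different sampled expansions, and the KL-based penalty is one common choice rather than necessarily the paper's exact loss.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
num_samples, num_classes = 4, 7

# Logits for one target node under 4 different sampled neighborhood expansions (stand-ins).
logits = rng.normal(size=(num_samples, num_classes))
probs = softmax(logits)

teacher = probs.mean(axis=0)                   # self-ensemble: average prediction over the samples
# Consistency penalty: KL(teacher || each sampled prediction), averaged over samples.
consistency_loss = np.mean(np.sum(teacher * (np.log(teacher) - np.log(probs)), axis=1))
print(consistency_loss)
```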
- Scaling Up Graph Neural Networks Via Graph Coarsening [18.176326897605225]
Scalability of graph neural networks (GNNs) is one of the major challenges in machine learning.
In this paper, we propose to use graph coarsening for scalable training of GNNs.
We show that, by simply applying off-the-shelf coarsening methods, we can reduce the number of nodes by up to a factor of ten without a noticeable drop in classification accuracy.
arXiv Detail & Related papers (2021-06-09T15:46:17Z)
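Training on a coarsened graph boils down to two projection steps, sketched here with a made-up cluster assignment; real coarsening methods choose the partition carefully, which is exactly the "off-the-shelf" part the entry above refers to.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 10, 5
X = rng.normal(size=(n, d))                     # original node features
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.maximum(A, A.T)                          # toy undirected adjacency

clusters = np.array([0, 0, 1, 1, 1, 2, 2, 3, 3, 3])   # made-up partition into 4 super-nodes
P = np.eye(4)[clusters]                                # 10 x 4 assignment matrix

X_coarse = (P.T @ X) / P.sum(axis=0)[:, None]   # average features within each cluster (4 x d)
A_coarse = P.T @ A @ P                          # summed edge weights between clusters (4 x 4)
# A GNN is then trained on (A_coarse, X_coarse), i.e. on far fewer nodes than the original graph.
```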
- Progressive Spatio-Temporal Graph Convolutional Network for Skeleton-Based Human Action Recognition [97.14064057840089]
We propose a method to automatically find a compact and problem-specific network for graph convolutional networks in a progressive manner.
Experimental results on two datasets for skeleton-based human action recognition indicate that the proposed method has competitive or even better classification performance.
arXiv Detail & Related papers (2020-11-11T09:57:49Z)
- Combining Label Propagation and Simple Models Out-performs Graph Neural Networks [52.121819834353865]
We show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs.
We call this overall procedure Correct and Smooth (C&S).
Our approach exceeds or nearly matches the performance of state-of-the-art GNNs on a wide variety of benchmarks.
arXiv Detail & Related papers (2020-10-27T02:10:52Z)
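The Correct and Smooth procedure from the entry above is a two-step post-processing on top of any base predictor; the sketch below follows that structure (propagate the training-set residual, then propagate labels with training nodes clamped), while the paper's scaling factors and hyperparameters are omitted. `A_hat` is assumed to be a normalized adjacency matrix and `train_mask` a boolean index.

```python
import numpy as np

def propagate(Z, A_hat, alpha=0.8, iters=20):
    """Label-propagation iteration: Z <- alpha * A_hat @ Z + (1 - alpha) * Z_init."""
    Z_init, out = Z.copy(), Z.copy()
    for _ in range(iters):
        out = alpha * (A_hat @ out) + (1 - alpha) * Z_init
    return out

def correct_and_smooth(base_probs, A_hat, y_onehot, train_mask):
    # "Correct": propagate the residual error observed on the training nodes.
    residual = np.zeros_like(base_probs)
    residual[train_mask] = y_onehot[train_mask] - base_probs[train_mask]
    corrected = base_probs + propagate(residual, A_hat)
    # "Smooth": clamp training nodes to their labels and propagate once more.
    corrected[train_mask] = y_onehot[train_mask]
    return propagate(corrected, A_hat)
```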
- Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolution Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z)
- Towards an Efficient and General Framework of Robust Training for Graph Neural Networks [96.93500886136532]
Graph Neural Networks (GNNs) have made significant advances on several fundamental inference tasks.
Despite GNNs' impressive performance, it has been observed that carefully crafted perturbations on graph structures lead them to make wrong predictions.
We propose a general framework which leverages greedy search algorithms and zeroth-order methods to obtain robust GNNs.
arXiv Detail & Related papers (2020-02-25T15:17:58Z)
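Of the two ingredients named in the last entry, the zeroth-order part is easy to sketch: gradients of a black-box loss are estimated from function values along random directions. The estimator below is the standard two-point scheme, shown on a toy quadratic rather than on a graph-perturbation objective.

```python
import numpy as np

def zeroth_order_grad(loss_fn, x, mu=1e-2, num_dirs=20, rng=None):
    """Two-point random-direction estimate of grad loss_fn(x), using only function values."""
    rng = rng or np.random.default_rng(0)
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.normal(size=x.shape)
        g += (loss_fn(x + mu * u) - loss_fn(x - mu * u)) / (2.0 * mu) * u
    return g / num_dirs

x = np.array([1.0, -2.0, 0.5])
print(zeroth_order_grad(lambda v: float(np.sum(v ** 2)), x))   # approximately 2 * x
```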
This list is automatically generated from the titles and abstracts of the papers on this site.