On the Importance of Sampling in Learning Graph Convolutional Networks
- URL: http://arxiv.org/abs/2103.02696v1
- Date: Wed, 3 Mar 2021 21:31:23 GMT
- Title: On the Importance of Sampling in Learning Graph Convolutional Networks
- Authors: Weilin Cong, Morteza Ramezani, Mehrdad Mahdavi
- Abstract summary: Graph Convolutional Networks (GCNs) have achieved impressive empirical advancement across a wide variety of graph-related applications.
Despite their success, training GCNs on large graphs suffers from computational and memory issues.
We describe and analyze a general doubly variance reduction schema that can accelerate any sampling method under the memory budget.
- Score: 13.713485304798368
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph Convolutional Networks (GCNs) have achieved impressive empirical
advancement across a wide variety of graph-related applications. Despite their
great success, training GCNs on large graphs suffers from computational and
memory issues. A potential path to circumvent these obstacles is sampling-based
methods, where at each layer a subset of nodes is sampled. Although recent
studies have empirically demonstrated the effectiveness of sampling-based
methods, these works lack theoretical convergence guarantees under realistic
settings and cannot fully leverage the information of evolving parameters
during optimization. In this paper, we describe and analyze a general
\textbf{\textit{doubly variance reduction}} schema that can accelerate any
sampling method under the memory budget. The motivating impetus for the
proposed schema is a careful analysis of the variance of sampling methods,
which shows that the induced variance can be decomposed into node
embedding approximation variance (\emph{zeroth-order variance}) during forward
propagation and layerwise-gradient variance (\emph{first-order variance})
during backward propagation. We theoretically analyze the convergence of the
proposed schema and show that it enjoys an $\mathcal{O}(1/T)$ convergence rate.
We complement our theoretical results by integrating the proposed schema in
different sampling methods and applying them to different large real-world
graphs. Code is publicly available
at~\url{https://github.com/CongWeilin/SGCN.git}.
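To make this decomposition concrete, below is a minimal, self-contained sketch (in plain NumPy, not the authors' implementation from the repository above) of the zeroth-order, forward-propagation part: one GCN aggregation step is estimated from a sampled neighbor subset, with stale "historical" embeddings used as a control variate so the estimator stays unbiased while its variance shrinks as the cache approaches the fresh embeddings. All names and sizes (`sampled_aggregation`, `x_hist`, `n`, `d`, `k`) are illustrative assumptions.

```python
import numpy as np

# Toy illustration of zeroth-order (forward) variance reduction for one GCN
# aggregation step h_v = sum_u A[v, u] * x_u. The historical part is computed
# exactly from a cache; only the residual (fresh minus stale embeddings) is
# estimated from k sampled neighbors, which keeps the estimator unbiased.
rng = np.random.default_rng(0)
n, d, k = 200, 16, 8                              # nodes, feature dim, neighbors sampled per node

A = (rng.random((n, n)) < 0.05).astype(float)
A /= np.maximum(A.sum(1, keepdims=True), 1.0)     # row-normalized adjacency
x = rng.normal(size=(n, d))                       # fresh node embeddings
x_hist = x + 0.1 * rng.normal(size=(n, d))        # stale embeddings from an earlier step

def sampled_aggregation(v):
    """Estimate (A @ x)[v] from at most k sampled neighbors, using x_hist as a control variate."""
    nbrs = np.flatnonzero(A[v])
    if nbrs.size == 0:
        return np.zeros(d)
    idx = rng.choice(nbrs, size=min(k, nbrs.size), replace=False)
    scale = nbrs.size / idx.size                  # inverse inclusion probability under uniform sampling
    exact_hist = A[v] @ x_hist                    # cached part, no sampling noise
    sampled_residual = scale * (A[v, idx, None] * (x[idx] - x_hist[idx])).sum(0)
    return exact_hist + sampled_residual

est = np.stack([sampled_aggregation(v) for v in range(n)])
exact = A @ x
print("mean absolute error:", np.abs(est - exact).mean())
```

The first-order (backward-propagation) variance is treated analogously for the layerwise gradients; handling both is what the doubly variance reduction schema refers to.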
Related papers
- Topology-Aware Dynamic Reweighting for Distribution Shifts on Graph [24.44321658238713]
Graph Neural Networks (GNNs) are widely used for node classification tasks but often fail to generalize when training and test nodes come from different distributions.
We introduce the Topology-Aware Dynamic Reweighting (TAR) framework, which dynamically adjusts sample weights through gradient flow in the Wasserstein space during training.
The framework's effectiveness is demonstrated on four graph OOD (out-of-distribution) datasets and three class-imbalanced node classification datasets.
arXiv Detail & Related papers (2024-06-03T07:32:05Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Graph Out-of-Distribution Generalization with Controllable Data Augmentation [51.17476258673232]
Graph Neural Networks (GNNs) have demonstrated extraordinary performance in classifying graph properties.
Due to the selection bias between training and testing data, distribution deviation is widespread.
We propose OOD calibration to measure the distribution deviation of virtual samples.
arXiv Detail & Related papers (2023-08-16T13:10:27Z)
- MixupExplainer: Generalizing Explanations for Graph Neural Networks with Data Augmentation [6.307753856507624]
Graph Neural Networks (GNNs) have received increasing attention due to their ability to learn from graph-structured data.
Post-hoc instance-level explanation methods have been proposed to understand GNN predictions.
We shed light on a distribution shift issue in existing methods that affects explanation quality.
arXiv Detail & Related papers (2023-07-15T15:46:38Z)
- From Spectral Graph Convolutions to Large Scale Graph Convolutional Networks [0.0]
Graph Convolutional Networks (GCNs) have been shown to be a powerful concept that has been successfully applied to a large variety of tasks.
We study the theory that paved the way to the definition of GCNs, including the relevant parts of classical graph theory.
arXiv Detail & Related papers (2022-07-12T16:57:08Z)
- Optimal Propagation for Graph Neural Networks [51.08426265813481]
We propose a bi-level optimization approach for learning the optimal graph structure.
We also explore a low-rank approximation model for further reducing the time complexity.
arXiv Detail & Related papers (2022-05-06T03:37:00Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space, in contrast to existing techniques that embed each node as a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- Scaling Up Graph Neural Networks Via Graph Coarsening [18.176326897605225]
Scalability of graph neural networks (GNNs) is one of the major challenges in machine learning.
In this paper, we propose to use graph coarsening for scalable training of GNNs.
We show that, by simply applying off-the-shelf coarsening methods, we can reduce the number of nodes by up to a factor of ten without a noticeable drop in classification accuracy; a toy sketch of the coarsening step follows this entry.
arXiv Detail & Related papers (2021-06-09T15:46:17Z)
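As a rough illustration of the coarsening step described in the entry above (a hypothetical sketch, not the paper's pipeline; the random cluster assignment stands in for an actual off-the-shelf coarsening method):

```python
import numpy as np

# Collapse a graph onto c clusters with a 0/1 assignment matrix P, producing a
# coarsened adjacency and mean-pooled features on which a GNN can then be trained.
rng = np.random.default_rng(0)
n, d, c = 1000, 16, 100                          # original nodes, feature dim, clusters (~10x fewer nodes)

A = (rng.random((n, n)) < 0.01).astype(float)
A = np.maximum(A, A.T)                           # symmetric adjacency
X = rng.normal(size=(n, d))                      # node features

labels = rng.integers(0, c, size=n)              # stand-in for an off-the-shelf coarsening/partitioning
P = np.zeros((n, c))
P[np.arange(n), labels] = 1.0

cluster_size = np.maximum(P.sum(0), 1.0)
A_coarse = P.T @ A @ P                           # summed edge weights between clusters
X_coarse = (P.T @ X) / cluster_size[:, None]     # mean feature per cluster
print(A_coarse.shape, X_coarse.shape)            # (100, 100) (100, 16)
```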
- Minimal Variance Sampling with Provable Guarantees for Fast Training of Graph Neural Networks [22.618779809748435]
Existing sampling methods are mostly based on graph structural information and ignore the dynamics of optimization.
We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance.
We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and better generalization; a simplified sketch of gradient-aware sampling follows this entry.
arXiv Detail & Related papers (2020-06-24T16:49:29Z)
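The simplified sketch below illustrates the adaptive, gradient-aware sampling idea named in the entry above (not the paper's decoupled algorithm): nodes with larger approximate gradient norms are sampled more often, and inverse-probability weights keep the mini-batch loss an unbiased estimate of the full loss. All quantities here are synthetic stand-ins.

```python
import numpy as np

# Importance-sample a mini-batch of nodes proportionally to approximate per-node
# gradient norms, then reweight so the estimate of the mean loss stays unbiased.
rng = np.random.default_rng(0)
n, batch = 10_000, 256
grad_norm = np.abs(rng.normal(size=n)) + 1e-3    # stand-in for approximate gradient norms
per_node_loss = rng.normal(size=n) ** 2          # stand-in for per-node losses

p = grad_norm / grad_norm.sum()                  # adaptive sampling distribution
idx = rng.choice(n, size=batch, replace=True, p=p)
weights = 1.0 / (n * p[idx])                     # importance weights

estimate = (weights * per_node_loss[idx]).mean() # unbiased estimate of the mean per-node loss
print(f"estimate={estimate:.4f}  true mean={per_node_loss.mean():.4f}")
```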
- Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed to accelerate the training of Graph Convolutional Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z)
- Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximation framework for such non-trivial ERGs that results in dyadic-independence (i.e., edge-independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
arXiv Detail & Related papers (2020-02-14T11:42:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences of its use.