Communication-Efficient Sampling for Distributed Training of Graph
Convolutional Networks
- URL: http://arxiv.org/abs/2101.07706v1
- Date: Tue, 19 Jan 2021 16:12:44 GMT
- Title: Communication-Efficient Sampling for Distributed Training of Graph
Convolutional Networks
- Authors: Peng Jiang, Masuma Akter Rumi
- Abstract summary: Training Graph Convolutional Networks (GCNs) is expensive as it needs to aggregate data from neighboring nodes.
Previous works have proposed various neighbor sampling methods that estimate the aggregation result based on a small number of sampled neighbors.
We present an algorithm that determines the local sampling probabilities and ensures that our skewed neighbor sampling does not significantly affect the convergence of training.
- Score: 3.075766050800645
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training Graph Convolutional Networks (GCNs) is expensive as it needs to
aggregate data recursively from neighboring nodes. To reduce the computation
overhead, previous works have proposed various neighbor sampling methods that
estimate the aggregation result based on a small number of sampled neighbors.
Although these methods have successfully accelerated the training, they mainly
focus on the single-machine setting. As real-world graphs are large, training
GCNs in distributed systems is desirable. However, we found that the existing
neighbor sampling methods do not work well in a distributed setting.
Specifically, a naive implementation may incur a huge amount of communication
of feature vectors among different machines. To address this problem, we
propose a communication-efficient neighbor sampling method in this work. Our
main idea is to assign higher sampling probabilities to the local nodes so that
remote nodes are accessed less frequently. We present an algorithm that
determines the local sampling probabilities and ensures that our skewed neighbor
sampling does not significantly affect the convergence of training. Our experiments
with node classification benchmarks show that our method significantly reduces
the communication overhead for distributed GCN training with little accuracy
loss.
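To illustrate the core idea, here is a minimal Python/NumPy sketch (not taken from the paper) of skewed neighbor sampling with importance weighting: neighbors stored on the local machine are drawn with a boosted probability, and per-sample weights correct for the skew so the aggregation estimate stays unbiased. The `local_boost` factor and all function names are illustrative assumptions; the paper's actual contribution is an algorithm that chooses the local sampling probabilities so that convergence is preserved, which is not reproduced here.

```python
import numpy as np

def skewed_neighbor_sample(neighbor_ids, is_local, num_samples, local_boost=4.0, rng=None):
    """Draw `num_samples` neighbors with probabilities skewed toward local nodes.

    `local_boost` is a hypothetical knob; the paper derives the local
    probabilities from a convergence-aware algorithm not shown here.
    Returns sampled neighbor ids and importance weights that keep the
    aggregation estimate unbiased.
    """
    rng = np.random.default_rng() if rng is None else rng
    neighbor_ids = np.asarray(neighbor_ids)
    # Uniform base weight, boosted for neighbors stored on this machine.
    raw = np.where(np.asarray(is_local, dtype=bool), local_boost, 1.0)
    probs = raw / raw.sum()
    picks = rng.choice(len(neighbor_ids), size=num_samples, replace=True, p=probs)
    # Dividing by (n * p_i) corrects for the skew, so the weighted mean
    # below is an unbiased estimate of the uniform neighbor average.
    importance = 1.0 / (len(neighbor_ids) * probs[picks])
    return neighbor_ids[picks], importance

def estimate_aggregation(features, sampled_ids, importance):
    """Estimate the mean of neighbor feature vectors from the skewed sample."""
    return (importance[:, None] * features[sampled_ids]).mean(axis=0)

# Toy usage: 6 neighbors, the first 3 stored locally. Remote ids 3-5 are
# sampled less often, so fewer feature vectors cross the network.
features = np.random.randn(6, 8)
ids, w = skewed_neighbor_sample(np.arange(6), [1, 1, 1, 0, 0, 0], num_samples=3)
aggregated = estimate_aggregation(features, ids, w)
```

In a distributed run, each machine would apply such a sampler to the neighbor lists of its own training nodes, fetching remote feature vectors only for the rarer remote samples; how far the probabilities can be skewed without hurting convergence is exactly what the paper's algorithm decides.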
Related papers
- Learning conditional distributions on continuous spaces [0.0]
We investigate sample-based learning of conditional distributions on multi-dimensional unit boxes.
We employ two distinct clustering schemes: one based on a fixed-radius ball and the other on nearest neighbors.
We propose to incorporate the nearest neighbors method into neural network training, as our empirical analysis indicates it has better performance in practice.
arXiv Detail & Related papers (2024-06-13T17:53:47Z)
- Faster Convergence with Less Communication: Broadcast-Based Subgraph Sampling for Decentralized Learning over Wireless Networks [32.914407967052114]
BASS is a broadcast-based subgraph sampling method designed to accelerate the convergence of D-SGD.
We show that BASS enables faster convergence with fewer transmission slots compared to existing link-based scheduling methods.
arXiv Detail & Related papers (2024-01-24T20:00:23Z)
- Communication-Efficient Decentralized Federated Learning via One-Bit Compressive Sensing [52.402550431781805]
Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications.
Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging.
We develop a novel algorithm based on the framework of the inexact alternating direction method (iADM).
arXiv Detail & Related papers (2023-08-31T12:22:40Z)
- A Maximum Log-Likelihood Method for Imbalanced Few-Shot Learning Tasks [3.2895195535353308]
We propose a new maximum log-likelihood metric for few-shot architectures.
We demonstrate that the proposed metric achieves superior performance accuracy w.r.t. conventional similarity metrics.
We also show that our algorithm achieves state-of-the-art transductive few-shot performance when the evaluation data is imbalanced.
arXiv Detail & Related papers (2022-11-26T21:31:00Z)
- ScoreMix: A Scalable Augmentation Strategy for Training GANs with Limited Data [93.06336507035486]
Generative Adversarial Networks (GANs) typically suffer from overfitting when limited training data is available.
We present ScoreMix, a novel and scalable data augmentation approach for various image synthesis tasks.
arXiv Detail & Related papers (2022-10-27T02:55:15Z)
- STEM: A Stochastic Two-Sided Momentum Algorithm Achieving Near-Optimal Sample and Communication Complexities for Federated Learning [58.6792963686231]
Federated Learning (FL) refers to the paradigm where multiple worker nodes (WNs) build a joint model by using local data.
It is not clear how to choose the WNs' minimum update directions, the first minibatch sizes, and the local update frequency.
We show that there is a trade-off curve between local update frequencies and local minibatch sizes, on which the above complexities can be maintained.
arXiv Detail & Related papers (2021-06-19T06:13:45Z)
- Improving Generative Adversarial Networks with Local Coordinate Coding [150.24880482480455]
Generative adversarial networks (GANs) have shown remarkable success in generating realistic data from some predefined prior distribution.
In practice, semantic information might be represented by some latent distribution learned from data.
We propose an LCCGAN model with local coordinate coding (LCC) to improve the performance of generating data.
arXiv Detail & Related papers (2020-07-28T09:17:50Z)
- Bandit Samplers for Training Graph Neural Networks [63.17765191700203]
Several sampling algorithms with variance reduction have been proposed for accelerating the training of Graph Convolutional Networks (GCNs).
These sampling algorithms are not applicable to more general graph neural networks (GNNs) where the message aggregator contains learned weights rather than fixed weights, such as Graph Attention Networks (GAT).
arXiv Detail & Related papers (2020-06-10T12:48:37Z)
- One Size Fits All: Can We Train One Denoiser for All Noise Levels? [13.46272057205994]
It is often preferred to train one neural network estimator and apply it to all noise levels.
The de facto protocol is to train the estimator with noisy samples whose noise levels are uniformly distributed.
This paper addresses the sampling problem from a minimax risk optimization perspective.
arXiv Detail & Related papers (2020-05-19T17:56:04Z)
- Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks [100.14670789581811]
We train a graph convolutional network to fit the performance of sampled sub-networks.
With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates.
arXiv Detail & Related papers (2020-04-17T19:12:39Z)