On the Communication Latency of Wireless Decentralized Learning
- URL: http://arxiv.org/abs/2002.04069v1
- Date: Mon, 10 Feb 2020 20:10:07 GMT
- Title: On the Communication Latency of Wireless Decentralized Learning
- Authors: Navid Naderializadeh
- Abstract summary: We consider a wireless network comprising $n$ nodes located within a circular area of radius $R$.
To enable gradient exchanges across the network, we assume each node communicates only with a set of neighboring nodes.
We show that the communication delay for a single round of exchanging gradients on all the links throughout the network scales as $\mathcal{O}\left(\frac{n^{2-3\beta}}{\beta\log n}\right)$.
- Score: 12.977865337365856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a wireless network comprising $n$ nodes located within a circular
area of radius $R$, which are participating in a decentralized learning
algorithm to optimize a global objective function using their local datasets.
To enable gradient exchanges across the network, we assume each node
communicates only with a set of neighboring nodes, which are within a distance
$R n^{-\beta}$ of itself, where $\beta\in(0,\frac{1}{2})$. We use tools from
network information theory and random geometric graph theory to show that the
communication delay for a single round of exchanging gradients on all the links
throughout the network scales as
$\mathcal{O}\left(\frac{n^{2-3\beta}}{\beta\log n}\right)$, increasing (at
different rates) with both the number of nodes and the gradient exchange
threshold distance.
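
To make the model concrete, here is a minimal numpy sketch of the setup described in the abstract: $n$ nodes sampled uniformly in a disk of radius $R$, a random geometric graph connecting nodes within distance $R n^{-\beta}$, and the order-of-growth term from the delay bound (with all hidden constants dropped). The function names and parameter values are illustrative, not from the paper.

```python
import numpy as np

def sample_nodes(n, R, rng):
    """Sample n points uniformly at random in a disk of radius R."""
    r = R * np.sqrt(rng.random(n))        # sqrt gives uniform density over area
    theta = 2 * np.pi * rng.random(n)
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)

def geometric_graph(points, threshold):
    """Random geometric graph: an edge iff two nodes are within `threshold`."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return (d <= threshold) & ~np.eye(len(points), dtype=bool)

def delay_scaling(n, beta):
    """Growth term n^(2-3*beta) / (beta * log n) from the stated bound."""
    return n ** (2 - 3 * beta) / (beta * np.log(n))

rng = np.random.default_rng(0)
n, R, beta = 500, 1.0, 0.3                 # beta must lie in (0, 1/2)
adj = geometric_graph(sample_nodes(n, R, rng), R * n ** (-beta))
print("average degree:", adj.sum() / n)
print("delay scaling term:", delay_scaling(n, beta))
```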
Related papers
- GNN-LoFI: a Novel Graph Neural Network through Localized Feature-based Histogram Intersection [51.608147732998994]
Graph neural networks are increasingly becoming the framework of choice for graph-based machine learning.
We propose a new graph neural network architecture that substitutes classical message passing with an analysis of the local distribution of node features.
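
A rough sketch of that idea follows, under the assumption that "local distribution of node features" means per-dimension histograms over a node's neighborhood compared via histogram intersection; the paper matches neighborhood histograms against learned masks, which this toy version replaces with the node's own histogram. All names here are illustrative.

```python
import numpy as np

def node_histogram(x, edges):
    """Normalized histogram of scalar features x over fixed, shared bin edges."""
    h, _ = np.histogram(x, bins=edges)
    return h / max(len(x), 1)

def lofi_layer(features, adj, edges):
    """Replace message passing with a local-distribution comparison: each node
    summarizes its neighborhood by per-dimension feature histograms and scores
    them against its own histogram via histogram intersection."""
    n, d = features.shape
    out = np.zeros((n, d))
    for i in range(n):
        nbrs = np.flatnonzero(adj[i])
        if nbrs.size == 0:
            continue
        for k in range(d):
            h_nbr = node_histogram(features[nbrs, k], edges)
            h_self = node_histogram(features[[i], k], edges)
            out[i, k] = np.minimum(h_nbr, h_self).sum()  # histogram intersection
    return out
```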
arXiv Detail & Related papers (2024-01-17T13:04:23Z)
- Online Centralized Non-parametric Change-point Detection via Graph-based Likelihood-ratio Estimation [77.81487285123147]
Consider each node of a graph to be generating a data stream that is synchronized and observed in near real-time.
At a change-point $\tau$, a change occurs at a subset of nodes $C$, which affects the probability distribution of their associated node streams.
We propose a novel kernel-based method to both detect $\tau$ and localize $C$, based on direct estimation of the likelihood-ratio between the post-change and pre-change distributions of the node streams.
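
As a hedged illustration of the detect-and-localize pipeline (not the paper's estimator), the sketch below uses a kernel two-sample scan statistic in place of direct likelihood-ratio estimation: each node compares a window before and after every candidate time, $\hat\tau$ maximizes the aggregated score, and $\hat C$ collects the nodes driving that maximum. Function names, the window size, and the threshold rule are assumptions.

```python
import numpy as np

def rbf_mmd2(x, y, gamma=1.0):
    """Biased squared MMD between two 1-D samples under an RBF kernel."""
    def k(a, b):
        return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def scan_change_point(streams, w=30):
    """streams: (num_nodes, T) array. Score every node at every candidate time
    by comparing the w samples before with the w samples after; declare the
    change at the time maximizing the summed score and localize the affected
    set as the nodes whose scores stand out at that time."""
    n, T = streams.shape
    times = list(range(w, T - w))
    scores = np.array([[rbf_mmd2(s[t - w:t], s[t:t + w]) for t in times]
                       for s in streams])
    best = int(np.argmax(scores.sum(axis=0)))
    node_scores = scores[:, best]
    c_hat = np.flatnonzero(node_scores > node_scores.mean() + 2 * node_scores.std())
    return times[best], c_hat
```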
arXiv Detail & Related papers (2023-01-08T10:15:24Z)
- Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O\left(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T}\right)$ with a communication cost of $O(k\log(d))$ at each iteration.
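
A minimal sketch of the sketching idea, assuming a standard count sketch with $O(\log d)$ hash rows and $k$ buckets per row (so each message costs $O(k\log d)$), feeding the decompressed gradient into a plain AMSGrad step. The class name and hyperparameters are illustrative, not the paper's.

```python
import numpy as np

class CountSketch:
    """Count sketch with r = O(log d) hash rows and k buckets per row; the
    r x k table is what gets communicated each iteration."""
    def __init__(self, d, k, rng):
        self.r = max(int(np.ceil(np.log2(d))), 1)
        self.k = k
        self.h = rng.integers(0, k, size=(self.r, d))       # bucket hashes
        self.s = rng.choice([-1.0, 1.0], size=(self.r, d))  # sign hashes

    def compress(self, g):
        table = np.zeros((self.r, self.k))
        for row in range(self.r):
            np.add.at(table[row], self.h[row], self.s[row] * g)
        return table

    def decompress(self, table):
        rows = np.arange(self.r)[:, None]
        est = self.s * table[rows, self.h]   # one estimate of g per hash row
        return np.median(est, axis=0)        # median across rows

def amsgrad_step(x, g_hat, state, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    """One AMSGrad step driven by the decompressed (approximate) gradient."""
    m, v, vhat = state
    m = b1 * m + (1 - b1) * g_hat
    v = b2 * v + (1 - b2) * g_hat ** 2
    vhat = np.maximum(vhat, v)               # AMSGrad keeps the running max
    return x - lr * m / (np.sqrt(vhat) + eps), (m, v, vhat)
```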
arXiv Detail & Related papers (2022-10-14T01:42:05Z)
- Push--Pull with Device Sampling [8.344476599818826]
We consider decentralized optimization problems in which a number of agents collaborate to minimize the average of their local functions by exchanging over an underlying communication graph.
We propose an algorithm that combines gradient tracking and variance reduction over the entire network.
Our theoretical analysis shows that the algorithm converges linearly when the local objective functions are strongly convex.
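
The gradient-tracking core can be sketched as follows, using a single symmetric mixing matrix as a simplification of Push--Pull (which uses separate row- and column-stochastic matrices) and omitting the device sampling and variance reduction. The toy problem and all names are illustrative.

```python
import numpy as np

def gradient_tracking(grads, W, x0, steps=200, lr=0.05):
    """Decentralized gradient tracking: x mixes over W and descends along y,
    which tracks the network-wide average gradient."""
    n = len(grads)
    x = np.array([x0.copy() for _ in range(n)])
    g = np.array([grads[i](x[i]) for i in range(n)])
    y = g.copy()                      # y_i initialized to the local gradient
    for _ in range(steps):
        x_new = W @ x - lr * y
        g_new = np.array([grads[i](x_new[i]) for i in range(n)])
        y = W @ y + g_new - g         # gradient-tracking correction
        x, g = x_new, g_new
    return x.mean(axis=0)

# toy problem: each agent holds f_i(x) = 0.5 * ||x - c_i||^2
rng = np.random.default_rng(1)
c = rng.normal(size=(4, 3))
grads = [lambda x, ci=ci: x - ci for ci in c]
W = np.full((4, 4), 0.25)             # complete-graph averaging weights
print(gradient_tracking(grads, W, np.zeros(3)))  # ~ c.mean(axis=0)
```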
arXiv Detail & Related papers (2022-06-08T18:18:18Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Decentralized Multi-Task Stochastic Optimization With Compressed Communications [22.31884634659446]
The paper develops algorithms and obtains performance bounds for two different models of local information availability at the nodes.
We show that the deviation from the global minimum value and the violations of the constraints are upper-bounded by $\mathcal{O}(T^{-\frac{1}{2}})$ and $\mathcal{O}(T^{-\frac{1}{4}})$, respectively.
arXiv Detail & Related papers (2021-12-23T05:54:42Z)
- Distributed Saddle-Point Problems Under Similarity [173.19083235638104]
We show that a given suboptimality $\epsilon>0$ is achieved over master/workers networks in $\Omega\big(\cdot\big)$ rounds of communication.
We then propose algorithms matching the lower bounds for both types of networks (up to logarithmic factors).
We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.
arXiv Detail & Related papers (2021-07-22T14:25:16Z)
- Communication-efficient Decentralized Local SGD over Undirected Networks [2.3572498744567123]
We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$.
We analyze the trade-off between the number of communication rounds and the computational effort of each agent.
Our results show that by using only $R=\Omega(n)$ communication rounds, one can achieve an error that scales as $O(1/nT)$.
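
A minimal sketch of decentralized Local SGD under these assumptions: each agent takes $H$ local gradient steps between gossip rounds, so $T$ total steps cost $R = T/H$ communication rounds. Function names and constants are illustrative.

```python
import numpy as np

def decentralized_local_sgd(grads, W, x0, T=400, H=10, lr=0.05):
    """Each agent runs H local gradient steps between communication rounds
    (stochastic in the real algorithm), then averages its iterate with
    neighbors through the mixing matrix W."""
    n = len(grads)
    x = np.array([x0.copy() for _ in range(n)])
    for t in range(1, T + 1):
        for i in range(n):
            x[i] -= lr * grads[i](x[i])   # local step on the agent's own data
        if t % H == 0:
            x = W @ x                     # one gossip round over the network
    return x.mean(axis=0)
```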
arXiv Detail & Related papers (2020-11-06T09:34:00Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of routing every input along the same path, DG-Net aggregates features dynamically in each node, which gives the network greater representational ability.
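
One way to read "aggregates features dynamically" is as instance-aware weighting of incoming edges. The sketch below is an assumption, not DG-Net's exact mechanism: it pools each predecessor's feature map, scores it with a tiny router, and mixes the predecessors with the resulting softmax weights, so the effective connectivity depends on the input.

```python
import numpy as np

def dynamic_aggregate(pred_feats, router_w):
    """pred_feats: (P, C, H, W) outputs of P predecessor blocks;
    router_w: (C,) weights of a tiny linear router.
    Returns an input-dependent weighted sum of the predecessors."""
    pooled = pred_feats.mean(axis=(2, 3))           # (P, C) global average pool
    logits = pooled @ router_w                      # (P,) one score per edge
    w = np.exp(logits - logits.max())
    w /= w.sum()                                    # softmax edge weights
    return np.tensordot(w, pred_feats, axes=(0, 0)) # (C, H, W) aggregate
```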
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- Multi-Level Local SGD for Heterogeneous Hierarchical Networks [11.699472346137739]
We propose Multi-Level Local SGD, a distributed gradient method for learning a non-convex objective in a heterogeneous, multi-level network.
We first provide a unified mathematical framework that describes the Multi-Level Local SGD algorithm.
We then present a theoretical analysis of the algorithm.
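
A two-level toy version, assuming groups of workers average internally on a fast period and globally on a slower one; the actual algorithm generalizes this to deeper hierarchies with heterogeneous speeds, and all names and periods here are illustrative.

```python
import numpy as np

def multilevel_local_sgd(grads, groups, x0, T=600, h_local=5, h_global=25, lr=0.05):
    """groups: list of index arrays partitioning the workers. Every h_local
    steps workers average within their group (e.g., via a group hub); every
    h_global steps the groups average globally."""
    n = len(grads)
    x = np.array([x0.copy() for _ in range(n)])
    for t in range(1, T + 1):
        for i in range(n):
            x[i] -= lr * grads[i](x[i])     # local gradient step
        if t % h_local == 0:
            for g in groups:                # intra-group (fast) averaging
                x[g] = x[g].mean(axis=0)
        if t % h_global == 0:
            x[:] = x.mean(axis=0)           # inter-group (slow) averaging
    return x[0]
```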
arXiv Detail & Related papers (2020-07-27T19:14:23Z)
- Fully Asynchronous Policy Evaluation in Distributed Reinforcement Learning over Networks [14.636457985379746]
This paper proposes a fully asynchronous scheme for the policy evaluation problem of distributed reinforcement learning (DisRL) over directed peer-to-peer networks.
Without waiting for any other node of the network, each node can locally update its value function at any time by using (possibly delayed) information from its neighbors.
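
A hedged sketch of that update pattern, simplified to an undirected network with a shared reward vector (the paper handles directed graphs and local rewards): each node keeps stale buffered copies of its neighbors' value estimates, and whenever it wakes it mixes those possibly delayed copies and applies a local Bellman update, never waiting on anyone. All names and the delay model are assumptions.

```python
import numpy as np

def async_policy_eval(P, r, adj, gamma=0.9, iters=5000, refresh_p=0.5, rng=None):
    """P: (S, S) transition matrix under the policy; r: (S,) rewards;
    adj: symmetric 0/1 adjacency over n nodes. refresh_p models the
    probability that a node's fresh copy reaches a neighbor this step."""
    rng = rng or np.random.default_rng(0)
    n = adj.shape[0]
    V = np.zeros((n, len(r)))
    buf = {(i, j): V[j].copy() for i in range(n) for j in np.flatnonzero(adj[i])}
    for _ in range(iters):
        i = rng.integers(n)                          # a node wakes up at random
        nbrs = np.flatnonzero(adj[i])
        mixed = np.mean([buf[(i, j)] for j in nbrs] + [V[i]], axis=0)
        V[i] = r + gamma * P @ mixed                 # local Bellman update
        for j in nbrs:                               # delayed dissemination
            if rng.random() < refresh_p:
                buf[(j, i)] = V[i].copy()
    return V.mean(axis=0)
```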
arXiv Detail & Related papers (2020-03-01T08:12:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.