On the Communication Latency of Wireless Decentralized Learning
- URL: http://arxiv.org/abs/2002.04069v1
- Date: Mon, 10 Feb 2020 20:10:07 GMT
- Title: On the Communication Latency of Wireless Decentralized Learning
- Authors: Navid Naderializadeh
- Abstract summary: We consider a wireless network comprising $n$ nodes located within a circular area of radius $R$.
To enable gradient exchanges across the network, we assume each node communicates only with a set of neighboring nodes.
We show that the communication delay for a single round of exchanging gradients on all the links throughout the network scales as $\mathcal{O}\left(\frac{n^{2-3\beta}}{\beta\log n}\right)$.
- Score: 12.977865337365856
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a wireless network comprising $n$ nodes located within a circular
area of radius $R$, which are participating in a decentralized learning
algorithm to optimize a global objective function using their local datasets.
To enable gradient exchanges across the network, we assume each node
communicates only with a set of neighboring nodes, which are within a distance
$R n^{-\beta}$ of itself, where $\beta\in(0,\frac{1}{2})$. We use tools from
network information theory and random geometric graph theory to show that the
communication delay for a single round of exchanging gradients on all the links
throughout the network scales as
$\mathcal{O}\left(\frac{n^{2-3\beta}}{\beta\log n}\right)$, increasing (at
different rates) with both the number of nodes and the gradient exchange
threshold distance.
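
To make the model concrete, here is a minimal numpy sketch of the setup described in the abstract: $n$ nodes sampled uniformly in a disk of radius $R$, a random geometric graph connecting nodes within distance $R n^{-\beta}$, and the order-of-growth term from the delay bound (with all hidden constants dropped). The function names and parameter values are illustrative, not from the paper.

```python
import numpy as np

def sample_nodes(n, R, rng):
    """Sample n points uniformly at random in a disk of radius R."""
    r = R * np.sqrt(rng.random(n))        # sqrt gives uniform density over area
    theta = 2 * np.pi * rng.random(n)
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)

def geometric_graph(points, threshold):
    """Random geometric graph: an edge iff two nodes are within `threshold`."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    return (d <= threshold) & ~np.eye(len(points), dtype=bool)

def delay_scaling(n, beta):
    """Growth term n^(2-3*beta) / (beta * log n) from the stated bound."""
    return n ** (2 - 3 * beta) / (beta * np.log(n))

rng = np.random.default_rng(0)
n, R, beta = 500, 1.0, 0.3                 # beta must lie in (0, 1/2)
adj = geometric_graph(sample_nodes(n, R, rng), R * n ** (-beta))
print("average degree:", adj.sum() / n)
print("delay scaling term:", delay_scaling(n, beta))
```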
Related papers
- GNN-LoFI: a Novel Graph Neural Network through Localized Feature-based Histogram Intersection [51.608147732998994]
Graph neural networks are increasingly becoming the framework of choice for graph-based machine learning.
We propose a new graph neural network architecture that substitutes classical message passing with an analysis of the local distribution of node features.
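
A rough sketch of that idea follows, under the assumption that "local distribution of node features" means per-dimension histograms over a node's neighborhood compared via histogram intersection; the paper matches neighborhood histograms against learned masks, which this toy version replaces with the node's own histogram. All names here are illustrative.

```python
import numpy as np

def node_histogram(x, edges):
    """Normalized histogram of scalar features x over fixed, shared bin edges."""
    h, _ = np.histogram(x, bins=edges)
    return h / max(len(x), 1)

def lofi_layer(features, adj, edges):
    """Replace message passing with a local-distribution comparison: each node
    summarizes its neighborhood by per-dimension feature histograms and scores
    them against its own histogram via histogram intersection."""
    n, d = features.shape
    out = np.zeros((n, d))
    for i in range(n):
        nbrs = np.flatnonzero(adj[i])
        if nbrs.size == 0:
            continue
        for k in range(d):
            h_nbr = node_histogram(features[nbrs, k], edges)
            h_self = node_histogram(features[[i], k], edges)
            out[i, k] = np.minimum(h_nbr, h_self).sum()  # histogram intersection
    return out
```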
arXiv Detail & Related papers (2024-01-17T13:04:23Z)
- Online Centralized Non-parametric Change-point Detection via Graph-based Likelihood-ratio Estimation [77.81487285123147]
Consider each node of a graph to be generating a data stream that is synchronized and observed in near real-time.
At a change-point $\tau$, a change occurs at a subset of nodes $C$, which affects the probability distribution of their associated node streams.
We propose a novel kernel-based method to both detect $\tau$ and localize $C$, based on direct estimation of the likelihood-ratio between the post-change and pre-change distributions of the node streams.
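
As a hedged illustration of the detect-and-localize pipeline (not the paper's estimator), the sketch below uses a kernel two-sample scan statistic in place of direct likelihood-ratio estimation: each node compares a window before and after every candidate time, $\hat\tau$ maximizes the aggregated score, and $\hat C$ collects the nodes driving that maximum. Function names, the window size, and the threshold rule are assumptions.

```python
import numpy as np

def rbf_mmd2(x, y, gamma=1.0):
    """Biased squared MMD between two 1-D samples under an RBF kernel."""
    def k(a, b):
        return np.exp(-gamma * (a[:, None] - b[None, :]) ** 2)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def scan_change_point(streams, w=30):
    """streams: (num_nodes, T) array. Score every node at every candidate time
    by comparing the w samples before with the w samples after; declare the
    change at the time maximizing the summed score and localize the affected
    set as the nodes whose scores stand out at that time."""
    n, T = streams.shape
    times = list(range(w, T - w))
    scores = np.array([[rbf_mmd2(s[t - w:t], s[t:t + w]) for t in times]
                       for s in streams])
    best = int(np.argmax(scores.sum(axis=0)))
    node_scores = scores[:, best]
    c_hat = np.flatnonzero(node_scores > node_scores.mean() + 2 * node_scores.std())
    return times[best], c_hat
```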
arXiv Detail & Related papers (2023-01-08T10:15:24Z)
- Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O\left(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T}\right)$ with a communication cost of $O(k\log(d))$ at each iteration.
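
A minimal sketch of the sketching idea, assuming a standard count sketch with $O(\log d)$ hash rows and $k$ buckets per row (so each message costs $O(k\log d)$), feeding the decompressed gradient into a plain AMSGrad step. The class name and hyperparameters are illustrative, not the paper's.

```python
import numpy as np

class CountSketch:
    """Count sketch with r = O(log d) hash rows and k buckets per row; the
    r x k table is what gets communicated each iteration."""
    def __init__(self, d, k, rng):
        self.r = max(int(np.ceil(np.log2(d))), 1)
        self.k = k
        self.h = rng.integers(0, k, size=(self.r, d))       # bucket hashes
        self.s = rng.choice([-1.0, 1.0], size=(self.r, d))  # sign hashes

    def compress(self, g):
        table = np.zeros((self.r, self.k))
        for row in range(self.r):
            np.add.at(table[row], self.h[row], self.s[row] * g)
        return table

    def decompress(self, table):
        rows = np.arange(self.r)[:, None]
        est = self.s * table[rows, self.h]   # one estimate of g per hash row
        return np.median(est, axis=0)        # median across rows

def amsgrad_step(x, g_hat, state, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    """One AMSGrad step driven by the decompressed (approximate) gradient."""
    m, v, vhat = state
    m = b1 * m + (1 - b1) * g_hat
    v = b2 * v + (1 - b2) * g_hat ** 2
    vhat = np.maximum(vhat, v)               # AMSGrad keeps the running max
    return x - lr * m / (np.sqrt(vhat) + eps), (m, v, vhat)
```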
arXiv Detail & Related papers (2022-10-14T01:42:05Z)
- Push--Pull with Device Sampling [8.344476599818826]
We consider decentralized optimization problems in which a number of agents collaborate to minimize the average of their local functions by exchanging over an underlying communication graph.
We propose an algorithm that combines gradient tracking and variance reduction over the entire network.
Our theoretical analysis shows that the algorithm converges linearly when the local objective functions are strongly convex.
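
The gradient-tracking core can be sketched as follows, using a single symmetric mixing matrix as a simplification of Push--Pull (which uses separate row- and column-stochastic matrices) and omitting the device sampling and variance reduction. The toy problem and all names are illustrative.

```python
import numpy as np

def gradient_tracking(grads, W, x0, steps=200, lr=0.05):
    """Decentralized gradient tracking: x mixes over W and descends along y,
    which tracks the network-wide average gradient."""
    n = len(grads)
    x = np.array([x0.copy() for _ in range(n)])
    g = np.array([grads[i](x[i]) for i in range(n)])
    y = g.copy()                      # y_i initialized to the local gradient
    for _ in range(steps):
        x_new = W @ x - lr * y
        g_new = np.array([grads[i](x_new[i]) for i in range(n)])
        y = W @ y + g_new - g         # gradient-tracking correction
        x, g = x_new, g_new
    return x.mean(axis=0)

# toy problem: each agent holds f_i(x) = 0.5 * ||x - c_i||^2
rng = np.random.default_rng(1)
c = rng.normal(size=(4, 3))
grads = [lambda x, ci=ci: x - ci for ci in c]
W = np.full((4, 4), 0.25)             # complete-graph averaging weights
print(gradient_tracking(grads, W, np.zeros(3)))  # ~ c.mean(axis=0)
```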
arXiv Detail & Related papers (2022-06-08T18:18:18Z)
- On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z)
- Decentralized Multi-Task Stochastic Optimization With Compressed Communications [22.31884634659446]
The paper develops algorithms and obtains performance bounds for two different models of local information availability at the nodes.
We show that the deviation from the global minimum value and the violations of the constraints are upper-bounded by $\mathcal{O}(T^{-\frac{1}{2}})$ and $\mathcal{O}(T^{-\frac{1}{4}})$, respectively.
arXiv Detail & Related papers (2021-12-23T05:54:42Z)
- Distributed Saddle-Point Problems Under Similarity [173.19083235638104]
We show that a given suboptimality $\epsilon>0$ is achieved over master/workers networks in $\Omega\big(\cdot\big)$ rounds of communication.
We then propose algorithms matching the lower bounds for both types of networks (up to logarithmic factors).
We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.
arXiv Detail & Related papers (2021-07-22T14:25:16Z)
- Communication-efficient Decentralized Local SGD over Undirected Networks [2.3572498744567123]
We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$.
We analyze the trade-off between the number of communication rounds and the computational effort of each agent.
Our results show that by using only $R=\Omega(n)$ communication rounds, one can achieve an error that scales as $O(1/nT)$.
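
A minimal sketch of decentralized Local SGD under these assumptions: each agent takes $H$ local gradient steps between gossip rounds, so $T$ total steps cost $R = T/H$ communication rounds. Function names and constants are illustrative.

```python
import numpy as np

def decentralized_local_sgd(grads, W, x0, T=400, H=10, lr=0.05):
    """Each agent runs H local gradient steps between communication rounds
    (stochastic in the real algorithm), then averages its iterate with
    neighbors through the mixing matrix W."""
    n = len(grads)
    x = np.array([x0.copy() for _ in range(n)])
    for t in range(1, T + 1):
        for i in range(n):
            x[i] -= lr * grads[i](x[i])   # local step on the agent's own data
        if t % H == 0:
            x = W @ x                     # one gossip round over the network
    return x.mean(axis=0)
```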
arXiv Detail & Related papers (2020-11-06T09:34:00Z)
- Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of routing every input along the same path, DG-Net aggregates features dynamically in each node, which gives the network greater representational ability.
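
One way to read "aggregates features dynamically" is as instance-aware weighting of incoming edges. The sketch below is an assumption, not DG-Net's exact mechanism: it pools each predecessor's feature map, scores it with a tiny router, and mixes the predecessors with the resulting softmax weights, so the effective connectivity depends on the input.

```python
import numpy as np

def dynamic_aggregate(pred_feats, router_w):
    """pred_feats: (P, C, H, W) outputs of P predecessor blocks;
    router_w: (C,) weights of a tiny linear router.
    Returns an input-dependent weighted sum of the predecessors."""
    pooled = pred_feats.mean(axis=(2, 3))           # (P, C) global average pool
    logits = pooled @ router_w                      # (P,) one score per edge
    w = np.exp(logits - logits.max())
    w /= w.sum()                                    # softmax edge weights
    return np.tensordot(w, pred_feats, axes=(0, 0)) # (C, H, W) aggregate
```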
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
- Multi-Level Local SGD for Heterogeneous Hierarchical Networks [11.699472346137739]
We propose Multi-Level Local SGD, a distributed gradient method for learning a non-convex objective in a heterogeneous, multi-level network.
We first provide a unified mathematical framework that describes the Multi-Level Local SGD algorithm.
We then present a theoretical analysis of the algorithm.
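
A two-level toy version, assuming groups of workers average internally on a fast period and globally on a slower one; the actual algorithm generalizes this to deeper hierarchies with heterogeneous speeds, and all names and periods here are illustrative.

```python
import numpy as np

def multilevel_local_sgd(grads, groups, x0, T=600, h_local=5, h_global=25, lr=0.05):
    """groups: list of index arrays partitioning the workers. Every h_local
    steps workers average within their group (e.g., via a group hub); every
    h_global steps the groups average globally."""
    n = len(grads)
    x = np.array([x0.copy() for _ in range(n)])
    for t in range(1, T + 1):
        for i in range(n):
            x[i] -= lr * grads[i](x[i])     # local gradient step
        if t % h_local == 0:
            for g in groups:                # intra-group (fast) averaging
                x[g] = x[g].mean(axis=0)
        if t % h_global == 0:
            x[:] = x.mean(axis=0)           # inter-group (slow) averaging
    return x[0]
```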
arXiv Detail & Related papers (2020-07-27T19:14:23Z)
- Fully Asynchronous Policy Evaluation in Distributed Reinforcement Learning over Networks [14.636457985379746]
This paper proposes a fully asynchronous scheme for the policy evaluation problem of distributed reinforcement learning (DisRL) over directed peer-to-peer networks.
Without waiting for any other node of the network, each node can locally update its value function at any time by using (possibly delayed) information from its neighbors.
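
A hedged sketch of that update pattern, simplified to an undirected network with a shared reward vector (the paper handles directed graphs and local rewards): each node keeps stale buffered copies of its neighbors' value estimates, and whenever it wakes it mixes those possibly delayed copies and applies a local Bellman update, never waiting on anyone. All names and the delay model are assumptions.

```python
import numpy as np

def async_policy_eval(P, r, adj, gamma=0.9, iters=5000, refresh_p=0.5, rng=None):
    """P: (S, S) transition matrix under the policy; r: (S,) rewards;
    adj: symmetric 0/1 adjacency over n nodes. refresh_p models the
    probability that a node's fresh copy reaches a neighbor this step."""
    rng = rng or np.random.default_rng(0)
    n = adj.shape[0]
    V = np.zeros((n, len(r)))
    buf = {(i, j): V[j].copy() for i in range(n) for j in np.flatnonzero(adj[i])}
    for _ in range(iters):
        i = rng.integers(n)                          # a node wakes up at random
        nbrs = np.flatnonzero(adj[i])
        mixed = np.mean([buf[(i, j)] for j in nbrs] + [V[i]], axis=0)
        V[i] = r + gamma * P @ mixed                 # local Bellman update
        for j in nbrs:                               # delayed dissemination
            if rng.random() < refresh_p:
                buf[(j, i)] = V[i].copy()
    return V.mean(axis=0)
```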
arXiv Detail & Related papers (2020-03-01T08:12:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.