Communication-efficient Decentralized Local SGD over Undirected Networks
- URL: http://arxiv.org/abs/2011.03255v1
- Date: Fri, 6 Nov 2020 09:34:00 GMT
- Title: Communication-efficient Decentralized Local SGD over Undirected Networks
- Authors: Tiancheng Qin, S. Rasoul Etesami, César A. Uribe
- Abstract summary: We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$.
We analyze the trade-off between the number of communication rounds and the computational effort of each agent.
Our results show that by using only $R=\Omega(n)$ communication rounds, one can achieve an error that scales as $O(1/nT)$.
- Score: 2.3572498744567123
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the distributed learning problem where a network of $n$ agents
seeks to minimize a global function $F$. Agents have access to $F$ through
noisy gradients, and they can locally communicate with their neighbors over a
network. We study the Decentralized Local SGD method, where agents perform a
number of local gradient steps and occasionally exchange information with their
neighbors. Previous algorithmic analysis efforts have focused on the specific
network topology (star topology) where a leader node aggregates all agents'
information. We generalize that setting to an arbitrary network by analyzing
the trade-off between the number of communication rounds and the computational
effort of each agent. We bound the expected optimality gap in terms of the
number of iterates $T$, the number of workers $n$, and the spectral gap of the
underlying network. Our main results show that by using only $R=\Omega(n)$
communication rounds, one can achieve an error that scales as $O({1}/{nT})$,
where the number of communication rounds is independent of $T$ and only depends
on the number of agents. Finally, we provide numerical evidence of our
theoretical results through experiments on real and synthetic data.
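To make the algorithmic setting concrete, below is a minimal Python sketch of decentralized local SGD with periodic gossip averaging over a doubly stochastic mixing matrix. The names (grad_fn, W, H, lr) and the toy quadratic objective are illustrative assumptions, not the paper's exact algorithm, step-size schedule, or experimental setup.

```python
import numpy as np

def decentralized_local_sgd(grad_fn, W, x0, T, H, lr):
    """Sketch of Decentralized Local SGD (assumed interface).

    Each of the n agents keeps a local iterate, takes stochastic gradient
    steps, and every H iterations averages with its neighbors using the
    gossip (mixing) matrix W, for T iterations in total.

    grad_fn(i, x): stochastic gradient of agent i's local loss at x.
    W: n x n doubly stochastic mixing matrix of the communication graph.
    """
    n = W.shape[0]
    X = np.tile(np.asarray(x0, dtype=float), (n, 1))  # one row per agent
    for t in range(1, T + 1):
        # local stochastic gradient step at every agent
        for i in range(n):
            X[i] -= lr * grad_fn(i, X[i])
        # communication round every H iterations: gossip averaging with neighbors
        if t % H == 0:
            X = W @ X
    return X.mean(axis=0)  # average model across agents

# Toy usage: n agents, each with a quadratic loss centered at a private target.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 8, 5
    B = rng.normal(size=(n, d))
    # ring-topology mixing matrix (symmetric, doubly stochastic)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] = 0.25
        W[i, (i + 1) % n] = 0.25
    grad = lambda i, x: (x - B[i]) + 0.01 * rng.normal(size=d)  # noisy gradient
    x_hat = decentralized_local_sgd(grad, W, np.zeros(d), T=2000, H=50, lr=0.05)
    print(np.linalg.norm(x_hat - B.mean(axis=0)))  # distance to the global minimizer
```

In this sketch the number of communication rounds is T/H, so keeping H large relative to T corresponds to the paper's regime where communication rounds are few compared to local gradient steps.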
Related papers
- Cooperative Multi-Agent Reinforcement Learning: Asynchronous
Communication and Linear Function Approximation [77.09836892653176]
We study multi-agent reinforcement learning in the setting of episodic Markov decision processes.
We propose a provably efficient algorithm based on value iteration that enables asynchronous communication.
We show that a minimal $\Omega(dM)$ communication complexity is required to improve the performance through collaboration.
arXiv Detail & Related papers (2023-05-10T20:29:29Z) - Convergence Rates for Localized Actor-Critic in Networked Markov
Potential Games [12.704529528199062]
We introduce a class of networked Markov potential games in which agents are associated with nodes in a network.
Each agent has its own local potential function, and the reward of each agent depends only on the states and actions of the agents within a neighborhood.
This is the first finite-sample bound for multi-agent competitive games that does not depend on the number of agents.
arXiv Detail & Related papers (2023-03-08T20:09:58Z) - Communication-Efficient Topologies for Decentralized Learning with
$O(1)$ Consensus Rate [35.698182247676414]
Decentralized optimization is an emerging paradigm in distributed learning in which agents achieve network-wide solutions by peer-to-peer communication without the central server.
We show that the total number of iterations to reach a network-wide solution is affected by the speed at which the agents' information is "mixed" by communication.
We propose a new family of topologies, EquiTopo, which has an (almost) constant degree and a network-size-independent consensus rate.
arXiv Detail & Related papers (2022-10-14T15:02:01Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - On the Performance of Gradient Tracking with Local Updates [10.14746251086461]
We show that LU-GT has the same communication complexity but allows arbitrary network topologies.
Numerical examples reveal that local updates can lower communication costs in certain regimes.
arXiv Detail & Related papers (2022-10-10T15:13:23Z) - A Simple and Provably Efficient Algorithm for Asynchronous Federated
Contextual Linear Bandits [77.09836892653176]
We study federated contextual linear bandits, where $M$ agents cooperate with each other to solve a global contextual linear bandit problem with the help of a central server.
We consider the asynchronous setting, where all agents work independently and the communication between one agent and the server will not trigger other agents' communication.
We prove that the regret of FedLinUCB is bounded by $\tilde{O}(d\sqrt{\sum_{m=1}^{M} T_m})$ and the communication complexity is $\tilde{O}(dM^2)$.
arXiv Detail & Related papers (2022-07-07T06:16:19Z) - High-Dimensional Inference over Networks: Linear Convergence and
Statistical Guarantees [20.701475313495884]
We study sparse linear regression over a network of agents, modeled as an undirected graph with no server node.
We analyze the convergence rate and statistical guarantees of a distributed projected gradient tracking-based algorithm.
arXiv Detail & Related papers (2022-01-21T01:26:08Z) - Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\log(1/\varepsilon)\big)$ communication steps.
This rate matches, for the first time (up to poly-log factors), the lower communication complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z) - Distributed Saddle-Point Problems Under Similarity [173.19083235638104]
We show that a given suboptimality $\epsilon > 0$ is achieved over master/workers networks in $\Omega\big(\frac{\delta}{\mu}\log(1/\epsilon)\big)$ rounds of communication.
We then propose algorithms matching the lower bounds for either type of network (up to log factors).
We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.
arXiv Detail & Related papers (2021-07-22T14:25:16Z) - Learning Connectivity for Data Distribution in Robot Teams [96.39864514115136]
We propose a task-agnostic, decentralized, low-latency method for data distribution in ad-hoc networks using Graph Neural Networks (GNNs).
Our approach enables multi-agent algorithms based on global state information to function by ensuring it is available at each robot.
We train the distributed GNN communication policies via reinforcement learning using the average Age of Information as the reward function and show that it improves training stability compared to task-specific reward functions.
arXiv Detail & Related papers (2021-03-08T21:48:55Z) - Multi-Level Local SGD for Heterogeneous Hierarchical Networks [11.699472346137739]
We propose Multi-Level Local SGD, a distributed gradient method for learning a smooth, non-convex objective in a heterogeneous, multi-level network.
We first provide a unified mathematical framework that describes the Multi-Level Local SGD algorithm.
We then present a theoretical analysis of the algorithm.
arXiv Detail & Related papers (2020-07-27T19:14:23Z)