Communication-Efficient Topologies for Decentralized Learning with
$O(1)$ Consensus Rate
- URL: http://arxiv.org/abs/2210.07881v1
- Date: Fri, 14 Oct 2022 15:02:01 GMT
- Title: Communication-Efficient Topologies for Decentralized Learning with
$O(1)$ Consensus Rate
- Authors: Zhuoqing Song, Weijian Li, Kexin Jin, Lei Shi, Ming Yan, Wotao Yin,
Kun Yuan
- Abstract summary: Decentralized optimization is an emerging paradigm in distributed learning in which agents achieve network-wide solutions by peer-to-peer communication without a central server.
We show that the total number of iterations to reach a network-wide solution is affected by the speed at which the agents' information is ``mixed'' by communication.
We propose a new family of topologies, EquiTopo, which has an (almost) constant degree and a network-size-independent consensus rate.
- Score: 35.698182247676414
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decentralized optimization is an emerging paradigm in distributed learning in
which agents achieve network-wide solutions by peer-to-peer communication
without a central server. Since communication tends to be slower than
computation, when each agent communicates with only a few neighboring agents
per iteration, they can complete iterations faster than with more agents or a
central server. However, the total number of iterations to reach a network-wide
solution is affected by the speed at which the agents' information is ``mixed''
by communication. We found that popular communication topologies either have
large maximum degrees (such as stars and complete graphs) or are ineffective at
mixing information (such as rings and grids). To address this problem, we
propose a new family of topologies, EquiTopo, which has an (almost) constant
degree and a network-size-independent consensus rate that is used to measure
the mixing efficiency.
In the proposed family, EquiStatic has a degree of $\Theta(\ln(n))$, where
$n$ is the network size, and a series of time-dependent one-peer topologies,
EquiDyn, has a constant degree of 1. We generate EquiDyn through a certain
random sampling procedure. Both of them achieve an $n$-independent consensus
rate. We apply them to decentralized SGD and decentralized gradient tracking
and obtain faster communication and better convergence, theoretically and
empirically. Our code is implemented through BlueFog and available at
https://github.com/kexinjinnn/EquiTopo.
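As a rough illustration of what the consensus rate measures, the Python sketch below compares plain gossip averaging on a ring, whose mixing slows down as the network grows, with averaging over freshly sampled one-peer topologies generated here by random pairwise matching. The matching sampler is only an assumed stand-in for EquiDyn, whose actual sampling procedure is specified in the paper rather than in this abstract; the decaying disagreement $\|x - \bar{x}\|$ is the quantity the consensus rate controls.

```python
# Illustrative sketch only: contrasts how fast gossip averaging contracts
# disagreement on a ring versus on randomly sampled one-peer matchings.
# The matching sampler below is an assumption, not the paper's EquiDyn procedure.
import numpy as np

rng = np.random.default_rng(0)

def ring_matrix(n):
    """Mixing matrix of a ring: each node averages with its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1 / 3
    return W

def one_peer_matching_matrix(n):
    """One-peer topology for a single iteration: nodes are paired uniformly at
    random and each pair averages its two values (degree 1, doubly stochastic)."""
    perm = rng.permutation(n)
    W = np.eye(n)
    for a, b in zip(perm[0::2], perm[1::2]):
        W[[a, b], :] = 0.0
        W[a, a] = W[a, b] = 0.5
        W[b, b] = W[b, a] = 0.5
    return W

def disagreement_after(n, rounds, sampler):
    """Run x <- W x for `rounds` gossip steps and report ||x - mean(x)||."""
    x = rng.standard_normal(n)
    for _ in range(rounds):
        x = sampler(n) @ x
    return np.linalg.norm(x - x.mean())

n, rounds = 256, 30
print("ring           :", disagreement_after(n, rounds, ring_matrix))
print("one-peer random:", disagreement_after(n, rounds, one_peer_matching_matrix))
```

With $n = 256$ and 30 gossip rounds, the one-peer matchings shrink the disagreement by several orders of magnitude while the ring barely moves it, which is the practical gap between an $n$-independent and an $n$-dependent consensus rate.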
Related papers
- FedScalar: A Communication efficient Federated Learning [0.0]
Federated learning (FL) has gained considerable popularity for distributed machine learning.
FedScalar enables agents to communicate updates using a single scalar.
arXiv Detail & Related papers (2024-10-03T07:06:49Z) - Asynchronous Local Computations in Distributed Bayesian Learning [8.516532665507835]
We propose gossip-based communication to leverage fast computations and reduce communication overhead simultaneously.
We observe faster initial convergence and improved performance accuracy, especially in the low data range.
We achieve on average 78% and over 90% classification accuracy respectively on the Gamma Telescope and mHealth data sets from the UCI ML repository.
arXiv Detail & Related papers (2023-11-06T20:11:41Z) - Communication-Efficient Decentralized Federated Learning via One-Bit
Compressive Sensing [52.402550431781805]
Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications.
Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging.
We develop a novel algorithm based on the framework of the inexact alternating direction method (iADM).
arXiv Detail & Related papers (2023-08-31T12:22:40Z) - Cooperative Multi-Agent Reinforcement Learning: Asynchronous
Communication and Linear Function Approximation [77.09836892653176]
We study multi-agent reinforcement learning in the setting of episodic Markov decision processes.
We propose a provably efficient algorithm based on value iteration that enables asynchronous communication.
We show that a minimal $\Omega(dM)$ communication complexity is required to improve the performance through collaboration.
arXiv Detail & Related papers (2023-05-10T20:29:29Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Exponential Graph is Provably Efficient for Decentralized Deep Training [30.817705471352614]
We study so-called exponential graphs, where every node is connected to $O(\log(n))$ neighbors and $n$ is the total number of nodes.
This work proves such graphs can lead to both fast communication and effective averaging simultaneously.
We also discover that a sequence of $\log(n)$ one-peer exponential graphs, in which each node communicates with a single neighbor per iteration, can together achieve exact averaging (a small numerical check of this property is sketched after this list).
arXiv Detail & Related papers (2021-10-26T02:33:39Z) - Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\log(1/\varepsilon)\big)$ communication steps.
This rate matches, for the first time and up to poly-log factors, the lower communication complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z) - Distributed Saddle-Point Problems Under Similarity [173.19083235638104]
We show that a given suboptimality $\epsilon > 0$ is achieved over master/workers networks in $\Omega\big(\Delta\cdot\frac{\delta}{\mu}\log(1/\epsilon)\big)$ rounds of communication.
We then propose algorithms matching the lower bounds over either type of network (up to log factors).
We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.
arXiv Detail & Related papers (2021-07-22T14:25:16Z) - Communication-efficient Decentralized Local SGD over Undirected Networks [2.3572498744567123]
We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$.
We analyze the trade-off between the number of communication rounds and the computational effort of each agent.
Our results show that by using only $R = \Omega(n)$ communication rounds, one can achieve an error that scales as $O(1/(nT))$.
arXiv Detail & Related papers (2020-11-06T09:34:00Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge
Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
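Following up on the one-peer exponential-graph entry above, the sketch below numerically checks the exact-averaging claim for the common hypercube-style construction: when $n$ is a power of two and node $i$ averages with node $(i + 2^k) \bmod n$ at step $k$, cycling through $k = 0, \dots, \log_2(n) - 1$ yields the exact network-wide mean at every node. The indexing and the equal $1/2$ weights are assumptions matching the standard construction, not necessarily the cited paper's notation.

```python
# Hedged numerical check: a sequence of log2(n) one-peer exponential graphs
# achieves exact averaging when n is a power of two.
import numpy as np

def one_peer_exponential_round(x, k):
    """One gossip round on the k-th one-peer exponential graph:
    node i averages its value with node (i + 2**k) mod n."""
    n = len(x)
    partner = (np.arange(n) + 2 ** k) % n   # each node talks to exactly one peer
    return 0.5 * (x + x[partner])

n = 16                                       # power of two, so log2(n) rounds suffice
x = np.random.default_rng(1).standard_normal(n)
target = x.mean()
for k in range(int(np.log2(n))):             # rounds k = 0, 1, ..., log2(n) - 1
    x = one_peer_exponential_round(x, k)
print(np.allclose(x, target))                # True: every node holds the exact average
```

Each round costs a single peer exchange per node, so the schedule reaches exact averaging with only $\log_2(n)$ one-peer communications per node.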
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.