A Proximal Gradient Method With Probabilistic Multi-Gossip Communications for Decentralized Composite Optimization
- URL: http://arxiv.org/abs/2312.11861v2
- Date: Sun, 27 Oct 2024 07:37:14 GMT
- Title: A Proximal Gradient Method With Probabilistic Multi-Gossip Communications for Decentralized Composite Optimization
- Authors: Luyao Guo, Luqing Wang, Xinli Shi, Jinde Cao
- Abstract summary: We propose a communication-efficient method MG-Skip for decentralized composite (smooth + nonsmooth) optimization.
We show that MG-Skip achieves the optimal communication complexity and confirm the benefits of local updates in the nonsmooth setup.
- Abstract: Decentralized optimization methods with local updates have recently gained attention for their provable ability to accelerate communication. In these methods, nodes perform several iterations of local computation between communication rounds. Nevertheless, this capability is effective only when the loss function is smooth and the network is sufficiently well-connected. In this paper, we propose a communication-efficient method MG-Skip with probabilistic local updates and multi-gossip communications for decentralized composite (smooth + nonsmooth) optimization, whose stepsize is independent of the number of local updates and the network topology. Without any additional condition on network connectivity, MG-Skip allows the multi-gossip communications to be skipped in most iterations in the strongly convex setting, while its iteration complexity is $\mathcal{O}\left(\kappa \log \frac{1}{\epsilon}\right)$ and its communication complexity is only $\mathcal{O}\left(\sqrt{\frac{\kappa}{1-\rho}} \log \frac{1}{\epsilon}\right)$, where $\kappa$ is the condition number of the loss function, $\rho$ reflects the connectivity of the network topology, and $\epsilon$ is the target accuracy. The theoretical results demonstrate that MG-Skip achieves the optimal communication complexity and confirm the benefits of local updates in the nonsmooth setup.
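The abstract describes the mechanism at a high level: every node runs cheap local proximal-gradient steps at each iteration, and only occasionally, with a fixed probability, does the network pay for a round of multiple gossip communications. The snippet below is a minimal schematic of that loop reconstructed from the abstract alone; all names (`mg_skip_sketch`, `grad_f`, `n_gossip`, `p`) are illustrative, the nonsmooth term is taken to be an $\ell_1$ penalty for concreteness, and the correction terms that give MG-Skip its topology-independent stepsize are omitted.

```python
# Minimal schematic of a probabilistic multi-gossip proximal-gradient loop,
# reconstructed from the abstract; NOT the paper's exact MG-Skip algorithm.
import numpy as np

def soft_threshold(x, lam):
    """Prox of lam * ||x||_1, one common choice of nonsmooth term."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def mg_skip_sketch(X, grad_f, W, stepsize, lam, p=0.1, n_gossip=5,
                   n_iters=1000, seed=0):
    """
    X      : (n_nodes, dim) array; row i is node i's local iterate.
    grad_f : callable (n_nodes, dim) -> (n_nodes, dim), local smooth gradients.
    W      : (n_nodes, n_nodes) doubly stochastic gossip (mixing) matrix.
    """
    rng = np.random.default_rng(seed)
    for _ in range(n_iters):
        # Local computation happens on every node at every iteration.
        X = soft_threshold(X - stepsize * grad_f(X), stepsize * lam)
        # Communication is probabilistic: with probability p, run a round
        # of n_gossip consecutive gossip (mixing) steps; otherwise skip it.
        if rng.random() < p:
            for _ in range(n_gossip):
                X = W @ X
    return X
```

Even in this caricature the skipping structure is visible: the expected number of gossip steps per iteration is only $p \cdot n_{\text{gossip}}$, so communication is decoupled from the iteration count.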
Related papers
- Cooperative Multi-Agent Reinforcement Learning: Asynchronous Communication and Linear Function Approximation [77.09836892653176]
We study multi-agent reinforcement learning in the setting of episodic Markov decision processes.
We propose a provably efficient value-based algorithm that enables asynchronous communication.
We show that a minimal $\Omega(dM)$ communication complexity is required to improve the performance through collaboration.
arXiv Detail & Related papers (2023-05-10T20:29:29Z)
- Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O\left(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T}\right)$ with a communication cost of $O(k \log(d))$ at each iteration.
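For intuition on the "sketching" in this entry, the snippet below shows a generic count-sketch compressor of the kind such methods build on: each worker hashes its $d$-dimensional gradient into $k \ll d$ signed buckets and transmits only the $k$ values. This is a standard primitive, not the paper's code; SketchedAMSGrad combines such compression with AMSGrad-style updates, which are not shown here.

```python
# Generic count-sketch gradient compression (illustrative primitive only).
import numpy as np

def make_count_sketch(d, k, seed=0):
    """Build compress/decompress maps sending R^d -> R^k and back (k << d)."""
    rng = np.random.default_rng(seed)
    h = rng.integers(0, k, size=d)        # bucket for each coordinate
    s = rng.choice([-1.0, 1.0], size=d)   # random sign for each coordinate

    def compress(g):
        sk = np.zeros(k)
        np.add.at(sk, h, s * g)           # signed sums per bucket: k values sent
        return sk

    def decompress(sk):
        return s * sk[h]                  # unbiased estimate of g

    return compress, decompress

compress, decompress = make_count_sketch(d=10_000, k=500)
g = np.random.default_rng(1).normal(size=10_000)
g_hat = decompress(compress(g))           # noisy but unbiased reconstruction of g
```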
arXiv Detail & Related papers (2022-10-14T01:42:05Z)
- DADAO: Decoupled Accelerated Decentralized Asynchronous Optimization [0.0]
DADAO is the first decentralized, accelerated, asynchronous, primal, first-order algorithm to minimize a sum of $L$-smooth and $\mu$-strongly convex functions distributed over a given network of size $n$.
We show that the algorithm requires $\mathcal{O}\left(n\sqrt{\frac{L}{\mu}}\log\frac{1}{\epsilon}\right)$ local gradients and only $\mathcal{O}\left(n\sqrt{\chi}\sqrt{\frac{L}{\mu}}\log\frac{1}{\epsilon}\right)$ communications to reach a precision $\epsilon$.
arXiv Detail & Related papers (2022-07-26T08:47:54Z)
- Pick your Neighbor: Local Gauss-Southwell Rule for Fast Asynchronous Decentralized Optimization [37.85806148420504]
In decentralized optimization environments, each agent $i$ in a network of $n$ optimization nodes possesses a private function $f_i$.
We introduce an optimization-aware selection rule that chooses the neighbor with the highest dual cost improvement.
We show that the proposed set-wise GS rule achieves a speedup by a factor of up to the maximum degree in the network.
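A hedged sketch of the selection rule described in this entry: instead of waking a uniformly random neighbor, node $i$ activates the neighbor promising the largest improvement. Using the squared iterate disagreement $\|x_i - x_j\|^2$ as the score is our simplification for illustration; the paper states its rule in terms of dual cost improvement.

```python
# Illustrative set-wise neighbor selection: pick the "best" neighbor, not a
# random one. The disagreement score is a stand-in for the dual criterion.
import numpy as np

def pick_neighbor_gs(i, X, neighbors):
    """Return the neighbor of node i with the largest disagreement ||x_i - x_j||^2."""
    js = neighbors[i]
    scores = [float(np.sum((X[i] - X[j]) ** 2)) for j in js]
    return js[int(np.argmax(scores))]

def pairwise_average(i, j, X):
    """One asynchronous gossip update on the activated edge (i, j)."""
    X[i] = X[j] = 0.5 * (X[i] + X[j])
```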
arXiv Detail & Related papers (2022-07-15T15:37:03Z)
- Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\left(\sqrt{\frac{\beta/\mu}{1-\rho}}\log\frac{1}{\varepsilon}\right)$ communication steps.
This rate matches (up to poly-log factors), for the first time, the lower communication complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z)
- Distributed Saddle-Point Problems Under Similarity [173.19083235638104]
We establish lower bounds on the number of communication rounds needed to reach a given suboptimality $\epsilon>0$ over master/workers networks.
We then propose algorithms matching the lower bounds for either type of network (up to logarithmic factors).
We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.
arXiv Detail & Related papers (2021-07-22T14:25:16Z)
- Communication-efficient SGD: From Local SGD to One-Shot Averaging [16.00658606157781]
We consider speeding up stochastic gradient descent (SGD) by parallelizing it across multiple workers.
We suggest a Local SGD scheme that communicates less overall by communicating less frequently as the number of iterations grows.
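To make the schedule concrete, here is one way such a scheme can look: workers take independent SGD steps and average only at sync times whose gaps grow over time, ending with a one-shot average. The quadratically growing sync times below are an illustrative choice consistent with the summary, not a quote of the paper's exact schedule.

```python
# Illustrative Local SGD with growing communication gaps and a final one-shot
# average; the quadratic sync schedule is an assumption, not the paper's own.
import numpy as np

def local_sgd_growing_gaps(X, stoch_grad, lr, n_iters, seed=0):
    """X: (n_workers, dim); stoch_grad(X, rng) -> per-worker stochastic gradients."""
    rng = np.random.default_rng(seed)
    next_sync, gap = 1, 1
    for t in range(1, n_iters + 1):
        X = X - lr * stoch_grad(X, rng)   # independent local steps, no communication
        if t == next_sync:                # exact averaging at sparse sync times
            X[:] = X.mean(axis=0)
            gap += 1                      # gaps grow linearly, so sync times grow
            next_sync += gap              # quadratically: communication gets rarer
    return X.mean(axis=0)                 # one-shot average at the end
```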
arXiv Detail & Related papers (2021-06-09T01:10:34Z)
- The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication [61.60069865891783]
We resolve the min-max complexity of distributed stochastic convex optimization (up to a log factor) in the intermittent communication setting.
We present a novel lower bound with a matching upper bound that establishes an optimal algorithm.
arXiv Detail & Related papers (2021-02-02T16:18:29Z)
- On the Benefits of Multiple Gossip Steps in Communication-Constrained Decentralized Optimization [29.42301299741866]
We show that having $O(\log\frac{1}{\epsilon})$ gradient iterations with constant step size, and $O(\log\frac{1}{\epsilon})$ gossip steps between every pair of these iterations, enables convergence to within $\epsilon$ of the optimal value for smooth non-convex objectives.
To our knowledge, this is the first work that derives convergence results for non-convex optimization under compressed communication.
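A minimal sketch of the recipe in this entry: run $O(\log\frac{1}{\epsilon})$ gossip rounds between consecutive gradient steps so the network mixes almost completely before each update. The compression/quantization the paper also handles is omitted, and all names are illustrative.

```python
# Minimal "multiple gossip steps per gradient update" loop; compression omitted.
import numpy as np

def multi_gossip_gd(X, grad, W, stepsize, eps, n_iters):
    """Run n_gossip = O(log(1/eps)) mixing rounds after every gradient step."""
    n_gossip = max(1, int(np.ceil(np.log(1.0 / eps))))
    for _ in range(n_iters):
        X = X - stepsize * grad(X)        # one local gradient step per node
        for _ in range(n_gossip):         # repeated mixing: the network behaves
            X = W @ X                     # nearly like a centralized averager
    return X
```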
arXiv Detail & Related papers (2020-11-20T21:17:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.