Acceleration in Distributed Optimization Under Similarity
- URL: http://arxiv.org/abs/2110.12347v1
- Date: Sun, 24 Oct 2021 04:03:00 GMT
- Title: Acceleration in Distributed Optimization Under Similarity
- Authors: Ye Tian, Gesualdo Scutari, Tianyu Cao, Alexander Gasnikov
- Abstract summary: We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\log\frac{1}{\varepsilon}\big)$ communication steps.
This rate matches, for the first time (up to poly-log factors), the lower communication complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
- Score: 72.54787082152278
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study distributed (strongly convex) optimization problems over a network
of agents, with no centralized nodes. The loss functions of the agents are
assumed to be similar, due to statistical data similarity or otherwise. In
order to reduce the number of communications to reach a solution accuracy, we
propose a preconditioned, accelerated distributed method. An
$\varepsilon$-solution is achieved in
$\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{(1-\rho)}}\log1/\varepsilon\big)$
number of communication steps, where $\beta/\mu$ is the relative condition
number between the global and local loss functions, and $\rho$ characterizes
the connectivity of the network. This rate matches, for the first time (up to
poly-log factors), the lower communication complexity bounds of distributed
gossip algorithms applied to the class of problems of interest. Numerical
results show significant communication savings with respect to existing
accelerated distributed schemes, especially when solving ill-conditioned
problems.
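To make the communication pattern concrete, the sketch below gives one plausible, heavily simplified reading of an accelerated gossip scheme of this kind: each agent mixes its extrapolated iterate with its neighbors through a doubly stochastic matrix $W$ (whose mixing quality is what $\rho$ measures), takes a local step, and applies Nesterov-style momentum with a weight driven by the relative condition number $\beta/\mu$. Everything here is an illustrative assumption rather than the authors' algorithm: the function names, the Metropolis weights, the step sizes, and in particular the plain $1/\beta$ gradient step, which stands in for the paper's preconditioned local subproblem solve (the component that actually exploits the similarity of the local losses).

```python
import numpy as np

def metropolis_weights(adjacency):
    """Doubly stochastic gossip matrix W for an undirected graph (Metropolis rule)."""
    n = adjacency.shape[0]
    deg = adjacency.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adjacency[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

def accelerated_gossip_sketch(local_grads, W, x0, mu, beta, num_iters):
    """
    Illustrative sketch only: local_grads[i] is a gradient oracle of agent i's loss,
    W is a doubly stochastic mixing matrix, and beta/mu plays the role of the
    relative condition number from the abstract.
    """
    n = len(local_grads)
    x = np.tile(np.asarray(x0, dtype=float), (n, 1))         # one iterate per agent
    y = x.copy()                                              # extrapolation sequence
    kappa = beta / mu                                         # relative condition number
    theta = (np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)   # Nesterov momentum weight
    for _ in range(num_iters):
        y = W @ y                                             # one gossip round with neighbors
        g = np.stack([local_grads[i](y[i]) for i in range(n)])
        x_new = y - (1.0 / beta) * g                          # simplified local step
        y = x_new + theta * (x_new - x)                       # acceleration
        x = x_new
    return x.mean(axis=0)
```

The abstract's $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\log\frac{1}{\varepsilon}\big)$ rate indicates that poor connectivity (small $1-\rho$) is handled more carefully than by the single plain gossip round above, for example through several (or accelerated) gossip rounds per outer iteration; the sketch is only meant to convey the overall structure.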
Related papers
- Lower Bounds and Optimal Algorithms for Non-Smooth Convex Decentralized Optimization over Time-Varying Networks [57.24087627267086]
We consider the task of minimizing the sum of convex functions stored in a decentralized manner across the nodes of a communication network.
Lower bounds on the number of decentralized communications and (sub)gradient computations required to solve the problem have been established.
We develop the first optimal algorithm that matches these lower bounds and offers substantially improved theoretical performance compared to the existing state of the art.
arXiv Detail & Related papers (2024-05-28T10:28:45Z) - Rate Analysis of Coupled Distributed Stochastic Approximation for Misspecified Optimization [0.552480439325792]
We consider an $n$-agent distributed optimization problem with imperfect information characterized in a parametric sense.
We propose a coupled distributed stochastic approximation algorithm, in which every agent updates its current beliefs about the unknown parameter.
We quantitatively characterize the factors that affect the algorithm's performance, and prove that the mean-squared error of the decision variable is bounded by $\mathcal{O}\big(\frac{1}{nk}\big)+\mathcal{O}\left(\frac{1}{\sqrt{n}(1-\rho_w)}\right)\frac{1}{k^{1.5}}$.
arXiv Detail & Related papers (2024-04-21T14:18:49Z) - DASA: Delay-Adaptive Multi-Agent Stochastic Approximation [64.32538247395627]
We consider a setting in which $N$ agents aim to speed up a common stochastic approximation problem by acting in parallel and communicating with a central server.
To mitigate the effect of delays and stragglers, we propose DASA, a delay-adaptive algorithm for multi-agent stochastic approximation.
arXiv Detail & Related papers (2024-03-25T22:49:56Z) - A Proximal Gradient Method With Probabilistic Multi-Gossip Communications for Decentralized Composite Optimization [36.777745196161035]
We propose a communication-efficient method MG-Skip for decentralized composite (smooth + nonsmooth) optimization.
We show that MG-Skip achieves the optimal communication complexity and confirm the benefits of local updates in the nonsmooth setup.
arXiv Detail & Related papers (2023-12-19T05:13:16Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) that utilize sketching.
Our new algorithm achieves a fast convergence rate of $O\big(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T}\big)$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - Distributed gradient-based optimization in the presence of dependent
aperiodic communication [4.34720256795424]
Iterative distributed optimization algorithms involve multiple agents that communicate with each other, over time, in order to minimize/maximize a global objective.
In the presence of unreliable communication networks, the Age-of-Information (AoI), which measures the freshness of data received, may be large and hence hinder algorithmic convergence.
We show that convergence is guaranteed provided the random variables associated with the AoI processes are stochastically dominated by a random variable with finite first moment.
arXiv Detail & Related papers (2022-01-27T06:44:04Z) - High-Dimensional Inference over Networks: Linear Convergence and
Statistical Guarantees [20.701475313495884]
We study sparse linear regression over a network of agents, modeled as an undirected graph with no server node.
We analyze the convergence rate and statistical guarantees of a distributed projected gradient tracking-based algorithm.
arXiv Detail & Related papers (2022-01-21T01:26:08Z) - On Accelerating Distributed Convex Optimizations [0.0]
This paper studies a distributed multi-agent convex optimization problem.
We show that the proposed algorithm converges linearly, with an improved rate of convergence compared to traditional and adaptive gradient-descent methods.
We demonstrate our algorithm's superior performance compared to prominent distributed algorithms for solving real logistic regression problems.
arXiv Detail & Related papers (2021-08-19T13:19:54Z) - Distributed Saddle-Point Problems Under Similarity [173.19083235638104]
We establish lower bounds on the number of communication rounds required to reach a given suboptimality $\epsilon > 0$ over master/workers networks.
We then propose algorithms matching the lower bounds on either type of network considered (up to log factors).
We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.
arXiv Detail & Related papers (2021-07-22T14:25:16Z) - A Linearly Convergent Algorithm for Decentralized Optimization: Sending
Less Bits for Free! [72.31332210635524]
Decentralized optimization methods enable on-device training of machine learning models without a central coordinator.
We propose a new randomized first-order method that tackles the communication bottleneck by applying randomized compression operators to the transmitted vectors (a minimal sketch of one such operator is given after this list).
We prove that our method solves the problem without any increase in the number of communication rounds compared to the uncompressed baseline.
arXiv Detail & Related papers (2020-11-03T13:35:53Z)
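The last entry above relies on randomized compression of the transmitted vectors. As a generic illustration (not the paper's specific operator or API), the following is a minimal sketch of an unbiased rand-$k$ sparsifier, a standard example of such a compression operator; the function name and interface are assumptions.

```python
import numpy as np

def rand_k_compress(v, k, rng):
    """
    Unbiased rand-k sparsifier: keep k uniformly chosen coordinates of v, rescale
    them by d/k so that E[C(v)] = v, and zero out the rest. Only the k (index, value)
    pairs need to be transmitted instead of all d coordinates.
    """
    d = v.shape[0]
    out = np.zeros_like(v)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = v[idx] * (d / k)
    return out

# Toy unbiasedness check (illustrative only): the empirical mean approaches v.
rng = np.random.default_rng(0)
v = rng.normal(size=8)
estimate = np.mean([rand_k_compress(v, 2, rng) for _ in range(50_000)], axis=0)
print(np.max(np.abs(estimate - v)))  # small when averaged over many draws
```

In a decentralized loop each agent would apply such an operator to the vectors it sends to its neighbors; the cited paper's claim is that this can be done without increasing the number of communication rounds relative to the uncompressed baseline.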