Pick your Neighbor: Local Gauss-Southwell Rule for Fast Asynchronous
Decentralized Optimization
- URL: http://arxiv.org/abs/2207.07543v1
- Date: Fri, 15 Jul 2022 15:37:03 GMT
- Title: Pick your Neighbor: Local Gauss-Southwell Rule for Fast Asynchronous
Decentralized Optimization
- Authors: Marina Costantini, Nikolaos Liakopoulos, Panayotis Mertikopoulos,
Thrasyvoulos Spyropoulos
- Abstract summary: In decentralized optimization environments, each agent $i$ in a network of $n$ optimization nodes possesses a private function $f_i$.
We introduce an optimization-aware selection rule that chooses the neighbor with the highest dual cost improvement.
We show that the proposed set-wise GS rule achieves a speedup by a factor of up to the maximum degree in the network.
- Score: 37.85806148420504
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In decentralized optimization environments, each agent $i$ in a network of
$n$ optimization nodes possesses a private function $f_i$, and nodes
communicate with their neighbors to cooperatively minimize the aggregate
objective $\sum_{i=1}^n f_i$. In this setting, synchronizing the nodes' updates
incurs significant communication overhead and computational costs, so much of
the recent literature has focused on the analysis and design of asynchronous
optimization algorithms where agents activate and communicate at arbitrary
times, without requiring a global synchronization enforcer. Nonetheless, in
most of the work on the topic, active nodes select a neighbor to contact based
on a fixed probability (e.g., uniformly at random), a choice that ignores the
optimization landscape at the moment of activation. Instead, in this work we
introduce an optimization-aware selection rule that chooses the neighbor with
the highest dual cost improvement (a quantity related to a consensus-based
dualization of the problem at hand). This scheme is related to the coordinate
descent (CD) method with a Gauss-Southwell (GS) rule for coordinate updates; in
our setting however, only a subset of coordinates is accessible at each
iteration (because each node is constrained to communicate only with its direct
neighbors), so the existing literature on GS methods does not apply. To
overcome this difficulty, we develop a new analytical framework for smooth and
strongly convex $f_i$ that covers the class of set-wise CD algorithms -- a
class that directly applies to decentralized scenarios, but is not limited to
them -- and we show that the proposed set-wise GS rule achieves a speedup by a
factor of up to the maximum degree in the network (which is of the order of
$\Theta(n)$ in highly connected graphs). The speedup predicted by our
theoretical analysis is subsequently validated in numerical experiments with
synthetic data.
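To make the selection rule concrete: under the standard consensus-based dualization referenced above, each edge of the communication graph carries one dual variable,

$$\min_{x_1,\dots,x_n} \sum_{i=1}^n f_i(x_i) \quad \text{s.t.} \quad x_i = x_j \;\; \forall (i,j) \in E,$$

with Lagrangian $\sum_i f_i(x_i) + \sum_{(i,j)\in E} \lambda_{ij}(x_i - x_j)$. Coordinate ascent on the dual then updates one edge at a time, i.e., one pair of neighbors, and the set-wise GS rule picks, among the edges incident to the activated node, the one whose update yields the largest dual improvement. The following is a minimal sketch of this idea, not the authors' code: the scalar quadratic $f_i$, the graph, and all variable names are illustrative assumptions.

```python
# Minimal sketch of set-wise Gauss-Southwell neighbor selection in the dual
# (illustrative only; not the paper's implementation). Local functions are
# scalar quadratics f_i(x) = 0.5 * a[i] * (x - b[i])**2, so the inner primal
# minimization has a closed form and dual coordinate steps can be taken exactly.
import numpy as np

rng = np.random.default_rng(0)
n = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5), (1, 4)]  # assumed graph
a = rng.uniform(1.0, 3.0, size=n)   # curvature of each f_i (smooth, strongly convex)
b = rng.normal(size=n)              # minimizer of each f_i
lam = {e: 0.0 for e in edges}       # one dual variable per constraint x_i = x_j

def primal(i):
    # argmin_x of f_i(x) plus the signed sum of incident dual terms
    # (closed form for quadratics)
    s = sum(lam[e] for e in edges if e[0] == i) - sum(lam[e] for e in edges if e[1] == i)
    return b[i] - s / a[i]

x = np.array([primal(i) for i in range(n)])

def gain(e):
    # Guaranteed dual improvement of an exact step along edge e: the dual
    # gradient along (u, v) is x_u - x_v and its coordinate smoothness is
    # 1/a_u + 1/a_v, so the improvement is g^2 / (2 * L_e); the factor 1/2
    # is common to all edges and can be dropped when comparing.
    u, v = e
    g = x[u] - x[v]
    return g * g / (1.0 / a[u] + 1.0 / a[v])

for _ in range(200):
    i = rng.integers(n)                      # a node activates at a random time
    nbrs = [e for e in edges if i in e]      # dual coordinates this node can access
    e = max(nbrs, key=gain)                  # set-wise GS rule: best incident edge
    u, v = e
    lam[e] += (x[u] - x[v]) / (1.0 / a[u] + 1.0 / a[v])  # exact coordinate ascent
    x[u], x[v] = primal(u), primal(v)        # only the two endpoints recompute

print("consensus spread:", float(x.max() - x.min()))
print("weighted average (true optimum):", float(np.dot(a, b) / a.sum()))
```

For comparison, the uniform-sampling baseline would replace the `max(nbrs, key=gain)` line with a uniform draw from `nbrs`; the speedup of up to the maximum degree proved in the paper is measured against exactly this kind of optimization-agnostic choice.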
Related papers
- Fully First-Order Methods for Decentralized Bilevel Optimization [17.20330936572045]
This paper focuses on decentralized stochastic bilevel optimization (DSBO), where agents communicate only with their neighbors.
We propose Decentralized Stochastic Gradient Descent and Ascent with Gradient Tracking (DSGDA-GT), a novel algorithm that requires only first-order oracles, which are much cheaper than the second-order oracles widely adopted in existing works.
arXiv Detail & Related papers (2024-10-25T06:11:43Z) - Online Optimization Perspective on First-Order and Zero-Order Decentralized Nonsmooth Nonconvex Stochastic Optimization [8.670873561640903]
We investigate the finite-time analysis of finding $(\delta,\epsilon)$-stationary points for nonsmooth nonconvex objectives in decentralized settings.
Ours is the first finite-time guarantee for decentralized nonsmooth optimization.
arXiv Detail & Related papers (2024-06-03T16:09:34Z) - Lower Bounds and Optimal Algorithms for Non-Smooth Convex Decentralized Optimization over Time-Varying Networks [57.24087627267086]
We consider the task of minimizing the sum of convex functions stored in a decentralized manner across the nodes of a communication network.
Lower bounds on the number of decentralized communications and (sub)gradient computations required to solve the problem have been established.
We develop the first optimal algorithm that matches these lower bounds and offers substantially improved theoretical performance compared to the existing state of the art.
arXiv Detail & Related papers (2024-05-28T10:28:45Z) - DASA: Delay-Adaptive Multi-Agent Stochastic Approximation [64.32538247395627]
We consider a setting in which $N$ agents aim to speed up a common stochastic approximation problem by acting in parallel and communicating with a central server.
To mitigate the effect of delays and stragglers, we propose DASA, a delay-adaptive algorithm for multi-agent stochastic approximation.
arXiv Detail & Related papers (2024-03-25T22:49:56Z) - Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization [108.35402316802765]
We propose a new first-order optimization algorithm -- Accelerated Gradient-Optimistic Gradient (AG-OG) Descent Ascent.
We show that AG-OG achieves the optimal convergence rate (up to a constant) for a variety of settings.
We extend our algorithm to the stochastic setting and achieve the optimal convergence rate in both the bi-SC-SC and bi-C-SC settings.
arXiv Detail & Related papers (2022-10-31T17:59:29Z) - The Minimax Complexity of Distributed Optimization [0.0]
I present the "graph oracle model", an extension of the classic oracle framework that can be applied to distributed optimization.
I focus on the specific case of the "intermittent communication setting".
I analyze the theoretical properties of the popular Local Stochastic Gradient Descent (Local SGD) algorithm in the convex setting.
arXiv Detail & Related papers (2021-09-01T15:18:33Z) - Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex
Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We establish lower bounds for this setting and design two optimal algorithms that attain them.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z) - Decentralized Optimization with Heterogeneous Delays: a Continuous-Time
Approach [6.187780920448871]
We propose a novel continuous-time framework to analyze asynchronous algorithms.
We describe a fully asynchronous decentralized algorithm to minimize the sum of smooth and strongly convex functions.
arXiv Detail & Related papers (2021-06-07T13:09:25Z) - A hybrid variance-reduced method for decentralized stochastic non-convex
optimization [15.447966950703947]
The GT-HSGD algorithm has each node implement a specialized local hybrid stochastic gradient estimator and uses gradient tracking to track the global gradient over the network.
GT-HSGD achieves a network topology-independent complexity of $O(n^{-1})$ when the required error tolerance $\epsilon$ is small enough.
arXiv Detail & Related papers (2021-02-12T20:13:05Z) - The Min-Max Complexity of Distributed Stochastic Convex Optimization
with Intermittent Communication [61.60069865891783]
We resolve the min-max complexity of distributed stochastic convex optimization (up to a log factor) in the intermittent communication setting.
We present a novel lower bound with a matching upper bound that establishes an optimal algorithm.
arXiv Detail & Related papers (2021-02-02T16:18:29Z)