Related papers: Exploiting Similarity for Computation and Communication-Efficient Decentralized Optimization

Exploiting Similarity for Computation and Communication-Efficient Decentralized Optimization

URL: http://arxiv.org/abs/2506.05791v1
Date: Fri, 06 Jun 2025 06:37:10 GMT
Title: Exploiting Similarity for Computation and Communication-Efficient Decentralized Optimization
Authors: Yuki Takezawa, Xiaowen Jiang, Anton Rodomanov, Sebastian U. Stich,
Abstract summary: We propose the Stabilized Proximal Decentralized Optimization (SPDO) method, which achieves state-of-the-art communication and computational complexities.<n> Experimental results demonstrate that SPDO significantly outperforms existing methods.
Score: 19.81438528494734
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reducing communication complexity is critical for efficient decentralized optimization. The proximal decentralized optimization (PDO) framework is particularly appealing, as methods within this framework can exploit functional similarity among nodes to reduce communication rounds. Specifically, when local functions at different nodes are similar, these methods achieve faster convergence with fewer communication steps. However, existing PDO methods often require highly accurate solutions to subproblems associated with the proximal operator, resulting in significant computational overhead. In this work, we propose the Stabilized Proximal Decentralized Optimization (SPDO) method, which achieves state-of-the-art communication and computational complexities within the PDO framework. Additionally, we refine the analysis of existing PDO methods by relaxing subproblem accuracy requirements and leveraging average functional similarity. Experimental results demonstrate that SPDO significantly outperforms existing methods.

Related papers

Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an acceleration Decentralized Federated Learning algorithm called DFedCata. DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase. Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
arXiv Detail & Related papers (2024-10-09T06:17:16Z)
Stabilized Proximal-Point Methods for Federated Optimization [20.30761752651984]
Best-known communication complexity among non-accelerated algorithms is achieved by DANE, a distributed proximal-point algorithm. Inspired by the hybrid-projection proximal-point method, we propose a novel distributed algorithm S-DANE. We show that S-DANE achieves the best-known communication complexity while still enjoying good local computation efficiency as S-DANE.
arXiv Detail & Related papers (2024-07-09T17:56:29Z)
Lower Bounds and Optimal Algorithms for Non-Smooth Convex Decentralized Optimization over Time-Varying Networks [57.24087627267086]
We consider the task of minimizing the sum of convex functions stored in a decentralized manner across the nodes of a communication network. Lower bounds on the number of decentralized communications and (sub)gradient computations required to solve the problem have been established. We develop the first optimal algorithm that matches these lower bounds and offers substantially improved theoretical performance compared to the existing state of the art.
arXiv Detail & Related papers (2024-05-28T10:28:45Z)
Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are one of the promising approaches to reduce communication time. We show that the communication complexity is better than non-local methods when the local datasets is smaller than the smoothness local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z)
Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes. An $varepsilon$-solution is achieved in $tildemathcalrhoObig(sqrtfracbeta/mu (1-)log1/varepsilonbig)$ number of communications steps. This rate matches (up to poly-log factors) for the first time lower complexity communication bounds of distributed gossip-algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z)
DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization [43.31016937305845]
Internet-of-things, networked sensing, autonomous systems and federated learning call for decentralized algorithms for finite-sum optimizations. We develop DEcentralized STochastic REcurSive methodDESTRESS for non finite-sum optimization. Detailed theoretical and numerical comparisons show that DESTRESS improves upon prior decentralized algorithms.
arXiv Detail & Related papers (2021-10-04T03:17:41Z)
Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-robust optimization in the presence of sparsity constraints. We develop novel and simple optimization formulations for these problems. As a corollary, we obtain that any first-order method that efficiently converges to station yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z)
The Minimax Complexity of Distributed Optimization [0.0]
I present the "graph oracle model", an extension of the classic oracle framework that can be applied to distributed optimization. I focus on the specific case of the "intermittent communication setting" I analyze the theoretical properties of the popular Local Descent (SGD) algorithm in convex setting.
arXiv Detail & Related papers (2021-09-01T15:18:33Z)
Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network. We design two optimal algorithms that attain these lower bounds. We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z)
Distributed Optimization, Averaging via ADMM, and Network Topology [0.0]
We study the connection between network topology and convergence rates for different algorithms on a real world problem of sensor localization. We also show interesting connections between ADMM and lifted Markov chains besides providing an explicitly characterization of its convergence.
arXiv Detail & Related papers (2020-09-05T21:44:39Z)
FedSplit: An algorithmic framework for fast federated optimization [40.42352500741025]
We introduce FedSplit, a class of algorithms for solving distributed convex minimization with additive structure. Our theory shows that these methods are provably robust to inexact computation of intermediate local quantities.
arXiv Detail & Related papers (2020-05-11T16:30:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.