DESTRESS: Computation-Optimal and Communication-Efficient Decentralized
Nonconvex Finite-Sum Optimization
- URL: http://arxiv.org/abs/2110.01165v1
- Date: Mon, 4 Oct 2021 03:17:41 GMT
- Title: DESTRESS: Computation-Optimal and Communication-Efficient Decentralized
Nonconvex Finite-Sum Optimization
- Authors: Boyue Li, Zhize Li, Yuejie Chi
- Abstract summary: Internet-of-things, networked sensing, autonomous systems and federated learning call for decentralized algorithms for finite-sum optimizations.
We develop the DEcentralized STochastic REcurSive gradient methodS (DESTRESS) for nonconvex finite-sum optimization.
Detailed theoretical and numerical comparisons show that DESTRESS improves upon prior decentralized algorithms.
- Score: 43.31016937305845
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Emerging applications in multi-agent environments such as internet-of-things,
networked sensing, autonomous systems and federated learning, call for
decentralized algorithms for finite-sum optimizations that are
resource-efficient in terms of both computation and communication. In this
paper, we consider the prototypical setting where the agents work
collaboratively to minimize the sum of local loss functions by only
communicating with their neighbors over a predetermined network topology. We
develop a new algorithm, called DEcentralized STochastic REcurSive gradient
methodS (DESTRESS) for nonconvex finite-sum optimization, which matches the
optimal incremental first-order oracle (IFO) complexity of centralized
algorithms for finding first-order stationary points, while maintaining
communication efficiency. Detailed theoretical and numerical comparisons
corroborate that the resource efficiencies of DESTRESS improve upon prior
decentralized algorithms over a wide range of parameter regimes. DESTRESS
leverages several key algorithm design ideas including stochastic recursive
gradient updates with mini-batches for local computation, gradient tracking
with extra mixing (i.e., multiple gossiping rounds) for per-iteration
communication, together with careful choices of hyper-parameters and new
analysis frameworks to provably achieve a desirable computation-communication
trade-off.
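For intuition, the following is a minimal NumPy sketch, not the authors' reference implementation, of how the three ingredients named in the abstract fit together: SARAH-style stochastic recursive gradient updates with mini-batches for local computation, a gradient-tracking variable for the average gradient, and extra mixing via repeated gossip with the network's mixing matrix. The interface grad_fn(i, x, idx), the mixing matrix W, and all step sizes, batch sizes, and gossip counts are illustrative assumptions rather than the paper's tuned hyper-parameters.
```python
import numpy as np

def gossip(X, W, rounds):
    """Mix rows of X over the network by applying the mixing matrix W several times."""
    for _ in range(rounds):
        X = W @ X
    return X

def destress_like(grad_fn, x0, n_agents, n_samples, W,
                  outer_iters=20, inner_iters=10, batch=8, eta=0.05, K=3, seed=0):
    """Hypothetical DESTRESS-style loop; grad_fn(i, x, idx) returns the mini-batch
    gradient of agent i's local loss at x over the local sample indices idx."""
    rng = np.random.default_rng(seed)
    X = np.tile(x0, (n_agents, 1))          # one local iterate per agent (rows)

    def full_grads(Z):
        return np.stack([grad_fn(i, Z[i], np.arange(n_samples)) for i in range(n_agents)])

    V = full_grads(X)                       # recursive gradient estimators
    S = V.copy()                            # gradient-tracking variables
    for _ in range(outer_iters):
        for _ in range(inner_iters):
            X_new = gossip(X - eta * S, W, K)       # local descent step + extra mixing
            V_new = np.empty_like(V)
            for i in range(n_agents):               # SARAH-style recursive update
                idx = rng.choice(n_samples, size=batch, replace=False)
                V_new[i] = V[i] + grad_fn(i, X_new[i], idx) - grad_fn(i, X[i], idx)
            S = gossip(S + V_new - V, W, K)         # track the network-average gradient
            X, V = X_new, V_new
        V_new = full_grads(X)                       # periodic full-gradient refresh
        S = gossip(S + V_new - V, W, K)
        V = V_new
    return X.mean(axis=0)                           # approximate consensus solution
```
In this sketch the extra gossip rounds K trade additional per-iteration communication for faster consensus, which is the computation-communication trade-off the abstract refers to.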
Related papers
- Lower Bounds and Optimal Algorithms for Non-Smooth Convex Decentralized Optimization over Time-Varying Networks [57.24087627267086]
We consider the task of minimizing the sum of convex functions stored in a decentralized manner across the nodes of a communication network.
Lower bounds on the number of decentralized communications and (sub)gradient computations required to solve the problem have been established.
We develop the first optimal algorithm that matches these lower bounds and offers substantially improved theoretical performance compared to the existing state of the art.
arXiv Detail & Related papers (2024-05-28T10:28:45Z) - Federated Conditional Stochastic Optimization [110.513884892319]
Conditional stochastic optimization has found applications in a wide range of machine learning tasks, such as invariant learning, AUPRC maximization, and MAML.
This paper proposes conditional stochastic optimization algorithms for the federated learning setting.
arXiv Detail & Related papers (2023-10-04T01:47:37Z) - Q-SHED: Distributed Optimization at the Edge via Hessian Eigenvectors
Quantization [5.404315085380945]
Newton-type (NT) methods have been advocated as enablers of robust convergence rates in distributed optimization (DO) problems.
We propose Q-SHED, an original NT algorithm for DO featuring a novel bit-allocation scheme based on incremental Hessian eigenvectors quantization.
We show that Q-SHED can reduce by up to 60% the number of communication rounds required for convergence.
arXiv Detail & Related papers (2023-05-18T10:15:03Z) - Optimal Complexity in Non-Convex Decentralized Learning over
Time-Varying Networks [8.860889476382594]
Decentralized optimization with time-varying networks is an emerging paradigm in machine learning.
It saves substantial communication overhead in large-scale deep training and is more robust in wireless scenarios, especially when nodes are moving.
arXiv Detail & Related papers (2022-11-01T15:37:54Z) - Asynchronous Distributed Reinforcement Learning for LQR Control via Zeroth-Order Block Coordinate Descent [7.6860514640178]
We propose a novel zeroth-order optimization algorithm for distributed reinforcement learning.
It allows each agent to independently estimate its local gradient from cost evaluations, without using any consensus protocol.
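A hedged sketch of the kind of two-point zeroth-order estimator such a scheme can use for one agent's parameter block, assuming only that the agent can evaluate a scalar cost; `block` is assumed to be a Python slice selecting the agent's coordinates, and the smoothing radius `mu` and number of random directions are illustrative choices rather than the paper's exact construction.
```python
import numpy as np

def zo_block_gradient(cost, theta, block, mu=1e-2, n_dirs=20, seed=0):
    """Estimate the gradient of cost(theta) w.r.t. theta[block] from cost evaluations
    only, perturbing just this agent's block (two-point random-direction estimator)."""
    rng = np.random.default_rng(seed)
    d_block = block.stop - block.start
    g = np.zeros(d_block)
    for _ in range(n_dirs):
        u = rng.standard_normal(d_block)
        u /= np.linalg.norm(u)                    # random unit direction in the local block
        theta_plus, theta_minus = theta.copy(), theta.copy()
        theta_plus[block] += mu * u
        theta_minus[block] -= mu * u
        g += d_block * (cost(theta_plus) - cost(theta_minus)) / (2.0 * mu) * u
    return g / n_dirs

# Example usage (hypothetical): gradient of the first 4 coordinates of a quadratic cost.
# g = zo_block_gradient(lambda t: float(t @ t), np.ones(10), slice(0, 4))
```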
arXiv Detail & Related papers (2021-07-26T18:11:07Z) - Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex
Decentralized Optimization Over Time-Varying Networks [79.16773494166644]
We consider the task of minimizing the sum of smooth and strongly convex functions stored in a decentralized manner across the nodes of a communication network.
We design two optimal algorithms that attain these lower bounds.
We corroborate the theoretical efficiency of these algorithms by performing an experimental comparison with existing state-of-the-art methods.
arXiv Detail & Related papers (2021-06-08T15:54:44Z) - Resource Allocation via Model-Free Deep Learning in Free Space Optical
Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds than a naive parallel baseline while retaining its theoretical guarantees.
Experiments on several benchmark datasets demonstrate the effectiveness of our method and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)