Related papers: Lower Bounds and Accelerated Algorithms in Distributed Stochastic Optimization with Communication Compression

URL: http://arxiv.org/abs/2305.07612v1
Date: Fri, 12 May 2023 17:02:43 GMT
Title: Lower Bounds and Accelerated Algorithms in Distributed Stochastic Optimization with Communication Compression
Authors: Yutong He, Xinmeng Huang, Yiming Chen, Wotao Yin, Kun Yuan
Abstract summary: Communication compression is an essential strategy for alleviating communication overhead. We propose NEOLITHIC, a nearly optimal algorithm with compression that achieves the established lower bounds up to logarithmic factors under mild conditions.
Score: 31.107056382542417
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Communication compression is an essential strategy for alleviating communication overhead by reducing the volume of information exchanged between computing nodes in large-scale distributed stochastic optimization. Although numerous algorithms with convergence guarantees have been obtained, the optimal performance limit under communication compression remains unclear. In this paper, we investigate the performance limit of distributed stochastic optimization algorithms employing communication compression. We focus on two main types of compressors, unbiased and contractive, and address the best-possible convergence rates one can obtain with these compressors. We establish the lower bounds for the convergence rates of distributed stochastic optimization in six different settings, combining strongly-convex, generally-convex, or non-convex functions with unbiased or contractive compressor types. To bridge the gap between lower bounds and existing algorithms' rates, we propose NEOLITHIC, a nearly optimal algorithm with compression that achieves the established lower bounds up to logarithmic factors under mild conditions. Extensive experimental results support our theoretical findings. This work provides insights into the theoretical limitations of existing compressors and motivates further research into fundamentally new compressor properties.
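The abstract distinguishes unbiased from contractive compressors without defining them. As a rough illustration only, the sketch below uses two textbook examples: rand-k with rescaling (unbiased, E[C(x)] = x) and top-k (contractive, ||C(x) - x||^2 <= (1 - k/d) ||x||^2). The function names and the test vector are hypothetical and are not taken from the paper, and the definitions are the standard ones, assumed here to match the paper's.

```python
import random


def top_k(x, k):
    """Contractive compressor: keep the k largest-magnitude entries, zero the rest."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


def rand_k(x, k, rng):
    """Unbiased compressor: keep k uniformly random entries, rescaled by d/k."""
    d = len(x)
    keep = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in keep else 0.0 for i, v in enumerate(x)]


def sq_norm(x):
    """Squared Euclidean norm."""
    return sum(v * v for v in x)


# Hypothetical test vector, chosen only for illustration.
x = [3.0, -1.0, 0.5, 2.0]
d, k = len(x), 2

# Contractive property of top-k: ||C(x) - x||^2 <= (1 - k/d) * ||x||^2.
compressed = top_k(x, k)
error = sq_norm([c - v for c, v in zip(compressed, x)])
contractive_ok = error <= (1 - k / d) * sq_norm(x)

# Unbiasedness of rand-k: the empirical mean of many outputs approaches x.
rng = random.Random(0)
n_samples = 20000
mean = [0.0] * d
for _ in range(n_samples):
    for i, v in enumerate(rand_k(x, k, rng)):
        mean[i] += v / n_samples
```

Top-k is deterministic and biased but never increases the error beyond a fixed fraction of the input norm, while rand-k is correct on average at the cost of extra variance. That trade-off is why the two compressor classes are usually analyzed separately.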

Related papers

Accelerated Distributed Optimization with Compression and Error Feedback [22.94016026311574]
ADEF integrates Nesterov acceleration, contractive compression, error feedback, and gradient difference compression. We prove that ADEF achieves the first accelerated convergence rate for distributed optimization with contractive compression.
arXiv Detail & Related papers (2025-03-11T13:40:34Z)
Differential error feedback for communication-efficient decentralized learning [48.924131251745266]
We propose a new decentralized communication-efficient learning approach that blends differential quantization with error feedback. We show that the resulting communication-efficient strategy is stable both in terms of mean-square error and average bit rate. The results establish that, in the small step-size regime and with a finite number of bits, it is possible to attain the performance achievable in the absence of compression.
arXiv Detail & Related papers (2024-06-26T15:11:26Z)
Improving the Worst-Case Bidirectional Communication Complexity for Nonconvex Distributed Optimization under Function Similarity [92.1840862558718]
We introduce MARINA-P, a novel method for downlink compression, employing a collection of correlated compressors. We show that MARINA-P with permutation compressors can achieve a server-to-worker communication complexity improving with the number of workers. We introduce M3, a method combining MARINA-P with uplink compression and a momentum step, achieving bidirectional compression with provable improvements in total communication complexity as the number of workers increases.
arXiv Detail & Related papers (2024-02-09T13:58:33Z)
Learning Accurate Performance Predictors for Ultrafast Automated Model Compression [86.22294249097203]
We propose an ultrafast automated model compression framework called SeerNet for flexible network deployment. Our method achieves competitive accuracy-complexity trade-offs with significant reduction of the search cost.
arXiv Detail & Related papers (2023-04-13T10:52:49Z)
Lower Bounds and Nearly Optimal Algorithms in Distributed Learning with Communication Compression [33.217552987061474]
Recent advances in distributed optimization and learning have shown that communication compression is one of the most effective means of reducing communication.
arXiv Detail & Related papers (2022-06-08T03:36:34Z)
EF-BV: A Unified Theory of Error Feedback and Variance Reduction Mechanisms for Biased and Unbiased Compression in Distributed Optimization [7.691755449724637]
In distributed or federated optimization and learning, communication between the different computing units is often the bottleneck. There are two classes of compression operators and separate algorithms making use of them. We propose a new algorithm, recovering DIANA and EF21 as particular cases.
arXiv Detail & Related papers (2022-05-09T10:44:23Z)
On Arbitrary Compression for Decentralized Consensus and Stochastic Optimization over Directed Networks [0.6526824510982799]
We propose an iterative algorithm that compresses messages according to a desired compression ratio. Contrary to the existing literature, we allow arbitrary compression ratios. We show explicit convergence rates for decentralized optimization problems on smooth functions.
arXiv Detail & Related papers (2022-04-18T04:41:56Z)
Decentralized Composite Optimization with Compression [36.75785129001134]
We study the decentralized composite optimization problem with a potentially non-smooth component. A convergent decentralized algorithm with compression, Prox-LEAD, is proposed. Our theorems indicate that Prox-LEAD works with arbitrary compression precision, and it tremendously reduces the communication cost almost for free.
arXiv Detail & Related papers (2021-08-10T04:54:52Z)
Compressing Neural Networks: Towards Determining the Optimal Layer-wise Decomposition [62.41259783906452]
We present a novel global compression framework for deep neural networks. It automatically analyzes each layer to identify the optimal per-layer compression ratio. Our results open up new avenues for future research into the global performance-size trade-offs of modern neural networks.
arXiv Detail & Related papers (2021-07-23T20:01:30Z)
Innovation Compression for Communication-efficient Distributed Optimization with Linear Convergence [23.849813231750932]
This paper proposes a communication-efficient linearly convergent distributed (COLD) algorithm to solve strongly convex optimization problems. By compressing innovation vectors, COLD is able to achieve linear convergence for a class of $\delta$-contracted compressors. Numerical experiments demonstrate the advantages of both algorithms under different compressors.
arXiv Detail & Related papers (2021-05-14T08:15:18Z)
PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers. Inspired by PowerSGD for centralized deep learning, this algorithm uses power steps to maximize the information transferred per bit.
arXiv Detail & Related papers (2020-08-04T09:14:52Z)
Linear Convergent Decentralized Optimization with Compression [50.44269451541387]
Existing decentralized algorithms with compression mainly focus on compressing DGD-type algorithms. Motivated by primal-dual algorithms, this paper proposes the first linear convergent decentralized algorithm with compression, LEAD.
arXiv Detail & Related papers (2020-07-01T04:35:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.