BiCoLoR: Communication-Efficient Optimization with Bidirectional Compression and Local Training
- URL: http://arxiv.org/abs/2601.12400v1
- Date: Sun, 18 Jan 2026 13:23:27 GMT
- Title: BiCoLoR: Communication-Efficient Optimization with Bidirectional Compression and Local Training
- Authors: Laurent Condat, Artavazd Maranjyan, Peter Richtárik
- Abstract summary: BiCoLoR is a communication-efficient optimization algorithm that combines two widely used strategies: local training and compression. We show BiCoLoR outperforms existing algorithms and establishes a new standard in communication efficiency.
- Score: 50.334494587223304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Slow and costly communication is often the main bottleneck in distributed optimization, especially in federated learning where it occurs over wireless networks. We introduce BiCoLoR, a communication-efficient optimization algorithm that combines two widely used and effective strategies: local training, which increases computation between communication rounds, and compression, which encodes high-dimensional vectors into short bitstreams. While these mechanisms have been combined before, compression has typically been applied only to uplink (client-to-server) communication, leaving the downlink (server-to-client) side unaddressed. In practice, however, both directions are costly. We propose BiCoLoR, the first algorithm to combine local training with bidirectional compression using arbitrary unbiased compressors. This joint design achieves accelerated complexity guarantees in both convex and strongly convex heterogeneous settings. Empirically, BiCoLoR outperforms existing algorithms and establishes a new standard in communication efficiency.
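The abstract names three ingredients: local training between communication rounds, unbiased compression of the messages, and compression applied to both the uplink and the downlink. Below is a minimal NumPy sketch of those ingredients only; the rand-k compressor, the plain averaging, and all function names are illustrative assumptions and do not reproduce BiCoLoR's actual update rule or its control variates.

```python
import numpy as np

def rand_k(x, k, rng):
    """Unbiased rand-k sparsifier: keep k random coordinates, rescaled by d/k
    so that the compressed vector equals x in expectation."""
    d = x.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(x)
    out[idx] = x[idx] * (d / k)
    return out

def illustrative_round(x_server, client_data, grad, k, local_steps, lr, rng):
    """One communication round combining local training with compression of
    both the uplink and the downlink messages. Schematic only: it shows the
    ingredients named in the abstract, not BiCoLoR's update rule."""
    uplink = []
    for data in client_data:
        x = x_server.copy()
        for _ in range(local_steps):                 # local training: several cheap
            x -= lr * grad(x, data)                  # steps between communications
        uplink.append(rand_k(x - x_server, k, rng))  # compressed client-to-server message
    avg_update = np.mean(uplink, axis=0)
    downlink = rand_k(avg_update, k, rng)            # compressed server-to-client message
    return x_server + downlink
```

With `rng = np.random.default_rng()`, each rand-k message is unbiased, which is the property that methods using arbitrary unbiased compressors, in either direction, rely on.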
Related papers
- LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression [56.01900711954956]
We introduce LoCoDL, a communication-efficient algorithm that leverages the two popular and effective techniques of Local training, which reduces the communication frequency, and Compression, in which short bitstreams are sent instead of full-dimensional vectors of floats. LoCoDL provably benefits from local training and compression and enjoys a doubly-accelerated communication complexity, with respect to the condition number of the functions and the model dimension, in the general heterogeneous regime with strongly convex functions.
arXiv Detail & Related papers (2024-03-07T09:22:50Z) - Communication-Efficient Distributed Learning with Local Immediate Error Compensation [95.6828475028581]
We propose the Local Immediate Error Compensated SGD (LIEC-SGD) optimization algorithm.
LIEC-SGD is superior to previous works in either the convergence rate or the communication cost.
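LIEC-SGD is described above as compensating compression error immediately; the snippet below sketches only the generic error-feedback mechanism that such methods build on: compress the update plus the residual left over from the previous round, then remember what the compressor dropped. The top-k compressor and the function names are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def top_k(v, k):
    """Keep only the k largest-magnitude coordinates (a standard biased compressor)."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def error_feedback_step(x, grad, error, lr, k):
    """Generic error-feedback step. LIEC-SGD refines when and where the
    correction is applied; only the baseline mechanism is shown."""
    corrected = lr * grad + error     # add back the previously dropped signal
    message = top_k(corrected, k)     # what actually gets communicated
    new_error = corrected - message   # residual kept locally for the next round
    return x - message, new_error
```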
arXiv Detail & Related papers (2024-02-19T05:59:09Z) - Improving the Worst-Case Bidirectional Communication Complexity for Nonconvex Distributed Optimization under Function Similarity [92.1840862558718]
We introduce MARINA-P, a novel method for downlink compression, employing a collection of correlated compressors.
We show that MARINA-P with permutation compressors can achieve a server-to-worker communication complexity that improves as the number of workers grows.
We introduce M3, a method combining MARINA-P with uplink compression and a momentum step, achieving bidirectional compression with provable improvements in total communication complexity as the number of workers increases.
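The correlated compressors mentioned above can be illustrated with the standard permutation compressor: the server sends each worker a different, disjoint block of coordinates, rescaled so the messages jointly average back to the full vector. This is only a sketch of that compressor; the MARINA-P and M3 updates themselves are not reproduced, and the function name is an assumption.

```python
import numpy as np

def perm_k_messages(v, num_workers, rng):
    """Permutation ('correlated') compressors: split the coordinates of v into
    disjoint blocks, one per worker, each rescaled by num_workers. The messages
    jointly average back to v exactly; when the dimension divides evenly across
    workers, each individual message is also an unbiased estimate of v."""
    perm = rng.permutation(v.size)
    blocks = np.array_split(perm, num_workers)
    messages = []
    for block in blocks:
        m = np.zeros_like(v)
        m[block] = v[block] * num_workers   # rescaling keeps the average exact
        messages.append(m)
    return messages                         # np.mean(messages, axis=0) recovers v
```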
arXiv Detail & Related papers (2024-02-09T13:58:33Z) - TAMUNA: Doubly Accelerated Distributed Optimization with Local Training, Compression, and Partial Participation [53.84175614198885]
In distributed optimization and learning, several machines alternate between local computations in parallel and communication with a distant server.
We propose TAMUNA, the first algorithm for distributed optimization that leverages the two strategies of local training and compression jointly and allows for partial participation.
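Partial participation means that only a sampled cohort of clients computes and communicates in a given round. The minimal sketch below shows just that sampling step, with a hypothetical list of per-client update callables; TAMUNA's compressed, locally-trained update is not reproduced.

```python
import numpy as np

def partial_participation_round(x_server, client_updates, cohort_size, rng):
    """Only a randomly sampled cohort of clients participates in this round,
    and the server averages over that cohort only. `client_updates` is a
    hypothetical list of callables standing in for the per-client computation."""
    cohort = rng.choice(len(client_updates), size=cohort_size, replace=False)
    updates = [client_updates[i](x_server) for i in cohort]
    return x_server + np.mean(updates, axis=0)
```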
arXiv Detail & Related papers (2023-02-20T08:37:44Z) - Provably Doubly Accelerated Federated Learning: The First Theoretically Successful Combination of Local Training and Compressed Communication [7.691755449724637]
We propose the first algorithm for distributed optimization and federated learning that provably combines local training with compressed communication.
Our algorithm converges linearly to an exact solution, with a doubly accelerated rate.
arXiv Detail & Related papers (2022-10-24T14:13:54Z) - 1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed [17.953619054149378]
We propose a new communication-efficient algorithm, 1-bit LAMB, which supports adaptive layerwise learning rates even when communication is compressed.
For the BERT-Large pre-training task with batch sizes from 8K to 64K, our evaluations demonstrate that 1-bit LAMB with an NCCL-based backend is able to achieve up to 4.6x communication volume reduction.
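As a rough illustration of where such communication savings come from, a generic 1-bit compressor transmits one sign per coordinate plus a single scale per layer. The sketch below shows only that compressor; 1-bit LAMB's layerwise adaptive learning rates, error compensation, and warm-up stage are not reproduced, and the function names are assumptions.

```python
import numpy as np

def one_bit_compress(layer_grad):
    """Generic 1-bit compressor: transmit the sign of each coordinate plus one
    scale per layer (the mean magnitude), so the payload is roughly one bit
    per parameter."""
    scale = np.mean(np.abs(layer_grad))   # one float per layer
    signs = np.sign(layer_grad)           # one bit per coordinate, conceptually
    return scale, signs

def one_bit_decompress(scale, signs):
    """Reconstruct an approximate gradient from the transmitted scale and signs."""
    return scale * signs
```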
arXiv Detail & Related papers (2021-04-13T10:07:49Z) - PowerGossip: Practical Low-Rank Communication Compression in Decentralized Deep Learning [62.440827696638664]
We introduce a simple algorithm that directly compresses the model differences between neighboring workers.
Inspired by PowerSGD for centralized deep learning, this algorithm uses power steps to maximize the information transferred per bit.
arXiv Detail & Related papers (2020-08-04T09:14:52Z)
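The power steps mentioned in the PowerGossip summary can be sketched as a single reused power-iteration step on the matrix of model differences, so that only two factor vectors are transmitted instead of the full matrix. This is a schematic in the spirit of PowerGossip and PowerSGD, not the decentralized algorithm itself; the gossip schedule and averaging are omitted and the function name is an assumption.

```python
import numpy as np

def power_step_compress(diff_matrix, p_prev):
    """One reused power-iteration step: starting from last round's left vector
    p_prev, compute a right factor and a refreshed left factor, and transmit
    only these two vectors instead of the full matrix of model differences."""
    q = diff_matrix.T @ p_prev
    q /= np.linalg.norm(q) + 1e-12   # normalize for numerical stability
    p = diff_matrix @ q              # refreshed left factor
    return p, q                      # rank-1 approximation: diff_matrix ~ np.outer(p, q)
```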