Provably Doubly Accelerated Federated Learning: The First Theoretically
Successful Combination of Local Training and Compressed Communication
- URL: http://arxiv.org/abs/2210.13277v2
- Date: Thu, 27 Oct 2022 17:20:27 GMT
- Title: Provably Doubly Accelerated Federated Learning: The First Theoretically
Successful Combination of Local Training and Compressed Communication
- Authors: Laurent Condat, Ivan Agarský, Peter Richtárik
- Abstract summary: We propose the first algorithm for distributed optimization and federated learning that jointly harnesses local training and compressed communication.
Our algorithm converges linearly to an exact solution, with a doubly accelerated rate.
- Score: 7.691755449724637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the modern paradigm of federated learning, a large number of users are
involved in a global learning task, in a collaborative way. They alternate
local computations and two-way communication with a distant orchestrating
server. Communication, which can be slow and costly, is the main bottleneck in
this setting. To reduce the communication load and therefore accelerate
distributed gradient descent, two strategies are popular: 1) communicate less
frequently; that is, perform several iterations of local computations between
the communication rounds; and 2) communicate compressed information instead of
full-dimensional vectors. In this paper, we propose the first algorithm for
distributed optimization and federated learning, which harnesses these two
strategies jointly and converges linearly to an exact solution, with a doubly
accelerated rate: our algorithm benefits from the two acceleration mechanisms
provided by local training and compression, namely a better dependency on the
condition number of the functions and on the dimension of the model,
respectively.
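To illustrate how these two strategies slot into a distributed gradient descent loop, the sketch below has each client run several local steps and then send a rand-k-compressed model update to the server. This is a minimal illustration, not the paper's algorithm (which adds further mechanisms to converge linearly to the exact solution): the toy objective, the `rand_k` compressor, and all hyperparameters are assumptions, and this naive combination only reaches a neighborhood of the optimum.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_k(v, k):
    """Unbiased rand-k sparsification: keep k random coordinates, rescale by d/k."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * (d / k)
    return out

# Toy heterogeneous clients: f_i(x) = 0.5 * ||A_i x - b_i||^2 around a shared x_star.
d, n = 20, 5
x_star = rng.standard_normal(d)
A = [rng.standard_normal((30, d)) for _ in range(n)]
b = [A[i] @ x_star + 0.1 * rng.standard_normal(30) for i in range(n)]
grad = lambda i, x: A[i].T @ (A[i] @ x - b[i])
F = lambda x: np.mean([0.5 * np.sum((A[i] @ x - b[i]) ** 2) for i in range(n)])

x = np.zeros(d)                               # server model
lr, local_steps, k = 1e-3, 10, 10

for rnd in range(200):                        # communication rounds
    updates = []
    for i in range(n):                        # each client (in parallel in practice)
        y = x.copy()
        for _ in range(local_steps):          # strategy 1: local training
            y -= lr * grad(i, y)
        updates.append(rand_k(y - x, k))      # strategy 2: compressed update
    x += np.mean(updates, axis=0)             # server averages compressed updates
    if rnd % 50 == 0:
        print(rnd, F(x))                      # loss drops, but this naive scheme
                                              # plateaus above the exact optimum
```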
Related papers
- Accelerated Stochastic ExtraGradient: Mixing Hessian and Gradient Similarity to Reduce Communication in Distributed and Federated Learning [50.382793324572845]
Distributed computing involves communication between devices, which requires solving two key problems: efficiency and privacy.
In this paper, we analyze a new method that incorporates the ideas of using data similarity and client sampling.
To address privacy concerns, we apply the technique of additional noise and analyze its impact on the convergence of the proposed method.
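The "additional noise" technique referred to above generally amounts to clipping and perturbing whatever a client transmits; the helper below is a hypothetical sketch (the function name, clipping threshold, and noise level are illustrative, not taken from the paper).

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize(update, clip=1.0, sigma=0.1):
    """Hypothetical helper (not from the paper): clip an update and add Gaussian
    noise before it is sent, trading some accuracy for privacy."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip / max(norm, 1e-12))
    return clipped + sigma * rng.standard_normal(update.shape)

update = rng.standard_normal(10)       # e.g., a local gradient about to be sent
print(privatize(update))               # the perturbed vector actually transmitted
```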
arXiv Detail & Related papers (2024-09-22T00:49:10Z)
- FedComLoc: Communication-Efficient Distributed Training of Sparse and Quantized Models [56.21666819468249]
Federated Learning (FL) has garnered increasing attention due to its unique characteristic of allowing heterogeneous clients to process their private data locally and interact with a central server.
We introduce FedComLoc, integrating practical and effective compression into Scaffnew to further enhance communication efficiency.
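Since FedComLoc concerns sparse and quantized models, here is a minimal sketch of one common compression primitive, symmetric int8 quantization of an update vector; it is a generic illustration, not necessarily FedComLoc's compressor.

```python
import numpy as np

def quantize_int8(v):
    """Symmetric uniform quantization: one float scale plus one int8 per entry."""
    scale = np.max(np.abs(v)) / 127.0 + 1e-12
    q = np.clip(np.round(v / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

v = np.random.default_rng(0).standard_normal(1000).astype(np.float32)
q, scale = quantize_int8(v)                       # ~4x fewer bits than float32
print(np.max(np.abs(dequantize(q, scale) - v)))   # small, bounded quantization error
```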
arXiv Detail & Related papers (2024-03-14T22:29:59Z)
- LoCoDL: Communication-Efficient Distributed Learning with Local Training and Compression [8.37672888329615]
We introduce LoCoDL, a communication-efficient algorithm that leverages the two popular and effective techniques of Local training, which reduces the communication frequency, and Compression, in which short bitstreams are sent instead of full-dimensional vectors of floats.
LoCoDL provably benefits from local training and compression and enjoys a doubly-accelerated communication complexity, with respect to the condition number of the functions and the model dimension, in the general heterogeneous regime with strongly convex functions.
arXiv Detail & Related papers (2024-03-07T09:22:50Z)
- Communication-Efficient Decentralized Federated Learning via One-Bit Compressive Sensing [52.402550431781805]
Decentralized federated learning (DFL) has gained popularity due to its practicality across various applications.
Compared to the centralized version, training a shared model among a large number of nodes in DFL is more challenging.
We develop a novel algorithm based on the framework of the inexact alternating direction method (iADM).
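As a generic illustration of one-bit compression of a transmitted vector (not the paper's iADM-based algorithm), a sender can transmit only coordinate signs plus a single scale:

```python
import numpy as np

def one_bit(v):
    """Send only the signs plus one scale: d bits + one float instead of d floats."""
    scale = np.mean(np.abs(v))             # a single scalar keeps magnitude information
    return np.sign(v).astype(np.int8), scale

def decode(signs, scale):
    return signs.astype(np.float32) * scale

v = np.random.default_rng(0).standard_normal(8)
signs, scale = one_bit(v)
print(decode(signs, scale))                # coarse reconstruction on the receiver side
```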
arXiv Detail & Related papers (2023-08-31T12:22:40Z)
- DIGEST: Fast and Communication Efficient Decentralized Learning with Local Updates [4.3707341422218215]
Two widely considered decentralized learning approaches are gossip-based and random walk-based learning.
We design DIGEST, a fast and communication-efficient asynchronous decentralized learning mechanism.
We evaluate the performance of single- and multi-stream DIGEST for logistic regression and a deep neural network (ResNet20).
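For context, a minimal sketch of plain gossip averaging, one of the baselines mentioned above, is given below; the ring topology and mixing weights are illustrative assumptions, and this is not the DIGEST mechanism itself.

```python
import numpy as np

# Ring of 4 nodes with a doubly stochastic mixing matrix: each node keeps weight
# 0.5 on itself and 0.25 on each of its two ring neighbors.
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])

x = np.array([1.0, 2.0, 3.0, 4.0])        # one scalar model per node
for _ in range(50):
    x = W @ x                             # one gossip round: mix with neighbors
print(x)                                  # all nodes approach the global mean 2.5
```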
arXiv Detail & Related papers (2023-07-14T22:58:20Z)
- TAMUNA: Doubly Accelerated Distributed Optimization with Local Training, Compression, and Partial Participation [53.84175614198885]
In distributed optimization and learning, several machines alternate between local computations in parallel and communication with a distant server.
We propose TAMUNA, the first algorithm for distributed optimization that jointly leverages the two strategies of local training and compression and allows for partial participation.
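A minimal sketch of the partial-participation pattern, where only a random subset of clients is active in each round, is shown below; it is not TAMUNA itself, and the toy local update and the sampling sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clients, clients_per_round, dim = 100, 10, 5

# Toy local update (an assumption): each client pulls the model toward its own target.
targets = rng.standard_normal((n_clients, dim))
local_update = lambda i, x: 0.1 * (targets[i] - x)

x = np.zeros(dim)
for rnd in range(500):
    # Partial participation: only a random subset of clients is active this round.
    active = rng.choice(n_clients, size=clients_per_round, replace=False)
    x += np.mean([local_update(i, x) for i in active], axis=0)
print(x)   # hovers around the average of the clients' targets
```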
arXiv Detail & Related papers (2023-02-20T08:37:44Z)
- Accelerating Federated Edge Learning via Optimized Probabilistic Device Scheduling [57.271494741212166]
This paper formulates and solves the communication time minimization problem.
It is found that the optimized policy gradually turns its priority from suppressing the remaining communication rounds to reducing per-round latency as the training process evolves.
The effectiveness of the proposed scheme is demonstrated via a use case on collaborative 3D object detection in autonomous driving.
arXiv Detail & Related papers (2021-07-24T11:39:17Z)
- A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free! [72.31332210635524]
Decentralized optimization methods enable on-device training of machine learning models without a central coordinator.
We propose a new randomized first-order method which tackles the communication bottleneck by applying randomized compression operators.
We prove that our method can solve the problems without any increase in the number of communications compared to the baseline.
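A minimal sketch of one standard randomized compression operator, Bernoulli sparsification rescaled to be unbiased, is shown below; the particular operator and probability are illustrative assumptions and not necessarily the ones analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def bernoulli_compress(v, p=0.1):
    """Randomized compression: keep each coordinate with probability p and rescale
    by 1/p, so that E[C(v)] = v (an unbiased compressor)."""
    mask = rng.random(v.size) < p
    return np.where(mask, v / p, 0.0)

v = rng.standard_normal(1000)
avg = np.mean([bernoulli_compress(v) for _ in range(5000)], axis=0)
print(np.mean(np.abs(avg - v)))   # close to zero: the compressor is unbiased on average
```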
arXiv Detail & Related papers (2020-11-03T13:35:53Z)
- Communication-Efficient Distributed Deep Learning: A Comprehensive Survey [22.42450750097714]
We provide a comprehensive survey of the communication-efficient distributed training algorithms.
We first propose a taxonomy of data-parallel distributed training algorithms along four dimensions.
We then investigate state-of-the-art studies that address problems in these four dimensions.
arXiv Detail & Related papers (2020-03-10T05:42:44Z)
- Communication-Efficient Decentralized Learning with Sparsification and Adaptive Peer Selection [13.963329236804586]
We introduce a novel decentralized training algorithm with the following key features.
Each worker only needs to communicate with a single peer at each communication round with a highly compressed model.
Experimental results show that our algorithm significantly reduces the communication traffic and generally selects relatively high bandwidth peers.
arXiv Detail & Related papers (2020-02-22T12:31:57Z)
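For the sparsification-and-peer-selection scheme summarized just above, the sketch below shows only the communication pattern: each worker sends a top-k-compressed model to a single peer per round. Uniform random peer selection is a simplification (the paper selects peers adaptively, preferring higher bandwidth), and all sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def top_k_message(v, k):
    """Transmit only the k largest-magnitude coordinates (indices plus values)."""
    idx = np.argsort(np.abs(v))[-k:]
    return idx, v[idx]

n_workers, d, k = 8, 100, 10
models = rng.standard_normal((n_workers, d))

for rnd in range(200):
    for i in range(n_workers):
        # Simplification: pick one peer uniformly at random; the paper instead
        # adaptively prefers higher-bandwidth peers.
        j = rng.choice([p for p in range(n_workers) if p != i])
        idx, vals = top_k_message(models[i], k)         # highly compressed model
        models[j, idx] = 0.5 * (models[j, idx] + vals)  # mix received coordinates
print(models.std(axis=0).mean())   # the spread across workers shrinks as models mix
```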
This list is automatically generated from the titles and abstracts of the papers on this site.