Related papers: Optimizing the Communication-Accuracy Trade-off in Federated Learning with Rate-Distortion Theory

Optimizing the Communication-Accuracy Trade-off in Federated Learning with Rate-Distortion Theory

URL: http://arxiv.org/abs/2201.02664v1
Date: Fri, 7 Jan 2022 20:17:33 GMT
Title: Optimizing the Communication-Accuracy Trade-off in Federated Learning with Rate-Distortion Theory
Authors: Nicole Mitchell, Johannes Ball\'e, Zachary Charles, Jakub Kone\v{c}n\'y
Abstract summary: A significant bottleneck in federated learning is the network communication cost of sending model updates from client devices to the central server. Our method encodes quantized updates with an appropriate universal code, taking into account their empirical distribution. Because quantization introduces error, we select quantization levels by optimizing for the desired trade-off in average total gradient and distortion.
Score: 1.5771347525430772
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A significant bottleneck in federated learning is the network communication cost of sending model updates from client devices to the central server. We propose a method to reduce this cost. Our method encodes quantized updates with an appropriate universal code, taking into account their empirical distribution. Because quantization introduces error, we select quantization levels by optimizing for the desired trade-off in average total bitrate and gradient distortion. We demonstrate empirically that in spite of the non-i.i.d. nature of federated learning, the rate-distortion frontier is consistent across datasets, optimizers, clients and training rounds, and within each setting, distortion reliably predicts model performance. This allows for a remarkably simple compression scheme that is near-optimal in many use cases, and outperforms Top-K, DRIVE, 3LC and QSGD on the Stack Overflow next-word prediction benchmark.

Related papers

Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized server (DFL) eliminates reliance on client-client architecture. Non-smooth regularization is often incorporated into machine learning tasks. We propose a novel novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z)
Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models. Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z)
Fed-CVLC: Compressing Federated Learning Communications with Variable-Length Codes [54.18186259484828]
In Federated Learning (FL) paradigm, a parameter server (PS) concurrently communicates with distributed participating clients for model collection, update aggregation, and model distribution over multiple rounds. We show strong evidences that variable-length is beneficial for compression in FL. We present Fed-CVLC (Federated Learning Compression with Variable-Length Codes), which fine-tunes the code length in response to the dynamics of model updates.
arXiv Detail & Related papers (2024-02-06T07:25:21Z)
FedShift: Robust Federated Learning Aggregation Scheme in Resource Constrained Environment via Weight Shifting [5.680416078423551]
Federated Learning (FL) commonly relies on a central server to coordinate training across distributed clients. Clients may employ different quantization levels based on their hardware or network constraints, necessitating a mixed-precision aggregation process at the server. We propose FedShift, a novel aggregation methodology designed to mitigate performance degradation in FL scenarios with mixed quantization levels.
arXiv Detail & Related papers (2024-02-02T00:03:51Z)
FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method. We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate. We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
Quantize Once, Train Fast: Allreduce-Compatible Compression with Provable Guarantees [53.950234267704]
We introduce Global-QSGD, an All-reduce gradient-compatible quantization method.<n>We show that it accelerates distributed training by up to 3.51% over baseline quantization methods.
arXiv Detail & Related papers (2023-05-29T21:32:15Z)
TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels [141.29156234353133]
State-of-the-art convex learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions. We show this disparity can largely be attributed to challenges presented by non-NISTity. We propose a Train-Convexify neural network (TCT) procedure to sidestep this issue.
arXiv Detail & Related papers (2022-07-13T16:58:22Z)
AdaBest: Minimizing Client Drift in Federated Learning via Adaptive Bias Estimation [12.62716075696359]
In Federated Learning (FL), a number of clients or devices collaborate to train a model without sharing their data. In order to estimate and therefore remove this drift, variance reduction techniques have been incorporated into FL optimization recently. We propose an adaptive algorithm that accurately estimates drift across clients.
arXiv Detail & Related papers (2022-04-27T20:04:24Z)
Optimal Rate Adaption in Federated Learning with Compressed Communications [28.16239232265479]
Federated Learning incurs high communication overhead, which can be greatly alleviated by compression for model updates. tradeoff between compression and model accuracy in the networked environment remains unclear. We present a framework to maximize the final model accuracy by strategically adjusting the compression each iteration.
arXiv Detail & Related papers (2021-12-13T14:26:15Z)
Communication-Compressed Adaptive Gradient Method for Distributed Nonconvex Optimization [21.81192774458227]
One of the major bottlenecks is the large communication cost between the central server and the local workers. Our proposed distributed learning framework features an effective gradient gradient compression strategy.
arXiv Detail & Related papers (2021-11-01T04:54:55Z)
FedDQ: Communication-Efficient Federated Learning with Descending Quantization [5.881154276623056]
Federated learning (FL) is an emerging privacy-preserving distributed learning scheme. FL suffers from critical communication bottleneck due to large model size and frequent model aggregation. This paper proposes an opposite approach to do adaptive quantization.
arXiv Detail & Related papers (2021-10-05T18:56:28Z)
Adaptive Quantization of Model Updates for Communication-Efficient Federated Learning [75.45968495410047]
Communication of model updates between client nodes and the central aggregating server is a major bottleneck in federated learning. Gradient quantization is an effective way of reducing the number of bits required to communicate each model update. We propose an adaptive quantization strategy called AdaFL that aims to achieve communication efficiency as well as a low error floor.
arXiv Detail & Related papers (2021-02-08T19:14:21Z)
Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
textttFedGLOMO is the first (first-order) FLtexttFedGLOMO algorithm. Our algorithm is provably optimal even with communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z)
APQ: Joint Search for Network Architecture, Pruning and Quantization Policy [49.3037538647714]
We present APQ for efficient deep learning inference on resource-constrained hardware. Unlike previous methods that separately search the neural architecture, pruning policy, and quantization policy, we optimize them in a joint manner. With the same accuracy, APQ reduces the latency/energy by 2x/1.3x over MobileNetV2+HAQ.
arXiv Detail & Related papers (2020-06-15T16:09:17Z)
UVeQFed: Universal Vector Quantization for Federated Learning [179.06583469293386]
Federated learning (FL) is an emerging approach to train such learning models without requiring the users to share their possibly private labeled data. In FL, each user trains its copy of the learning model locally. The server then collects the individual updates and aggregates them into a global model. We show that combining universal vector quantization methods with FL yields a decentralized training system in which the compression of the trained models induces only a minimum distortion.
arXiv Detail & Related papers (2020-06-05T07:10:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.