Federated Learning with Compression: Unified Analysis and Sharp
Guarantees
- URL: http://arxiv.org/abs/2007.01154v2
- Date: Sat, 21 Nov 2020 04:33:08 GMT
- Title: Federated Learning with Compression: Unified Analysis and Sharp
Guarantees
- Authors: Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari and Mehrdad
Mahdavi
- Abstract summary: Communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices.
Two notable trends to deal with the communication overhead of federated algorithms are gradient compression and local computation with periodic communication.
We analyze their convergence in both homogeneous and heterogeneous data distribution settings.
- Score: 39.092596142018195
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In federated learning, communication cost is often a critical bottleneck to
scale up distributed optimization algorithms to collaboratively learn a model
from millions of devices with potentially unreliable or limited communication
and heterogeneous data distributions. Two notable trends to deal with the
communication overhead of federated algorithms are gradient compression and
local computation with periodic communication. Despite many attempts,
characterizing the relationship between these two approaches has proven
elusive. We address this by proposing a set of algorithms with periodical
compressed (quantized or sparsified) communication and analyze their
convergence properties in both homogeneous and heterogeneous local data
distribution settings. For the homogeneous setting, our analysis improves
existing bounds by providing tighter convergence rates for both strongly convex
and non-convex objective functions. To mitigate data heterogeneity, we
introduce a local gradient tracking scheme and obtain sharp convergence rates
that match the best-known communication complexities without compression for
convex, strongly convex, and nonconvex settings. We complement our theoretical
results and demonstrate the effectiveness of our proposed methods by several
experiments on real-world datasets.
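To make the high-level recipe concrete, below is a minimal illustrative sketch of local SGD with periodic compressed communication in the spirit of the paper: each client runs several local gradient steps, compresses the model difference with top-k sparsification, and the server averages the compressed uploads. The function names, the sparsifier choice, and the quadratic toy objective are placeholders, and the local gradient tracking correction the paper introduces for heterogeneous data is omitted here.

```python
import numpy as np

def topk_sparsify(v, k):
    """Keep the k largest-magnitude entries of v and zero the rest (one common compressor)."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def local_grad(x, A, b):
    """Gradient of the toy least-squares objective 0.5*||A x - b||^2 on one client."""
    return A.T @ (A @ x - b)

def fed_compressed(clients, dim, rounds=50, local_steps=10, lr=0.005, k=5):
    """Periodic averaging with compressed (sparsified) model-difference uploads."""
    x_global = np.zeros(dim)
    for _ in range(rounds):
        deltas = []
        for A, b in clients:
            x = x_global.copy()
            for _ in range(local_steps):                   # local computation phase
                x -= lr * local_grad(x, A, b)
            deltas.append(topk_sparsify(x - x_global, k))  # compressed upload
        x_global = x_global + np.mean(deltas, axis=0)      # server averaging
    return x_global

# Toy heterogeneous clients: each holds its own least-squares problem.
rng = np.random.default_rng(1)
dim = 20
clients = [(rng.normal(size=(30, dim)), rng.normal(size=30)) for _ in range(8)]
print(fed_compressed(clients, dim)[:5])
```

Fewer communicated coordinates per round (smaller k) trades accuracy per round against bandwidth, which is exactly the tension the paper's analysis quantifies.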
Related papers
- Analysis of regularized federated learning [8.489782750973005]
Federated learning is an efficient tool for dealing with heterogeneous big data and privacy protection.
Loop descent is often used when implementing such methods on big data to reduce communication costs.
arXiv Detail & Related papers (2024-11-03T12:47:54Z)
- Distributed Event-Based Learning via ADMM [11.461617927469316]
We consider a distributed learning problem, where agents minimize a global objective function by exchanging information over a network.
Our approach has two distinct features: (i) It substantially reduces communication by triggering communication only when necessary, and (ii) it is agnostic to the data-distribution among the different agents.
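As an illustration of the event-triggered idea only (not the authors' ADMM scheme), the toy loop below has each agent broadcast its local iterate only when it has drifted from the last broadcast value by more than a threshold; the names and the simple consensus-averaging update are illustrative placeholders.

```python
import numpy as np

def event_triggered_consensus(x0, grads, lr=0.1, threshold=0.05, steps=100):
    """Gradient + averaging steps where an agent transmits only on large drift."""
    x = [xi.astype(float).copy() for xi in x0]      # local iterates
    last_sent = [xi.copy() for xi in x]             # values currently known to the network
    messages = 0
    for _ in range(steps):
        for i in range(len(x)):
            x[i] -= lr * grads[i](x[i])             # local gradient step
            if np.linalg.norm(x[i] - last_sent[i]) > threshold:
                last_sent[i] = x[i].copy()          # "broadcast" event
                messages += 1
        avg = np.mean(last_sent, axis=0)            # agents average the stale broadcasts
        x = [0.5 * xi + 0.5 * avg for xi in x]
    return np.mean(x, axis=0), messages

# Each agent minimizes ||x - c_i||^2 for its own target c_i (heterogeneous data).
targets = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([2.0, 2.0])]
grads = [lambda x, c=c: 2 * (x - c) for c in targets]
x_final, sent = event_triggered_consensus([np.zeros(2)] * 3, grads)
print(x_final, "messages sent:", sent)
```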
arXiv Detail & Related papers (2024-05-17T08:30:28Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
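For context, the snippet below is a plain single-site inverse propensity score (IPW) estimate of an average treatment effect, combined across sites by a simple sample-size-weighted average. It only shows what an IPW estimator looks like; it is not the collaborative estimator proposed in the paper, and the logistic propensity model is an arbitrary choice.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_ate(X, t, y):
    """Inverse propensity weighted estimate of E[Y(1)] - E[Y(0)] for one site."""
    e = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]  # propensities
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

def pooled_ate(sites):
    """Naive combination: weight each site's IPW estimate by its sample size."""
    n = np.array([len(y) for _, _, y in sites])
    est = np.array([ipw_ate(X, t, y) for X, t, y in sites])
    return float(np.sum(n * est) / np.sum(n))

# Synthetic heterogeneous sites with a true treatment effect of 2.
rng = np.random.default_rng(0)
sites = []
for _ in range(3):
    X = rng.normal(size=(500, 3))
    t = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))
    y = 2 * t + X @ np.array([1.0, -0.5, 0.3]) + rng.normal(size=500)
    sites.append((X, t, y))
print(pooled_ate(sites))
```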
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
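Since the sampler builds on consensus ADMM, here is a minimal consensus-ADMM loop for distributed least squares: the standard optimization building block, not the sampling scheme itself. The variable names and the ridge-style local solve follow the usual ADMM template rather than anything specific to the paper.

```python
import numpy as np

def consensus_admm(data, dim, rho=1.0, iters=100):
    """Consensus ADMM: each worker solves a local regularized subproblem,
    then all agree on a global variable z via averaging and dual updates."""
    n = len(data)
    x = np.zeros((n, dim))   # local primal variables
    u = np.zeros((n, dim))   # scaled dual variables
    z = np.zeros(dim)        # global consensus variable
    for _ in range(iters):
        for i, (A, b) in enumerate(data):
            # x_i-update: argmin 0.5*||A x - b||^2 + (rho/2)*||x - z + u_i||^2
            x[i] = np.linalg.solve(A.T @ A + rho * np.eye(dim),
                                   A.T @ b + rho * (z - u[i]))
        z = np.mean(x + u, axis=0)   # z-update (averaging)
        u += x - z                   # dual ascent
    return z

# Toy distributed linear regression with a known ground-truth weight vector.
rng = np.random.default_rng(0)
dim, true_w = 5, np.arange(5, dtype=float)
data = []
for _ in range(4):
    A = rng.normal(size=(40, dim))
    data.append((A, A @ true_w + 0.1 * rng.normal(size=40)))
print(consensus_admm(data, dim))
```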
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- On the Communication Complexity of Decentralized Bilevel Optimization [40.45379954138305]
We propose two novel decentralized bilevel gradient descent algorithms based on simultaneous and alternating update strategies.
Our algorithms can achieve faster convergence rates and lower communication costs than existing methods.
This is the first time such favorable theoretical results have been achieved with mild assumptions in the heterogeneous setting.
arXiv Detail & Related papers (2023-11-19T14:56:26Z)
- Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmission in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
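A toy way to see the aggregation-error issue: below, a FedAvg-style round is perturbed with additive Gaussian noise standing in for the analog over-the-air channel. The noise model, its scale, and the names are simplifications for illustration, not the signal-processing schemes analyzed in the paper.

```python
import numpy as np

def air_fedavg_round(x_global, client_grads, lr=0.1, noise_std=0.01, rng=None):
    """One AirFedAvg-style round: clients compute updates locally and the server
    receives their average corrupted by channel noise (the aggregation error)."""
    rng = rng or np.random.default_rng()
    updates = [-lr * g(x_global) for g in client_grads]   # local updates
    ideal_avg = np.mean(updates, axis=0)                  # noiseless aggregation
    noisy_avg = ideal_avg + noise_std * rng.normal(size=ideal_avg.shape)
    return x_global + noisy_avg

# Clients with different quadratic objectives 0.5*||x - c_i||^2.
centers = [np.array([1.0, 2.0]), np.array([3.0, -1.0]), np.array([0.0, 0.5])]
grads = [lambda x, c=c: x - c for c in centers]
x = np.zeros(2)
rng = np.random.default_rng(0)
for _ in range(200):
    x = air_fedavg_round(x, grads, rng=rng)
print(x, "target:", np.mean(centers, axis=0))
```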
arXiv Detail & Related papers (2023-10-16T05:49:28Z)
- Distributed Newton-Type Methods with Communication Compression and Bernoulli Aggregation [11.870393751095083]
We study communication compression and aggregation mechanisms for curvature information.
New 3PC mechanisms, such as adaptive thresholding and Bernoulli aggregation, require reduced communication and occasional Hessian computations.
For all our methods, we derive fast condition-number-independent local linear and/or superlinear convergence rates.
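To illustrate the flavor of Bernoulli aggregation (occasional curvature communication), the sketch below lets each worker refresh its Hessian at the server only with probability p, reusing stale curvature otherwise. This is a heavily simplified caricature on a quadratic toy problem, not the 3PC mechanisms or compressors from the paper.

```python
import numpy as np

def bernoulli_newton(data, dim, p=0.2, iters=20, seed=0):
    """Newton-type steps where each worker re-sends its local Hessian only with
    probability p per round; the server otherwise reuses stale curvature."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    # Assume one initial Hessian exchange; afterwards refreshes are Bernoulli-triggered.
    stale_H = [A.T @ A for A, _ in data]
    for _ in range(iters):
        grads = [A.T @ (A @ x - b) for A, b in data]   # gradients sent every round
        for i, (A, _) in enumerate(data):
            if rng.random() < p:                       # occasional curvature refresh
                stale_H[i] = A.T @ A                   # (constant here: quadratic toy)
        H = np.mean(stale_H, axis=0)
        x = x - np.linalg.solve(H, np.mean(grads, axis=0))  # Newton step, stale curvature
    return x

rng = np.random.default_rng(1)
dim, w = 4, np.array([1.0, -2.0, 0.5, 3.0])
data = []
for _ in range(5):
    A = rng.normal(size=(50, dim))
    data.append((A, A @ w + 0.05 * rng.normal(size=50)))
print(bernoulli_newton(data, dim))
```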
arXiv Detail & Related papers (2022-06-07T21:12:21Z)
- Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
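The role of the mixing matrix can be seen in the standard decentralized gradient step below, where each node averages its neighbors' iterates with weights W before taking a local gradient step. The fixed ring-topology weights are a placeholder; the paper's point is to adapt such weights to the data heterogeneity.

```python
import numpy as np

def decentralized_gd(W, grads, dim, lr=0.05, steps=200):
    """Decentralized gradient descent: x_i <- sum_j W[i, j] x_j - lr * grad_i(x_i)."""
    n = len(grads)
    X = np.zeros((n, dim))                 # one row per node
    for _ in range(steps):
        mixed = W @ X                      # gossip/mixing step with weights W
        X = np.array([mixed[i] - lr * grads[i](mixed[i]) for i in range(n)])
    return X

# Symmetric doubly stochastic mixing matrix for a 4-node ring.
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

# Heterogeneous local objectives 0.5*||x - c_i||^2; the consensus optimum is mean(c_i).
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0]),
           np.array([-1.0, 2.0]), np.array([3.0, 3.0])]
grads = [lambda x, c=c: x - c for c in centers]
print(decentralized_gd(W, grads, dim=2))
```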
arXiv Detail & Related papers (2022-04-13T15:54:35Z)
- Resource-constrained Federated Edge Learning with Heterogeneous Data: Formulation and Analysis [8.863089484787835]
We propose a distributed approximate Newton-type training scheme, namely FedOVA, to solve the statistical challenge brought by heterogeneous data.
FedOVA decomposes a multi-class classification problem into more straightforward binary classification problems and then combines their respective outputs using ensemble learning.
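The one-vs-all decomposition itself is standard; a bare-bones centralized version is sketched below (train one binary classifier per class, predict by the highest score). The federated orchestration and ensemble aggregation in FedOVA are not reproduced here, and the sklearn model choice is illustrative.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

def train_ova(X, y, classes):
    """One-vs-all: fit one binary classifier per class (class c vs. the rest)."""
    return {c: LogisticRegression(max_iter=1000).fit(X, (y == c).astype(int))
            for c in classes}

def predict_ova(models, X):
    """Combine the binary classifiers by taking the highest positive-class score."""
    classes = sorted(models)
    scores = np.column_stack([models[c].predict_proba(X)[:, 1] for c in classes])
    return np.array([classes[i] for i in np.argmax(scores, axis=1)])

X, y = make_classification(n_samples=600, n_features=10, n_informative=6,
                           n_classes=4, random_state=0)
models = train_ova(X[:500], y[:500], classes=np.unique(y))
preds = predict_ova(models, X[500:])
print("test accuracy:", np.mean(preds == y[500:]))
```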
arXiv Detail & Related papers (2021-10-14T17:35:24Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
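To make the robustness objective concrete, the loop below does plain adversarial training for logistic regression: an inner signed-gradient step perturbs each input within an epsilon-ball, and the outer step descends on the loss at the perturbed inputs. It is a generic min-max sketch, not the distributionally robust formulation or the privacy mechanism from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_logreg(X, y, eps=0.1, lr=0.05, epochs=200):
    """Min-max training: perturb inputs within an L-inf ball of radius eps (inner max),
    then take a gradient step on the loss at the perturbed inputs (outer min)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        # Inner step: one signed-gradient ascent step on the per-sample loss w.r.t. X.
        dloss_dX = (sigmoid(X @ w) - y)[:, None] * w[None, :]
        X_adv = X + eps * np.sign(dloss_dX)
        # Outer step: gradient descent on the logistic loss at the adversarial points.
        g = X_adv.T @ (sigmoid(X_adv @ w) - y) / len(y)
        w -= lr * g
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(400, 5))
y = (X @ np.array([1.0, -1.0, 0.5, 0.0, 2.0]) + 0.3 * rng.normal(size=400) > 0).astype(float)
print(adversarial_logreg(X, y))
```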
arXiv Detail & Related papers (2020-07-07T18:25:25Z)