Federated Learning with Compression: Unified Analysis and Sharp
Guarantees
- URL: http://arxiv.org/abs/2007.01154v2
- Date: Sat, 21 Nov 2020 04:33:08 GMT
- Title: Federated Learning with Compression: Unified Analysis and Sharp
Guarantees
- Authors: Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari and Mehrdad
Mahdavi
- Abstract summary: Communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices.
Two notable trends to deal with the communication overhead of federated algorithms are gradient compression and local computation with periodic communication.
We analyze their convergence in both homogeneous and heterogeneous data distribution settings.
- Score: 39.092596142018195
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In federated learning, communication cost is often a critical bottleneck to
scale up distributed optimization algorithms to collaboratively learn a model
from millions of devices with potentially unreliable or limited communication
and heterogeneous data distributions. Two notable trends to deal with the
communication overhead of federated algorithms are gradient compression and
local computation with periodic communication. Despite many attempts,
characterizing the relationship between these two approaches has proven
elusive. We address this by proposing a set of algorithms with periodical
compressed (quantized or sparsified) communication and analyze their
convergence properties in both homogeneous and heterogeneous local data
distribution settings. For the homogeneous setting, our analysis improves
existing bounds by providing tighter convergence rates for both strongly convex
and non-convex objective functions. To mitigate data heterogeneity, we
introduce a local gradient tracking scheme and obtain sharp convergence rates
that match the best-known communication complexities without compression for
convex, strongly convex, and nonconvex settings. We complement our theoretical
results and demonstrate the effectiveness of our proposed methods by several
experiments on real-world datasets.
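To make the communication pattern concrete, below is a minimal Python sketch of local SGD with periodic compressed communication on a toy quadratic problem. It illustrates the generic scheme the abstract describes, not the paper's exact algorithms or hyperparameters; the quantizer, the synthetic client data, and the step sizes are illustrative assumptions.

```python
# Minimal sketch: each client runs several local SGD steps, then sends a
# quantized model update to the server, which averages the compressed updates.
import numpy as np

rng = np.random.default_rng(0)


def stochastic_quantize(v, levels=4):
    """Unbiased stochastic quantizer (QSGD-style): keeps the norm and signs,
    randomly rounds magnitudes onto a uniform grid with `levels` levels."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v.copy()
    scaled = np.abs(v) / norm * levels
    lower = np.floor(scaled)
    prob = scaled - lower
    rounded = lower + (rng.random(v.shape) < prob)
    return np.sign(v) * rounded * norm / levels


# Toy heterogeneous least-squares problems, one per client (illustrative data).
num_clients, dim, local_steps, rounds, lr = 4, 10, 5, 50, 0.05
A = [rng.normal(size=(20, dim)) for _ in range(num_clients)]
b = [rng.normal(size=20) for _ in range(num_clients)]

x = np.zeros(dim)  # global model held by the server
for r in range(rounds):
    deltas = []
    for c in range(num_clients):
        x_local = x.copy()
        for _ in range(local_steps):
            i = rng.integers(len(b[c]))                      # sample one data point
            grad = (A[c][i] @ x_local - b[c][i]) * A[c][i]   # stochastic gradient
            x_local -= lr * grad
        # Communicate only a compressed version of the local update.
        deltas.append(stochastic_quantize(x_local - x))
    x += np.mean(deltas, axis=0)  # server averages the compressed updates

loss = np.mean([np.mean((A[c] @ x - b[c]) ** 2) for c in range(num_clients)])
print(f"final average loss: {loss:.4f}")
```

Because the quantizer is unbiased, the average of the compressed updates is an unbiased estimate of the average local update, which is the kind of property convergence analyses of compressed periodic-communication schemes typically rely on.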
Related papers
- Non-convex composite federated learning with heterogeneous data [10.14896454396227]
We propose an innovative algorithm for non-convex composite federated learning that decouples the proximal operator evaluation and the communication between server and client.
We demonstrate the superiority of our algorithm over state-of-the-art methods on both synthetic and real datasets.
arXiv Detail & Related papers (2025-02-06T10:49:03Z) - Accelerated Methods with Compressed Communications for Distributed Optimization Problems under Data Similarity [55.03958223190181]
We propose the first theoretically grounded accelerated algorithms utilizing unbiased and biased compression under data similarity.
Our results set new records and are confirmed by experiments on different losses and datasets.
arXiv Detail & Related papers (2024-12-21T00:40:58Z) - Non-Convex Optimization in Federated Learning via Variance Reduction and Adaptive Learning [13.83895180419626]
This paper proposes a novel algorithm that leverages momentum-based variance reduction with adaptive learning to address non-convex settings across heterogeneous data.
We aim to overcome challenges related to gradient variance, which hinders efficiency, and the slow convergence caused by learning-rate adjustments under heterogeneous data.
arXiv Detail & Related papers (2024-12-16T11:02:38Z) - Analysis of regularized federated learning [8.489782750973005]
Federated learning is an efficient tool for dealing with heterogeneous big data and privacy protection.
Loop descent is often used when implementing such methods on big data, to reduce communication costs.
arXiv Detail & Related papers (2024-11-03T12:47:54Z) - Distributed Event-Based Learning via ADMM [11.461617927469316]
We consider a global distributed learning problem, where agents minimize an objective function by exchanging information over a network.
Our approach has two distinct features: (i) it substantially reduces communication by triggering communication only when necessary, and (ii) it is agnostic to the data distribution among the different agents.
arXiv Detail & Related papers (2024-05-17T08:30:28Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting them in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z) - Resource-constrained Federated Edge Learning with Heterogeneous Data:
Formulation and Analysis [8.863089484787835]
We propose a distributed approximate Newton-type training scheme, namely FedOVA, to address the statistical challenge brought by heterogeneous data.
FedOVA decomposes a multi-class classification problem into more straightforward binary classification problems and then combines their respective outputs using ensemble learning.
arXiv Detail & Related papers (2021-10-14T17:35:24Z)
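The FedOVA summary above describes a one-vs-all decomposition whose binary outputs are combined by ensembling. The following minimal, single-machine Python sketch illustrates only that decomposition-and-combination step on synthetic data; the logistic-regression base learners, the toy Gaussian blobs, and the argmax combination rule are illustrative assumptions, not the paper's federated training procedure.

```python
# Minimal sketch: split a 3-class problem into one-vs-all binary problems,
# train one binary classifier per class, and combine their scores by argmax.
import numpy as np

rng = np.random.default_rng(1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_binary(X, y01, lr=0.1, epochs=300):
    """Plain gradient-descent logistic regression for one binary sub-problem."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y01) / len(y01)
    return w


# Synthetic 3-class data: one Gaussian blob per class (illustrative only).
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
n_per_class = 50
X = np.vstack([rng.normal(loc=m, size=(n_per_class, 2)) for m in means])
y = np.repeat(np.arange(len(means)), n_per_class)
Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias feature

# One-vs-all decomposition: one binary classifier per class.
weights = [train_binary(Xb, (y == k).astype(float)) for k in range(len(means))]

# Ensemble combination: predict the class whose binary classifier scores highest.
scores = np.column_stack([Xb @ w for w in weights])
accuracy = np.mean(scores.argmax(axis=1) == y)
print(f"training accuracy: {accuracy:.2f}")
```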