Federated Learning with Compression: Unified Analysis and Sharp
Guarantees
- URL: http://arxiv.org/abs/2007.01154v2
- Date: Sat, 21 Nov 2020 04:33:08 GMT
- Title: Federated Learning with Compression: Unified Analysis and Sharp
Guarantees
- Authors: Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari and Mehrdad
Mahdavi
- Abstract summary: Communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices.
Two notable trends to deal with the communication overhead of federated algorithms are gradient compression and local computation with periodic communication.
We analyze their convergence in both homogeneous and heterogeneous data distribution settings.
- Score: 39.092596142018195
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In federated learning, communication cost is often a critical bottleneck to
scale up distributed optimization algorithms to collaboratively learn a model
from millions of devices with potentially unreliable or limited communication
and heterogeneous data distributions. Two notable trends to deal with the
communication overhead of federated algorithms are gradient compression and
local computation with periodic communication. Despite many attempts,
characterizing the relationship between these two approaches has proven
elusive. We address this by proposing a set of algorithms with periodical
compressed (quantized or sparsified) communication and analyze their
convergence properties in both homogeneous and heterogeneous local data
distribution settings. For the homogeneous setting, our analysis improves
existing bounds by providing tighter convergence rates for both strongly convex
and non-convex objective functions. To mitigate data heterogeneity, we
introduce a local gradient tracking scheme and obtain sharp convergence rates
that match the best-known communication complexities without compression for
convex, strongly convex, and nonconvex settings. We complement our theoretical
results and demonstrate the effectiveness of our proposed methods by several
experiments on real-world datasets.
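To make the communication pattern concrete, below is a minimal Python sketch of local SGD with periodic compressed communication on a toy quadratic problem. It illustrates the generic scheme the abstract describes, not the paper's exact algorithms or hyperparameters; the quantizer, the synthetic client data, and the step sizes are illustrative assumptions.

```python
# Minimal sketch: each client runs several local SGD steps, then sends a
# quantized model update to the server, which averages the compressed updates.
import numpy as np

rng = np.random.default_rng(0)


def stochastic_quantize(v, levels=4):
    """Unbiased stochastic quantizer (QSGD-style): keeps the norm and signs,
    randomly rounds magnitudes onto a uniform grid with `levels` levels."""
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return v.copy()
    scaled = np.abs(v) / norm * levels
    lower = np.floor(scaled)
    prob = scaled - lower
    rounded = lower + (rng.random(v.shape) < prob)
    return np.sign(v) * rounded * norm / levels


# Toy heterogeneous least-squares problems, one per client (illustrative data).
num_clients, dim, local_steps, rounds, lr = 4, 10, 5, 50, 0.05
A = [rng.normal(size=(20, dim)) for _ in range(num_clients)]
b = [rng.normal(size=20) for _ in range(num_clients)]

x = np.zeros(dim)  # global model held by the server
for r in range(rounds):
    deltas = []
    for c in range(num_clients):
        x_local = x.copy()
        for _ in range(local_steps):
            i = rng.integers(len(b[c]))                      # sample one data point
            grad = (A[c][i] @ x_local - b[c][i]) * A[c][i]   # stochastic gradient
            x_local -= lr * grad
        # Communicate only a compressed version of the local update.
        deltas.append(stochastic_quantize(x_local - x))
    x += np.mean(deltas, axis=0)  # server averages the compressed updates

loss = np.mean([np.mean((A[c] @ x - b[c]) ** 2) for c in range(num_clients)])
print(f"final average loss: {loss:.4f}")
```

Because the quantizer is unbiased, the average of the compressed updates is an unbiased estimate of the average local update, which is the kind of property convergence analyses of compressed periodic-communication schemes typically rely on.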
Related papers
- Non-convex composite federated learning with heterogeneous data [10.14896454396227]
We propose an innovative algorithm for non-convex composite federated learning that decouples the proximal operator evaluation and the communication between server and client.
We demonstrate the superiority of our algorithm over state-of-the-art methods on both synthetic and real datasets.
arXiv Detail & Related papers (2025-02-06T10:49:03Z) - Accelerated Methods with Compressed Communications for Distributed Optimization Problems under Data Similarity [55.03958223190181]
We propose the first theoretically grounded accelerated algorithms utilizing unbiased and biased compression under data similarity.
Our results set new records and are confirmed by experiments on different losses and datasets.
arXiv Detail & Related papers (2024-12-21T00:40:58Z) - Non-Convex Optimization in Federated Learning via Variance Reduction and Adaptive Learning [13.83895180419626]
This paper proposes a novel algorithm that leverages momentum-based variance reduction with adaptive learning to address non-convex settings across heterogeneous data.
We aim to overcome challenges related to gradient variance, which hinders efficiency, and the slow convergence caused by learning-rate adjustments under heterogeneous data.
arXiv Detail & Related papers (2024-12-16T11:02:38Z) - Analysis of regularized federated learning [8.489782750973005]
Federated learning is an efficient tool for dealing with heterogeneous big data and privacy protection.
Loop descent is often used when implementing such methods on big data, to reduce communication costs.
arXiv Detail & Related papers (2024-11-03T12:47:54Z) - Distributed Event-Based Learning via ADMM [11.461617927469316]
We consider a global distributed learning problem, where agents minimize an objective function by exchanging information over a network.
Our approach has two distinct features: (i) it substantially reduces communication by triggering communication only when necessary, and (ii) it is agnostic to the data distribution among the different agents.
arXiv Detail & Related papers (2024-05-17T08:30:28Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting them in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes.
We propose a metric that quantifies the ability of a graph to mix the current gradients.
Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z) - Resource-constrained Federated Edge Learning with Heterogeneous Data:
Formulation and Analysis [8.863089484787835]
We propose a distributed approximate Newton-type training scheme, namely FedOVA, to address the statistical challenge brought by heterogeneous data.
FedOVA decomposes a multi-class classification problem into more straightforward binary classification problems and then combines their respective outputs using ensemble learning.
arXiv Detail & Related papers (2021-10-14T17:35:24Z)
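The FedOVA summary above describes a one-vs-all decomposition whose binary outputs are combined by ensembling. The following minimal, single-machine Python sketch illustrates only that decomposition-and-combination step on synthetic data; the logistic-regression base learners, the toy Gaussian blobs, and the argmax combination rule are illustrative assumptions, not the paper's federated training procedure.

```python
# Minimal sketch: split a 3-class problem into one-vs-all binary problems,
# train one binary classifier per class, and combine their scores by argmax.
import numpy as np

rng = np.random.default_rng(1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_binary(X, y01, lr=0.1, epochs=300):
    """Plain gradient-descent logistic regression for one binary sub-problem."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y01) / len(y01)
    return w


# Synthetic 3-class data: one Gaussian blob per class (illustrative only).
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
n_per_class = 50
X = np.vstack([rng.normal(loc=m, size=(n_per_class, 2)) for m in means])
y = np.repeat(np.arange(len(means)), n_per_class)
Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias feature

# One-vs-all decomposition: one binary classifier per class.
weights = [train_binary(Xb, (y == k).astype(float)) for k in range(len(means))]

# Ensemble combination: predict the class whose binary classifier scores highest.
scores = np.column_stack([Xb @ w for w in weights])
accuracy = np.mean(scores.argmax(axis=1) == y)
print(f"training accuracy: {accuracy:.2f}")
```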