Achieving Linear Convergence in Federated Learning under Objective and
Systems Heterogeneity
- URL: http://arxiv.org/abs/2102.07053v1
- Date: Sun, 14 Feb 2021 02:47:35 GMT
- Title: Achieving Linear Convergence in Federated Learning under Objective and
Systems Heterogeneity
- Authors: Aritra Mitra, Rayana Jaafar, George J. Pappas, and Hamed Hassani
- Abstract summary: FedLin is a simple, new algorithm that exploits past gradients and employs client-specific learning rates.
We show that FedLin is the only approach that matches centralized convergence rates (up to constants) for smooth strongly convex, convex, and non-convex loss functions.
- Score: 24.95640915217946
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider a standard federated learning architecture where a group of
clients periodically coordinate with a central server to train a statistical
model. We tackle two major challenges in federated learning: (i) objective
heterogeneity, which stems from differences in the clients' local loss
functions, and (ii) systems heterogeneity, which leads to slow and straggling
client devices. Due to such client heterogeneity, we show that existing
federated learning algorithms suffer from a fundamental speed-accuracy
conflict: they either guarantee linear convergence but to an incorrect point,
or convergence to the global minimum but at a sub-linear rate, i.e., fast
convergence comes at the expense of accuracy.
To address the above limitation, we propose FedLin - a simple, new algorithm
that exploits past gradients and employs client-specific learning rates. When
the clients' local loss functions are smooth and strongly convex, we show that
FedLin guarantees linear convergence to the global minimum. We then establish
matching upper and lower bounds on the convergence rate of FedLin that
highlight the trade-offs associated with infrequent, periodic communication.
Notably, FedLin is the only approach that is able to match centralized
convergence rates (up to constants) for smooth strongly convex, convex, and
non-convex loss functions despite arbitrary objective and systems
heterogeneity. We further show that FedLin preserves linear convergence rates
under aggressive gradient sparsification, and quantify the effect of the
compression level on the convergence rate.
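As an illustration of the mechanisms the abstract describes, the sketch below combines a correction term built from past gradients with client-specific learning rates scaled by each client's local step budget, plus optional top-k gradient sparsification. This is a hedged approximation of the idea, not the paper's exact FedLin update; the function names and the aggregation details are ours.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude coordinates of v; zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def fedlin_style_round(x, grads, taus, eta, k=None):
    """One communication round of a FedLin-style method (illustrative sketch).

    x     : current global model (np.ndarray)
    grads : per-client gradient functions grad_i(w)
    taus  : per-client local step counts (systems heterogeneity)
    eta   : base learning rate
    k     : if set, top-k sparsify each client's update before averaging
    """
    # Global gradient at the current iterate, assembled from all clients'
    # (past) gradients; this anchors the local corrections below.
    g = np.mean([gi(x) for gi in grads], axis=0)
    deltas = []
    for gi, tau in zip(grads, taus):
        y = x.copy()
        for _ in range(tau):
            # Corrected local direction: local gradient, de-biased by the
            # local gradient at x and re-centered on the global gradient.
            d = gi(y) - gi(x) + g
            # Client-specific learning rate: slower clients (small tau)
            # take larger steps, so all clients cover comparable progress.
            y -= (eta / tau) * d
        delta = y - x
        deltas.append(top_k(delta, k) if k else delta)
    return x + np.mean(deltas, axis=0)
```

On quadratic client losses this round contracts toward the average of the clients' minimizers at a linear rate even when the clients take different numbers of local steps, which is the speed-accuracy behavior the abstract claims.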
Related papers
- Local adapt-then-combine algorithms for distributed nonsmooth optimization: Achieving provable communication acceleration [50.67878993903822]
We propose a communication-efficient Adapt-Then-Combine (ATC) framework, FlexATC, unifying numerous ATC-based distributed algorithms. We show for the first time that local updates provably lead to communication acceleration for ATC-based distributed algorithms.
arXiv Detail & Related papers (2026-02-18T02:47:05Z) - FedDuA: Doubly Adaptive Federated Learning [2.6108066206600555]
Federated learning is a distributed learning framework where clients collaboratively train a global model without sharing their raw data. We formalize the central server optimization procedure through the lens of mirror descent and propose a novel framework, called FedDuA. We prove that our proposed doubly adaptive step-size rule is minimax optimal and provide a convergence analysis for convex objectives.
arXiv Detail & Related papers (2025-05-16T11:15:27Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates reliance on the client-server architecture.
Non-smooth regularization is often incorporated into machine learning tasks.
We propose a novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Communication Efficient Federated Learning with Linear Convergence on Heterogeneous Data [4.8305656901807055]
We propose a federated learning algorithm called FedCET to ensure accurate convergence under heterogeneous data distributions.
We prove that under appropriate learning rates, FedCET can ensure linear convergence to the exact solution.
arXiv Detail & Related papers (2025-03-20T02:43:02Z) - Aggregation on Learnable Manifolds for Asynchronous Federated Optimization [3.8208848658169763]
We introduce a geometric framework that casts aggregation as curve learning. Within this, we propose AsyncBezier, which replaces linear aggregation with low-degree curvature components. We show that these gains are preserved even when other methods are allocated a higher local compute budget.
arXiv Detail & Related papers (2025-03-18T16:36:59Z) - Analysis of regularized federated learning [8.489782750973005]
Federated learning is an efficient tool for dealing with heterogeneous big data and privacy protection.
Loop descent is often used for implementing such methods on big data, to reduce communication costs.
arXiv Detail & Related papers (2024-11-03T12:47:54Z) - Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an acceleration Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
arXiv Detail & Related papers (2024-10-09T06:17:16Z) - Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays [0.0]
Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients").
Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift").
We propose and analyze Asynchronous Exact Averaging (AREA), a new (sub)gradient algorithm that utilizes communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies.
arXiv Detail & Related papers (2024-05-16T14:22:49Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}\big(\ln(T) / T^{1-\frac{1}{\alpha}}\big)$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - FedZeN: Towards superlinear zeroth-order federated learning via
incremental Hessian estimation [1.45179582111722]
We design the first federated zeroth-order algorithm to estimate the curvature of the global objective.
We take an incremental Hessian estimator whose error norm converges linearly, and we adapt it to the federated zeroth-order setting.
We provide a theoretical analysis of our algorithm, named FedZeN, proving local quadratic convergence with high probability and global linear convergence up to zeroth-order precision.
arXiv Detail & Related papers (2023-09-29T12:13:41Z) - FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z) - Momentum Benefits Non-IID Federated Learning Simply and Provably [22.800862422479913]
Federated learning is a powerful paradigm for large-scale machine learning.
FedAvg and SCAFFOLD are two prominent algorithms to address these challenges.
This paper explores the utilization of momentum to enhance the performance of FedAvg and SCAFFOLD.
arXiv Detail & Related papers (2023-06-28T18:52:27Z) - Faster Non-Convex Federated Learning via Global and Local Momentum [57.52663209739171]
FedGLOMO is the first (first-order) FL algorithm of its kind.
Our algorithm is provably optimal even with compressed communication between the clients and the server.
arXiv Detail & Related papers (2020-12-07T21:05:31Z) - Tackling the Objective Inconsistency Problem in Heterogeneous Federated
Optimization [93.78811018928583]
This paper provides a framework to analyze the convergence of federated heterogeneous optimization algorithms.
We propose FedNova, a normalized averaging method that eliminates objective inconsistency while preserving fast error convergence.
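A minimal sketch of normalized averaging in this spirit: each client's cumulative update is divided by its local step count before averaging, so clients that take more local steps do not skew the implicit objective. The function name and uniform weighting below are our assumptions for illustration, not FedNova's exact specification.

```python
import numpy as np

def normalized_aggregate(x, client_models, taus, weights=None):
    """Normalized averaging of client updates (illustrative sketch).

    x             : global model before the round (np.ndarray)
    client_models : per-client models y_i after tau_i local steps
    taus          : per-client local step counts
    weights       : aggregation weights p_i (defaults to uniform)
    """
    n = len(client_models)
    p = np.full(n, 1.0 / n) if weights is None else np.asarray(weights)
    # Normalize each client's cumulative update by its local step count,
    # removing the bias toward clients that took more steps.
    d = [(x - y) / tau for y, tau in zip(client_models, taus)]
    # Rescale by the effective number of local steps to keep step sizes
    # comparable to plain averaging.
    tau_eff = float(np.dot(p, taus))
    return x - tau_eff * sum(pi * di for pi, di in zip(p, d))
```

If every client happens to move along the same direction, the normalized directions coincide and the aggregate takes an unbiased step along it, regardless of how unevenly the local step counts are distributed.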
arXiv Detail & Related papers (2020-07-15T05:01:23Z) - A Unified Linear Speedup Analysis of Federated Averaging and Nesterov
FedAvg [49.76940694847521]
Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
arXiv Detail & Related papers (2020-07-11T05:59:08Z) - Federated Learning with Compression: Unified Analysis and Sharp
Guarantees [39.092596142018195]
Communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices.
Two notable trends to deal with the communication overhead of federated algorithms are gradient compression and local computation with periodic communication.
We analyze their convergence in both homogeneous and heterogeneous data distribution settings.
arXiv Detail & Related papers (2020-07-02T14:44:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.