Painless Federated Learning: An Interplay of Line-Search and Extrapolation
- URL: http://arxiv.org/abs/2408.17145v2
- Date: Sun, 26 Oct 2025 06:43:20 GMT
- Title: Painless Federated Learning: An Interplay of Line-Search and Extrapolation
- Authors: Geetika, Somya Tyagi, Bapi Chatterjee
- Abstract summary: We introduce the Federated Stochastic Line Search (FedSLS) algorithm and show that it achieves deterministic rates in expectation. FedSLS offers linear convergence for strongly convex objectives even with partial client participation.
- Score: 0.6193838300896447
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The classical line search for learning rate (LR) tuning in the stochastic gradient descent (SGD) algorithm can tame the convergence slowdown due to data-sampling noise. In a federated setting, where client heterogeneity introduces a further slowdown to the global convergence, line search can be adapted accordingly. In this work, we show that a stochastic variant of line search tames the slowdown due to heterogeneity in federated optimization in addition to that due to client-local gradient noise. To this end, we introduce the Federated Stochastic Line Search (FedSLS) algorithm and show that it achieves deterministic rates in expectation. Specifically, FedSLS offers linear convergence for strongly convex objectives even with partial client participation. Recently, extrapolation of the server's LR has shown promise for improved empirical performance in federated learning. To benefit from extrapolation, we extend FedSLS to Federated Extrapolated Stochastic Line Search (FedExpSLS) and prove its convergence. Our extensive empirical results show that the proposed methods perform on par with or better than popular federated learning algorithms across many convex and non-convex problems.
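To make the mechanism concrete, here is a minimal numpy sketch of the two ingredients the abstract describes: a backtracking Armijo line search on a client's mini-batch loss, and a server step that averages client updates and extrapolates them with a factor gamma > 1. All names and defaults are illustrative, not the paper's implementation.

```python
import numpy as np

def stochastic_armijo_step(w, loss, grad, eta_max=1.0, beta=0.9, c=0.5):
    """Backtracking Armijo line search on a (mini-batch) loss: shrink eta
    until loss(w - eta*g) <= loss(w) - c * eta * ||g||^2 holds."""
    g = grad(w)
    eta = eta_max
    while loss(w - eta * g) > loss(w) - c * eta * np.dot(g, g):
        eta *= beta
    return w - eta * g

def server_round(w_global, client_steps, gamma=1.5):
    """One federated round: clients take line-search steps locally; the
    server averages their updates and extrapolates with gamma > 1
    (gamma = 1 recovers plain FedAvg-style averaging)."""
    deltas = [step(w_global) - w_global for step in client_steps]
    return w_global + gamma * np.mean(deltas, axis=0)

# toy usage with two quadratic clients f_i(w) = 0.5 * ||w - a_i||^2
anchors = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
clients = [lambda w, a=a: stochastic_armijo_step(
    w, loss=lambda v: 0.5 * np.sum((v - a) ** 2),
    grad=lambda v: v - a) for a in anchors]
w = np.zeros(2)
for _ in range(20):
    w = server_round(w, clients)   # converges to the mean of the anchors
```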
Related papers
- FedDuA: Doubly Adaptive Federated Learning [2.6108066206600555]
Federated learning is a distributed learning framework where clients collaboratively train a global model without sharing their raw data. We formalize the central server's optimization procedure through the lens of mirror descent and propose a novel framework, called FedDuA. We prove that our proposed doubly adaptive step-size rule is minimax optimal and provide a convergence analysis for convex objectives.
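For intuition, the mirror-descent view treats the negated average client update as a pseudo-gradient at the server. The sketch below uses the Euclidean mirror map and an AdaGrad-style per-coordinate accumulator purely as an illustration; the paper's doubly adaptive step-size rule is more elaborate.

```python
import numpy as np

def mirror_descent_server(w, client_deltas, sq_accum, eps=1e-8):
    """Server round as (Euclidean) mirror descent on the pseudo-gradient,
    with an illustrative per-coordinate adaptive step size."""
    g = -np.mean(client_deltas, axis=0)        # pseudo-gradient
    sq_accum = sq_accum + g * g                # adaptive accumulator
    w_new = w - g / (np.sqrt(sq_accum) + eps)
    return w_new, sq_accum
```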
arXiv Detail & Related papers (2025-05-16T11:15:27Z) - Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates reliance on the client-server architecture.
Non-smooth regularization is often incorporated into machine learning tasks.
We propose a novel DNCFL algorithm to solve these problems.
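The gradient-tracking mechanism named in the title can be sketched as follows: each node mixes its iterate and a tracker variable with a doubly stochastic weight matrix W, and the tracker follows the network-wide average gradient without any central server. This is the generic smooth, unregularized template, not the paper's full composite algorithm.

```python
import numpy as np

def gradient_tracking_step(X, Y, grads, W, eta=0.05):
    """One decentralized gradient-tracking step.
    X: (nodes, dim) iterates; Y: (nodes, dim) gradient trackers,
    initialized to the local gradients at the starting iterates."""
    G_old = np.stack([g(x) for g, x in zip(grads, X)])
    X_new = W @ X - eta * Y                   # mix iterates, descend on tracker
    G_new = np.stack([g(x) for g, x in zip(grads, X_new)])
    Y_new = W @ Y + G_new - G_old             # track the average gradient
    return X_new, Y_new
```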
arXiv Detail & Related papers (2025-04-17T08:32:25Z) - Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models.
Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z) - Pursuing Overall Welfare in Federated Learning through Sequential Decision Making [10.377683220196873]
In traditional federated learning, a single global model cannot perform equally well for all clients.
Our work reveals that existing fairness-aware aggregation strategies can be unified into an online convex optimization framework.
AAggFF achieves a better degree of client-level fairness than existing methods in both practical settings.
arXiv Detail & Related papers (2024-05-31T14:15:44Z) - FedCAda: Adaptive Client-Side Optimization for Accelerated and Stable Federated Learning [57.38427653043984]
Federated learning (FL) has emerged as a prominent approach for collaborative training of machine learning models across distributed clients.
We introduce FedCAda, an innovative federated client adaptive algorithm designed to tackle this challenge.
We demonstrate that FedCAda outperforms the state-of-the-art methods in terms of adaptability, convergence, stability, and overall performance.
arXiv Detail & Related papers (2024-05-20T06:12:33Z) - Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training.
Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}(\ln(T)/T^{1-\frac{1}{\alpha}})$.
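As a reference point for the AdaGrad variant, here is the standard server-side federated AdaGrad update on the averaged client delta; over-the-air aggregation would replace the exact mean with a noisy analog sum, which is omitted here. Names and defaults are illustrative.

```python
import numpy as np

def fed_adagrad_server(w, client_deltas, v, eta=0.1, eps=1e-3):
    """Server-side AdaGrad: treat the averaged client update as a
    pseudo-gradient and scale each coordinate by its accumulated magnitude."""
    delta = np.mean(client_deltas, axis=0)   # over-the-air would add noise here
    v = v + delta * delta
    return w + eta * delta / (np.sqrt(v) + eps), v
```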
arXiv Detail & Related papers (2024-03-11T09:10:37Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linear interpolation as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear interpolation can help by leveraging the theory of nonexpansive operators.
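The nonexpansive-operator argument can be seen in a few lines: plain fixed-point iteration on a nonexpansive map can cycle forever, while the linearly interpolated (Krasnosel'skii-Mann) iteration converges. A toy sketch:

```python
import numpy as np

def km_iteration(T, x0, lam=0.5, iters=100):
    """Krasnosel'skii-Mann iteration: linearly interpolate between the
    current iterate and the operator output. For a nonexpansive T this
    converges to a fixed point even when plain iteration x <- T(x) cycles."""
    x = x0
    for _ in range(iters):
        x = (1 - lam) * x + lam * T(x)
    return x

# toy usage: T is a rotation by 90 degrees (nonexpansive; plain iteration
# orbits forever), yet the averaged iteration converges to the fixed point 0
R = np.array([[0.0, -1.0], [1.0, 0.0]])
x_star = km_iteration(lambda x: R @ x, np.array([1.0, 1.0]))
```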
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - FedZeN: Towards superlinear zeroth-order federated learning via
incremental Hessian estimation [1.45179582111722]
We design the first federated zeroth-order algorithm to estimate the curvature of the global objective.
We take an incremental Hessian estimator whose error norm converges linearly, and we adapt it to the federated zeroth-order setting.
We provide a theoretical analysis of our algorithm, named FedZeN, proving local quadratic convergence with high probability and global linear convergence up to zeroth-order precision.
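The zeroth-order ingredients can be sketched with finite differences: a gradient estimate from function evaluations along random directions, plus a rank-one curvature-matching update as a stand-in for the paper's incremental Hessian estimator. Parameter names are illustrative.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4, num_dirs=20, rng=np.random.default_rng(0)):
    """Zeroth-order gradient estimate: central finite differences along
    random Gaussian directions, using function evaluations only."""
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.standard_normal(x.size)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_dirs

def curvature_update(H, f, x, mu=1e-3, rng=np.random.default_rng(1)):
    """Rank-one update matching the finite-difference curvature along a
    random direction (a stand-in for the paper's incremental estimator)."""
    u = rng.standard_normal(x.size)
    curv = (f(x + mu * u) - 2 * f(x) + f(x - mu * u)) / mu**2
    return H + (curv - u @ H @ u) / (u @ u)**2 * np.outer(u, u)
```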
arXiv Detail & Related papers (2023-09-29T12:13:41Z) - FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup
for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
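The client-side mechanism is an AMSGrad-style update in which every client maintains its own second-moment state, so its effective learning rate eta / sqrt(v_hat) is auto-tuned from its local gradient history. A minimal sketch; FedLALR's actual scheduling and synchronization details differ.

```python
import numpy as np

class ClientAMSGrad:
    """Per-client AMSGrad state: each client auto-tunes its own effective
    learning rate eta / sqrt(v_hat) from its local gradient history."""
    def __init__(self, dim, eta=0.01, b1=0.9, b2=0.999, eps=1e-8):
        self.m, self.v, self.vhat = np.zeros(dim), np.zeros(dim), np.zeros(dim)
        self.eta, self.b1, self.b2, self.eps = eta, b1, b2, eps

    def step(self, w, g):
        self.m = self.b1 * self.m + (1 - self.b1) * g
        self.v = self.b2 * self.v + (1 - self.b2) * g * g
        self.vhat = np.maximum(self.vhat, self.v)   # AMSGrad max trick
        return w - self.eta * self.m / (np.sqrt(self.vhat) + self.eps)
```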
arXiv Detail & Related papers (2023-09-18T12:35:05Z) - Locally Adaptive Federated Learning [30.19411641685853]
Federated learning is a paradigm of distributed machine learning in which multiple clients coordinate with a central server to learn a model.
Standard federated optimization methods such as Federated Averaging (FedAvg) ensure generalization among the clients.
We propose locally adaptive federated learning algorithms that leverage the local geometric information of each client's loss function.
arXiv Detail & Related papers (2023-07-12T17:02:32Z) - Ordering for Non-Replacement SGD [7.11967773739707]
We seek to find an ordering that can improve the convergence rates for the non-replacement form of the algorithm.
We develop optimal orderings for constant and decreasing step sizes for strongly convex and convex functions.
In addition, we are able to combine the ordering with mini-batching and further apply it to more complex neural networks.
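Concretely, non-replacement SGD fixes a permutation and visits every example exactly once per epoch; the choice of that permutation is what the paper optimizes. The ordering heuristic below (larger initial gradient norm first) is only a stand-in for the paper's derived orderings.

```python
import numpy as np

def epoch_without_replacement(w, grads, order, eta=0.1):
    """One epoch of non-replacement SGD: visit every example exactly once,
    in the given order (a permutation of the dataset indices)."""
    for i in order:
        w = w - eta * grads[i](w)
    return w

def norm_descending_order(w0, grads):
    """Illustrative heuristic: process examples with larger initial
    gradient norm first."""
    norms = [np.linalg.norm(g(w0)) for g in grads]
    return np.argsort(norms)[::-1]
```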
arXiv Detail & Related papers (2023-06-28T00:46:58Z) - Stochastic Unrolled Federated Learning [85.6993263983062]
We introduce Stochastic UnRolled Federated learning (SURF), a method that expands algorithm unrolling to federated learning.
Our proposed method tackles two challenges of this expansion, namely the need to feed whole datasets to the unrolled optimizers and the decentralized nature of federated learning.
arXiv Detail & Related papers (2023-05-24T17:26:22Z) - FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted
Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z) - Federated Learning as Variational Inference: A Scalable Expectation
Propagation Approach [66.9033666087719]
This paper extends the inference view and describes a variational inference formulation of federated learning.
We apply FedEP on standard federated learning benchmarks and find that it outperforms strong baselines in terms of both convergence speed and accuracy.
arXiv Detail & Related papers (2023-02-08T17:58:11Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Gradient Masked Averaging for Federated Learning [24.687254139644736]
Federated learning allows a large number of clients with heterogeneous data to coordinate learning of a unified global model.
Standard FL algorithms involve averaging of model parameters or gradient updates to approximate the global model at the server.
We propose a gradient masked averaging approach for FL as an alternative to the standard averaging of client updates.
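A sketch of the masking idea: measure per-coordinate sign agreement across client updates and zero out coordinates on which heterogeneous clients disagree, so only directions consistent across clients move the global model. The hard mask and threshold below are illustrative; the paper also considers soft variants.

```python
import numpy as np

def gradient_masked_average(client_updates, tau=0.4):
    """Average client updates, masking coordinates with low sign agreement:
    agreement is |mean of signs| in [0, 1]; coordinates below the
    threshold tau are zeroed out before averaging."""
    U = np.stack(client_updates)                  # (num_clients, dim)
    agreement = np.abs(np.mean(np.sign(U), axis=0))
    mask = (agreement >= tau).astype(U.dtype)
    return mask * U.mean(axis=0)
```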
arXiv Detail & Related papers (2022-01-28T08:42:43Z) - Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and
Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
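For reference, Random Reshuffling draws a fresh permutation every epoch, so gradients within an epoch are sampled without replacement; this is the sampling scheme whose minibatch and local variants the paper analyzes. A minimal sketch:

```python
import numpy as np

def sgd_random_reshuffling(w, grads, epochs=10, eta=0.1,
                           rng=np.random.default_rng(0)):
    """Random Reshuffling: a fresh permutation each epoch, one step per
    example, so every example is visited exactly once per epoch
    (unlike i.i.d. with-replacement sampling)."""
    n = len(grads)
    for _ in range(epochs):
        for i in rng.permutation(n):
            w = w - eta * grads[i](w)
    return w
```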
arXiv Detail & Related papers (2021-10-20T02:25:25Z) - Achieving Linear Convergence in Federated Learning under Objective and
Systems Heterogeneity [24.95640915217946]
FedLin is a simple, new algorithm that exploits past gradients and employs client-specific learning rates.
We show that FedLin is the only approach that is able to match centralized convergence rates (up to constants) for smooth strongly convex, convex, and non-convex loss functions.
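The two ingredients named above can be sketched together: each client corrects its local gradient with the difference between the global average gradient and its own gradient at the round's starting point, and uses its own step size. This is a schematic of the gradient-correction idea, not FedLin verbatim.

```python
import numpy as np

def fedlin_round(w_global, client_grads, etas, local_steps=5):
    """One round: clients run corrected local steps with client-specific
    step sizes eta_i, anchored to the global gradient at the round start."""
    g_bar = np.mean([g(w_global) for g in client_grads], axis=0)
    new_ws = []
    for g, eta in zip(client_grads, etas):
        w, g_anchor = w_global.copy(), g(w_global)
        for _ in range(local_steps):
            w = w - eta * (g(w) - g_anchor + g_bar)   # variance-reduced step
        new_ws.append(w)
    return np.mean(new_ws, axis=0)
```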
arXiv Detail & Related papers (2021-02-14T02:47:35Z) - On The Verification of Neural ODEs with Stochastic Guarantees [14.490826225393096]
We show that Neural ODEs, an emerging class of time-continuous neural networks, can be verified by solving a set of global-optimization problems.
We introduce Stochastic Lagrangian Reachability (SLR), an abstraction-based technique for constructing a tight reachtube.
arXiv Detail & Related papers (2020-12-16T11:04:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.