FLeNS: Federated Learning with Enhanced Nesterov-Newton Sketch
- URL: http://arxiv.org/abs/2409.15216v2
- Date: Tue, 1 Oct 2024 11:20:53 GMT
- Title: FLeNS: Federated Learning with Enhanced Nesterov-Newton Sketch
- Authors: Sunny Gupta, Mohit Jindal, Pankhi Kashyap, Pranav Jeevan, Amit Sethi,
- Abstract summary: We introduce Federated Learning with Enhanced Nesterov-Newton Sketch (FLeNS)
FLeNS approximates the centralized Newton's method without relying on the exact Hessian.
We provide rigorous convergence guarantees and characterize tradeoffs between acceleration, sketch size, and convergence speed.
- Score: 3.4927288761640565
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning faces a critical challenge in balancing communication efficiency with rapid convergence, especially for second-order methods. While Newton-type algorithms achieve linear convergence in communication rounds, transmitting full Hessian matrices is often impractical due to quadratic complexity. We introduce Federated Learning with Enhanced Nesterov-Newton Sketch (FLeNS), a novel method that harnesses both the acceleration capabilities of Nesterov's method and the dimensionality reduction benefits of Hessian sketching. FLeNS approximates the centralized Newton's method without relying on the exact Hessian, significantly reducing communication overhead. By combining Nesterov's acceleration with adaptive Hessian sketching, FLeNS preserves crucial second-order information while preserving the rapid convergence characteristics. Our theoretical analysis, grounded in statistical learning, demonstrates that FLeNS achieves super-linear convergence rates in communication rounds - a notable advancement in federated optimization. We provide rigorous convergence guarantees and characterize tradeoffs between acceleration, sketch size, and convergence speed. Extensive empirical evaluation validates our theoretical findings, showcasing FLeNS's state-of-the-art performance with reduced communication requirements, particularly in privacy-sensitive and edge-computing scenarios. The code is available at https://github.com/sunnyinAI/FLeNS
Related papers
- Sketched Adaptive Federated Deep Learning: A Sharp Convergence Analysis [7.303912285452846]
We introduce specific sketched adaptive federated learning (SAFL) algorithms with guarantees on communication cost depending only logarithmically (instead of linearly) on the ambient dimension.
Our theoretical claims are supported by empirical studies on vision and language tasks, and in both fine-tuning and training-from-scratch.
Surprisingly, the proposed SAFL methods are competitive with the state-of-the-art communication-efficient federated learning algorithms based on error feedback.
arXiv Detail & Related papers (2024-11-11T07:51:22Z) - Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an acceleration Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
arXiv Detail & Related papers (2024-10-09T06:17:16Z) - FedNS: A Fast Sketching Newton-Type Algorithm for Federated Learning [27.957498393822338]
We introduce a novel approach to tackle this issue while still achieving fast convergence rates.
Our proposed method, named as Federated Newton Sketch methods (FedNS), approximates the centralized Newton's method by communicating the sketched square-root Hessian instead of the exact Hessian.
We provide convergence analysis based on statistical learning for the federated Newton sketch approaches.
arXiv Detail & Related papers (2024-01-05T10:06:41Z) - Rethinking SIGN Training: Provable Nonconvex Acceleration without First-
and Second-Order Gradient Lipschitz [66.22095739795068]
Sign-based methods have gained attention due to their ability to achieve robust performance despite only using only the sign information for parameter updates.
The current convergence analysis of sign-based methods relies on the strong assumptions of first-order acceleration and second-order acceleration.
In this paper we analyze their convergence under more realistic assumptions of first- and second-order acceleration.
arXiv Detail & Related papers (2023-10-23T06:48:43Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on Federated learning (FL) via edge-the-air computation (AirComp)
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non- convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - FedNew: A Communication-Efficient and Privacy-Preserving Newton-Type
Method for Federated Learning [75.46959684676371]
We introduce a novel framework called FedNew in which there is no need to transmit Hessian information from clients to PS.
FedNew hides the gradient information and results in a privacy-preserving approach compared to the existing state-of-the-art.
arXiv Detail & Related papers (2022-06-17T15:21:39Z) - Fast Federated Learning by Balancing Communication Trade-Offs [9.89867121050673]
Federated Learning (FL) has recently received a lot of attention for large-scale privacy-preserving machine learning.
High communication overheads due to frequent gradient transmissions decelerate FL.
We propose an enhanced FL scheme, namely Fast FL (FFL), that jointly and dynamically adjusts the two variables to minimize the learning error.
arXiv Detail & Related papers (2021-05-23T21:55:14Z) - Stragglers Are Not Disaster: A Hybrid Federated Learning Algorithm with
Delayed Gradients [21.63719641718363]
Federated learning (FL) is a new machine learning framework which trains a joint model across a large amount of decentralized computing devices.
This paper presents a novel FL algorithm, namely Hybrid Federated Learning (HFL), to achieve a learning balance in efficiency and effectiveness.
arXiv Detail & Related papers (2021-02-12T02:27:44Z) - DONE: Distributed Approximate Newton-type Method for Federated Edge
Learning [41.20946186966816]
DONE is a distributed approximate Newton-type algorithm with fast convergence rate.
We show DONE attains a comparable performance to the Newton's method.
arXiv Detail & Related papers (2020-12-10T12:25:34Z) - A Unified Linear Speedup Analysis of Federated Averaging and Nesterov
FedAvg [49.76940694847521]
Federated learning (FL) learns a model jointly from a set of participating devices without sharing each other's privately held data.
In this paper, we focus on Federated Averaging (FedAvg), one of the most popular and effective FL algorithms in use today.
We show that FedAvg enjoys linear speedup in each case, although with different convergence rates and communication efficiencies.
arXiv Detail & Related papers (2020-07-11T05:59:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.