SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning
- URL: http://arxiv.org/abs/2405.20127v1
- Date: Thu, 30 May 2024 15:07:30 GMT
- Title: SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning
- Authors: Avetik Karagulyan, Egor Shulgin, Abdurakhmon Sadiev, Peter Richtárik
- Abstract summary: Cross-device training is a subfield of federated learning where the number of clients can reach into the billions.
Standard approaches and local methods are prone to issues such as client drift and insensitivity to data similarities.
Our method is the first of its kind: it does not require smoothness of the objective and provably benefits from clients having similar data.
- Score: 48.072207894076556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-device training is a crucial subfield of federated learning, where the number of clients can reach into the billions. Standard approaches and local methods are prone to issues such as client drift and insensitivity to data similarities. We propose a novel algorithm (SPAM) for cross-device federated learning with non-convex losses, which solves both issues. We provide a sharp analysis under second-order (Hessian) similarity, a condition satisfied by a variety of machine learning problems in practice. Additionally, we extend our results to the partial participation setting, where a cohort of selected clients communicates with the server at each communication round. Our method is the first of its kind: it does not require smoothness of the objective and provably benefits from clients having similar data.
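The two ingredients named in the title, a stochastic proximal point step and momentum variance reduction, have standard textbook forms; the sketch below records those forms together with one common way of writing second-order similarity. The step size $\gamma$, momentum parameter $a$, and similarity constant $\delta$ are generic placeholders, not the paper's notation, and the paper's exact recursion may differ.

```latex
% Textbook forms of the ingredients named in the title (placeholder notation).
\begin{align*}
  % Stochastic proximal point step on the sampled client loss f_{i_t}:
  x_{t+1} &= \operatorname{prox}_{\gamma f_{i_t}}(x_t)
          = \arg\min_{x}\Big\{ f_{i_t}(x) + \tfrac{1}{2\gamma}\,\lVert x - x_t\rVert^2 \Big\}, \\
  % Momentum variance reduction (MVR/STORM-style) gradient estimator:
  v_t &= \nabla f_{i_t}(x_t) + (1-a)\,\big(v_{t-1} - \nabla f_{i_t}(x_{t-1})\big),
        \qquad a \in (0,1], \\
  % One common form of second-order (Hessian) similarity:
  \big\lVert \nabla^2 f_i(x) - \nabla^2 f(x) \big\rVert &\le \delta
        \qquad \text{for all clients } i \text{ and all } x.
\end{align*}
```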
Related papers
- Accelerated Stochastic ExtraGradient: Mixing Hessian and Gradient Similarity to Reduce Communication in Distributed and Federated Learning [50.382793324572845]
Distributed computing involves communication between devices, which requires solving two key problems: efficiency and privacy.
In this paper, we analyze a new method that combines the ideas of exploiting data similarity and client sampling.
To address privacy concerns, we apply the technique of additional noise and analyze its impact on the convergence of the proposed method.
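For context, the classical extragradient recursion underlying this method family looks as follows; $\gamma$ and the gradient oracle $g$ are generic placeholders rather than the paper's notation.

```latex
% Classical (Korpelevich-style) extragradient step, in placeholder notation.
\begin{align*}
  \tilde{x}_t &= x_t - \gamma\, g(x_t)          && \text{(extrapolation half-step)} \\
  x_{t+1}     &= x_t - \gamma\, g(\tilde{x}_t)  && \text{(update with the look-ahead gradient)}
\end{align*}
```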
arXiv Detail & Related papers (2024-09-22T00:49:10Z)
- Cohort Squeeze: Beyond a Single Communication Round per Cohort in Cross-Device Federated Learning [51.560590617691005]
We investigate whether it is possible to squeeze "more juice" out of each cohort than is possible in a single communication round.
Our approach leads to up to 74% reduction in the total communication cost needed to train a FL model in the cross-device setting.
arXiv Detail & Related papers (2024-06-03T08:48:49Z)
- Learn What You Need in Personalized Federated Learning [53.83081622573734]
Learn2pFed is a novel algorithm-unrolling-based personalized federated learning framework.
We show that Learn2pFed significantly outperforms previous personalized federated learning methods.
arXiv Detail & Related papers (2024-01-16T12:45:15Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling converges and achieves linear speedup with respect to the number of clients.
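Since FedLALR is described as a local AMSGrad variant with client-specific learning rates, a minimal sketch of one local step is given below; the function name and the way `lr` is supplied are illustrative, and the paper's actual auto-tuned schedule is not reproduced.

```python
import numpy as np

def local_amsgrad_step(w, g, m, v, v_hat, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad step with a client-specific learning rate `lr` (sketch)."""
    m = beta1 * m + (1 - beta1) * g          # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * g * g      # second-moment estimate
    v_hat = np.maximum(v_hat, v)             # AMSGrad keeps a running max of v
    w = w - lr * m / (np.sqrt(v_hat) + eps)  # step scaled by the client's lr
    return w, m, v, v_hat
```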
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Effectively Heterogeneous Federated Learning: A Pairing and Split Learning Based Approach [16.093068118849246]
This paper presents a novel split federated learning (SFL) framework that pairs clients with different computational resources.
A greedy algorithm is proposed that recasts the minimization of training latency as a graph edge selection problem.
Simulation results show the proposed method can significantly improve the FL training speed and achieve high performance.
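As an illustration of the pairing idea, the toy sketch below matches fast clients with slow ones so that no pair is dominated by a very slow member; the pairing rule and the cost model are our simplification, not the paper's exact graph edge selection formulation.

```python
def greedy_pairing(compute_speeds):
    """Pair the fastest remaining client with the slowest remaining one.

    A simplified stand-in for the paper's formulation: a pair's latency is
    driven by its slower member, so balancing speeds within pairs reduces
    overall training latency.
    """
    order = sorted(range(len(compute_speeds)), key=lambda i: compute_speeds[i])
    pairs = []
    lo, hi = 0, len(order) - 1
    while lo < hi:
        pairs.append((order[hi], order[lo]))  # (fast client, slow client)
        lo, hi = lo + 1, hi - 1
    return pairs

# Example: six clients with heterogeneous compute speeds.
print(greedy_pairing([1.0, 5.0, 2.5, 4.0, 0.5, 3.0]))  # [(1, 4), (3, 0), (5, 2)]
```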
arXiv Detail & Related papers (2023-08-26T11:10:54Z)
- Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
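The model split described here, a representation shared across clients plus a per-client head, can be sketched as follows; the layer shapes and nonlinearity are placeholders, and the paper's alternating update scheme is not reproduced.

```python
import numpy as np

class PersonalizedModel:
    """Shared representation + client-specific linear head (sketch)."""

    def __init__(self, shared_weights, head_weights):
        self.shared = shared_weights  # learned jointly from all clients' data
        self.head = head_weights      # learned locally, one head per client

    def predict(self, x):
        representation = np.tanh(self.shared @ x)  # common feature map
        return self.head @ representation          # personalized prediction
```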
arXiv Detail & Related papers (2022-06-05T01:14:46Z)
- Addressing Client Drift in Federated Continual Learning with Adaptive Optimization [10.303676184878896]
We outline a framework for performing Federated Continual Learning (FCL) by using NetTailor as a candidate continual learning approach.
We show that adaptive federated optimization can reduce the adverse impact of client drift and showcase its effectiveness on CIFAR100, MiniImagenet, and Decathlon benchmarks.
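In this line of work, "adaptive federated optimization" typically means running an adaptive optimizer on the server over the averaged client update, as in the FedOpt/FedAdam family; a minimal sketch of one such server step follows (our simplification, not the paper's exact setup).

```python
import numpy as np

def server_adam_step(w, client_deltas, m, v, lr=0.1, beta1=0.9, beta2=0.99, eps=1e-3):
    """FedAdam-style server update on the averaged client delta (sketch)."""
    delta = np.mean(client_deltas, axis=0)  # pseudo-gradient from the cohort
    m = beta1 * m + (1 - beta1) * delta
    v = beta2 * v + (1 - beta2) * delta * delta
    w = w + lr * m / (np.sqrt(v) + eps)     # server applies the adaptive step
    return w, m, v
```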
arXiv Detail & Related papers (2022-03-24T20:00:03Z)
- Speeding up Heterogeneous Federated Learning with Sequentially Trained Superclients [19.496278017418113]
Federated Learning (FL) allows training machine learning models in privacy-constrained scenarios by enabling the cooperation of edge devices without requiring local data sharing.
This approach raises several challenges due to the different statistical distribution of the local datasets and the clients' computational heterogeneity.
We propose FedSeq, a novel framework leveraging the sequential training of subgroups of heterogeneous clients, i.e., superclients, to emulate the centralized paradigm in a privacy-compliant way.
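A minimal sketch of one such round, as we read the description: the model is trained sequentially across the clients inside each superclient, and the resulting superclient models are averaged as in FedAvg. The function names and the aggregation step are illustrative, not the authors' code.

```python
import numpy as np

def fedseq_round(global_w, superclients, local_step):
    """One sequential-superclient round (sketch, not the authors' code)."""
    superclient_models = []
    for group in superclients:              # each group acts as one "superclient"
        w = global_w.copy()
        for client_data in group:           # pass the model client to client
            w = local_step(w, client_data)  # sequential local training
        superclient_models.append(w)
    return np.mean(superclient_models, axis=0)  # FedAvg-style aggregation
```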
arXiv Detail & Related papers (2022-01-26T12:33:23Z)
- Communication-Efficient Agnostic Federated Averaging [39.761808414613185]
In distributed learning settings, the training algorithm can be potentially biased towards different clients.
We propose a communication-efficient distributed algorithm called Agnostic Federated Averaging (or AgnosticFedAvg) to minimize the domain-agnostic objective proposed in Mohri et al.
arXiv Detail & Related papers (2021-04-06T19:01:18Z)
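For reference, the domain-agnostic objective of Mohri et al. (Agnostic Federated Learning, 2019) is the worst-case mixture of the client losses:

```latex
% Agnostic federated objective: minimize the worst-case mixture of the
% K client losses L_k over a set Lambda of mixture weights.
\min_{w}\; \max_{\lambda \in \Lambda}\; \sum_{k=1}^{K} \lambda_k\, L_k(w),
\qquad \Lambda \subseteq \Delta_K .
```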