Related papers: Faster Federated Learning with Decaying Number of Local SGD Steps

URL: http://arxiv.org/abs/2305.09628v1
Date: Tue, 16 May 2023 17:36:34 GMT
Title: Faster Federated Learning with Decaying Number of Local SGD Steps
Authors: Jed Mills, Jia Hu, Geyong Min
Abstract summary: In Federated Learning (FL), client devices collaboratively train a machine learning model without sharing their private data with a central server or with other clients. In this work we propose decaying $K$ as training progresses, which can jointly improve the final performance of the FL model.
Score: 23.447883712141422
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In Federated Learning (FL) client devices connected over the internet collaboratively train a machine learning model without sharing their private data with a central server or with other clients. The seminal Federated Averaging (FedAvg) algorithm trains a single global model by performing rounds of local training on clients followed by model averaging. FedAvg can improve the communication-efficiency of training by performing more steps of Stochastic Gradient Descent (SGD) on clients in each round. However, client data in real-world FL is highly heterogeneous, which has been extensively shown to slow model convergence and harm final performance when $K > 1$ steps of SGD are performed on clients per round. In this work we propose decaying $K$ as training progresses, which can jointly improve the final performance of the FL model whilst reducing the wall-clock time and the total computational cost of training compared to using a fixed $K$. We analyse the convergence of FedAvg with decaying $K$ for strongly-convex objectives, providing novel insights into the convergence properties, and derive three theoretically-motivated decay schedules for $K$. We then perform thorough experiments on four benchmark FL datasets (FEMNIST, CIFAR100, Sentiment140, Shakespeare) to show the real-world benefit of our approaches in terms of real-world convergence time, computational cost, and generalisation performance.
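
The idea described in the abstract (FedAvg rounds of $K$ local SGD steps followed by model averaging, with $K$ shrinking as training progresses) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the 1-D quadratic client losses, the function names, and in particular the exponential decay rule in `decayed_k`, which is a hypothetical schedule and not one of the three schedules derived in the paper.

```python
import random


def local_sgd(w, data, k_steps, lr, rng):
    """Run k_steps of SGD on the toy loss 0.5 * (w - x)^2 over one client's data."""
    for _ in range(k_steps):
        x = rng.choice(data)   # sample one local data point
        w -= lr * (w - x)      # gradient of 0.5 * (w - x)^2 with respect to w
    return w


def decayed_k(k0, round_idx, decay=0.9, k_min=1):
    """Hypothetical schedule: exponential decay of K, floored at k_min."""
    return max(k_min, round(k0 * decay ** round_idx))


def fedavg_decaying_k(client_data, rounds=50, k0=20, lr=0.1, seed=0):
    """FedAvg on toy 1-D data where the number of local steps K decays per round."""
    rng = random.Random(seed)
    w_global = 0.0
    total_steps = 0
    for r in range(rounds):
        k = decayed_k(k0, r)
        # Each client starts from the current global model and trains locally.
        local_models = [local_sgd(w_global, d, k, lr, rng) for d in client_data]
        # Server step: unweighted average of the client models.
        w_global = sum(local_models) / len(local_models)
        total_steps += k * len(client_data)
    return w_global, total_steps


if __name__ == "__main__":
    # Heterogeneous clients: each holds data centred on a different value.
    clients = [[0.0, 1.0], [4.0, 5.0], [9.0, 11.0]]
    w, steps = fedavg_decaying_k(clients)
    print(f"global model: {w:.3f}, total local SGD steps: {steps}")
```

In this toy setting the minimiser of the average client loss is the mean of the client means (5.0). The total number of local steps is lower than it would be with a fixed $K$ equal to `k0`, which mirrors the computational saving the abstract refers to; the sketch makes no claim about the paper's convergence results.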

Related papers

TRAIL: Trust-Aware Client Scheduling for Semi-Decentralized Federated Learning [13.144501509175985]
We propose a TRust-Aware clIent scheduLing mechanism called TRAIL, which assesses client states and contributions. We focus on a semi-decentralized FL framework where edge servers and clients train a shared global model using unreliable intra-cluster model aggregation and inter-cluster model consensus. Experiments conducted on real-world datasets demonstrate that TRAIL outperforms state-of-the-art baselines, achieving an improvement of 8.7% in test accuracy and a reduction of 15.3% in training loss.
arXiv Detail & Related papers (2024-12-16T05:02:50Z)
An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets. Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round. We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
Achieving Linear Speedup in Asynchronous Federated Learning with Heterogeneous Clients [30.135431295658343]
Federated learning (FL) aims to learn a common global model without exchanging or transferring the data that are stored locally at different clients. In this paper, we propose an efficient asynchronous federated learning (AFL) framework called DeFedAvg. DeFedAvg is the first AFL algorithm that achieves the desirable linear speedup property, which indicates its high scalability.
arXiv Detail & Related papers (2024-02-17T05:22:46Z)
Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training. In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework. Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z)
Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms [29.636944156801327]
Multiple clients collaboratively train one global model without sharing their semantic parsing data. Lorar adjusts each client's contribution to the global model update based on its training loss reduction during each round. Clients with smaller datasets enjoy larger performance gains.
arXiv Detail & Related papers (2023-05-26T19:25:49Z)
Federated Learning under Heterogeneous and Correlated Client Availability [10.05687757555923]
This paper presents the first convergence analysis for a FedAvg-like FL algorithm under heterogeneous and correlated client availability. We propose CA-Fed, a new FL algorithm that tries to balance the conflicting goals of maximizing convergence speed and minimizing model bias. Our experimental results show that CA-Fed achieves higher time-average accuracy and a lower standard deviation than state-of-the-art AdaFed and F3AST.
arXiv Detail & Related papers (2023-01-11T18:38:48Z)
Aergia: Leveraging Heterogeneity in Federated Learning Systems [5.0650178943079]
Federated Learning (FL) relies on clients to update a global model using their local datasets. Aergia is a novel approach where slow clients freeze the part of their model that is the most computationally intensive to train. Aergia significantly reduces the training time under heterogeneous settings by up to 27% and 53% compared to FedAvg and TiFL, respectively.
arXiv Detail & Related papers (2022-10-12T12:59:18Z)
Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy. We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage. Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
Dynamic Attention-based Communication-Efficient Federated Learning [85.18941440826309]
Federated learning (FL) offers a solution to train a global machine learning model. FL suffers performance degradation when client data distribution is non-IID. We propose a new adaptive training algorithm $\texttt{AdaFL}$ to combat this degradation.
arXiv Detail & Related papers (2021-08-12T14:18:05Z)
Towards Fair Federated Learning with Zero-Shot Data Augmentation [123.37082242750866]
Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data. We propose a novel federated learning system that employs zero-shot data augmentation on under-represented data to mitigate statistical heterogeneity and encourage more uniform accuracy performance across clients in federated networks. We study two variants of this scheme, Fed-ZDAC (federated learning with zero-shot data augmentation at the clients) and Fed-ZDAS (federated learning with zero-shot data augmentation at the server).
arXiv Detail & Related papers (2021-04-27T18:23:54Z)
Distributed Non-Convex Optimization with Sublinear Speedup under Intermittent Client Availability [46.85205907718874]
Federated learning is a new machine learning framework, where a bunch of clients collaboratively train a model without sharing training data. In this work, we consider a practical issue when deploying federated learning in mobile environments: intermittent client availability. We propose a simple distributed non-convex optimization algorithm, called Federated Latest Averaging (FedLaAvg for short). Our theoretical analysis shows that FedLaAvg attains the convergence rate of $O(E^{1/2}/(N^{1/4} T^{1/2}))$, achieving a sublinear speedup with respect to the total number of clients.
arXiv Detail & Related papers (2020-02-18T06:32:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.