When MiniBatch SGD Meets SplitFed Learning: Convergence Analysis and Performance Evaluation
- URL: http://arxiv.org/abs/2308.11953v1
- Date: Wed, 23 Aug 2023 06:51:22 GMT
- Title: When MiniBatch SGD Meets SplitFed Learning: Convergence Analysis and Performance Evaluation
- Authors: Chao Huang, Geng Tian, Ming Tang
- Abstract summary: Federated learning (FL) enables collaborative model training across distributed clients without sharing raw data.
SplitFed learning (SFL) is a recent distributed approach that alleviates computation workload at the client device by splitting the model at a cut layer into two parts.
MiniBatch-SFL incorporates MiniBatch SGD into SFL, where the clients train the client-side model in an FL fashion while the server trains the server-side model.
- Score: 9.815046814597238
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) enables collaborative model training across
distributed clients (e.g., edge devices) without sharing raw data. Yet, FL can
be computationally expensive as the clients need to train the entire model
multiple times. SplitFed learning (SFL) is a recent distributed approach that
alleviates computation workload at the client device by splitting the model at
a cut layer into two parts, where clients only need to train part of the model.
However, SFL still suffers from the client drift problem when clients'
data are highly non-IID. To address this issue, we propose MiniBatch-SFL. This
algorithm incorporates MiniBatch SGD into SFL, where the clients train the
client-side model in an FL fashion while the server trains the server-side
model as in MiniBatch SGD. We analyze the convergence of MiniBatch-SFL and
show that a bound on the expected loss can be obtained by analyzing the
expected server-side and client-side model updates separately. The
server-side updates do not depend on the non-IID degree of the clients'
datasets and can potentially mitigate client drift. The client-side updates,
however, do depend on the non-IID degree, and their impact can be reduced by
properly choosing the cut layer. Perhaps counter-intuitively, our empirical
results show that a later position of the cut layer leads to a smaller average
gradient divergence and better algorithm performance. Moreover, numerical results show that
MiniBatch-SFL achieves higher accuracy than conventional SFL and FL. The
accuracy improvement can be up to 24.1% and 17.1% with highly non-IID data,
respectively.
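To make the training procedure concrete, the following is a minimal single-round sketch of MiniBatch-SFL written against PyTorch. The toy MLP, the cut-layer index, the learning rates, and the FedAvg details are illustrative assumptions rather than the authors' reference implementation; the sketch only illustrates that the server updates its model on the concatenation of all clients' cut-layer activations (as in MiniBatch SGD), while the clients update and then average their client-side models (as in FL).

```python
# Minimal single-round sketch of MiniBatch-SFL (PyTorch). The toy model, cut
# position, learning rates, and FedAvg step are illustrative assumptions.
import copy
import torch
import torch.nn as nn

torch.manual_seed(0)
NUM_CLIENTS = 4
CUT_LAYER = 2  # assumed cut position inside the toy MLP below

# Split a toy model at the cut layer into a client-side and a server-side part.
full_model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),   # layers before the cut: client side
    nn.Linear(64, 64), nn.ReLU(),   # layers after the cut: server side
    nn.Linear(64, 10),
)
client_side = nn.Sequential(*list(full_model.children())[:CUT_LAYER])
server_side = nn.Sequential(*list(full_model.children())[CUT_LAYER:])

# Each client holds its own copy of the client-side model (trained FL-style).
client_models = [copy.deepcopy(client_side) for _ in range(NUM_CLIENTS)]
client_opts = [torch.optim.SGD(m.parameters(), lr=0.1) for m in client_models]
server_opt = torch.optim.SGD(server_side.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Synthetic stand-ins for the clients' (possibly non-IID) local minibatches.
client_batches = [(torch.randn(16, 32), torch.randint(0, 10, (16,)))
                  for _ in range(NUM_CLIENTS)]

# --- Client side: forward passes up to the cut layer ("smashed data") -------
server_opt.zero_grad()
for opt in client_opts:
    opt.zero_grad()
activations, labels = [], []
for model, (x, y) in zip(client_models, client_batches):
    activations.append(model(x))  # sent to the server in a real deployment
    labels.append(y)

# --- Server side: one MiniBatch-SGD step on the concatenated batch ----------
# Concatenating every client's activations makes the server update look like
# centralized minibatch SGD, independent of how non-IID the clients are.
loss = loss_fn(server_side(torch.cat(activations)), torch.cat(labels))
loss.backward()  # cut-layer gradients flow back to each client-side model
server_opt.step()

# --- Client side: local SGD steps, then FedAvg of the client-side models ----
for opt in client_opts:
    opt.step()
with torch.no_grad():
    avg_state = client_models[0].state_dict()
    for key in avg_state:
        avg_state[key] = torch.stack(
            [m.state_dict()[key] for m in client_models]).mean(dim=0)
    for m in client_models:
        m.load_state_dict(avg_state)

print(f"round loss: {loss.item():.4f}")
```

In a deployed system, the cut-layer activations and the corresponding gradients would be exchanged over the network, and this round would be repeated over many minibatches.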
Related papers
- Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating the results of local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z) - Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z) - PFSL: Personalized & Fair Split Learning with Data & Label Privacy for
thin clients [0.5144809478361603]
PFSL is a new framework of distributed split learning where a large number of thin clients perform transfer learning in parallel.
We implement a lightweight step of personalization of client models to provide high performance for their respective data distributions.
Our accuracy far exceeds that of current SL algorithms and is very close to that of centralized learning on several real-life benchmarks.
arXiv Detail & Related papers (2023-03-19T10:38:29Z) - Client Selection for Generalization in Accelerated Federated Learning: A
Multi-Armed Bandit Approach [20.300740276237523]
Federated learning (FL) is an emerging machine learning (ML) paradigm used to train models across multiple nodes (i.e., clients) holding local data sets.
We develop a novel algorithm to achieve this goal, dubbed Bandit Scheduling for FL (BSFL).
arXiv Detail & Related papers (2023-03-18T09:45:58Z) - Complement Sparsification: Low-Overhead Model Pruning for Federated
Learning [2.0428960719376166]
Federated Learning (FL) is a privacy-preserving distributed deep learning paradigm that involves substantial communication and computation effort.
Existing model pruning/sparsification solutions cannot satisfy the requirements for low bidirectional communication overhead between the server and the clients.
We propose Complement Sparsification (CS), a pruning mechanism that satisfies all these requirements through a complementary and collaborative pruning done at the server and the clients.
arXiv Detail & Related papers (2023-03-10T23:07:02Z) - FedCliP: Federated Learning with Client Pruning [3.796320380104124]
Federated learning (FL) is a newly emerging distributed learning paradigm.
One fundamental bottleneck in FL is the heavy communication overheads between the distributed clients and the central server.
We propose FedCliP, the first communication efficient FL training framework from a macro perspective.
arXiv Detail & Related papers (2023-01-17T09:15:37Z) - Acceleration of Federated Learning with Alleviated Forgetting in Local
Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z) - No One Left Behind: Inclusive Federated Learning over Heterogeneous
Devices [79.16481453598266]
We propose InclusiveFL, a client-inclusive federated learning method that handles heterogeneous device capabilities.
The core idea of InclusiveFL is to assign models of different sizes to clients with different computing capabilities.
We also propose an effective method to share the knowledge among multiple local models with different sizes.
arXiv Detail & Related papers (2022-02-16T13:03:27Z) - Splitfed learning without client-side synchronization: Analyzing
client-side split network portion size to overall performance [4.689140226545214]
Federated Learning (FL), Split Learning (SL), and SplitFed Learning (SFL) are three recent developments in distributed machine learning.
This paper studies SFL without client-side model synchronization.
SFL with client-side synchronization provides only 1%-2% better accuracy than multi-head split learning on the MNIST test set.
arXiv Detail & Related papers (2021-09-19T22:57:23Z) - A Bayesian Federated Learning Framework with Online Laplace
Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
arXiv Detail & Related papers (2021-02-03T08:36:58Z) - Ensemble Distillation for Robust Model Fusion in Federated Learning [72.61259487233214]
Federated Learning (FL) is a machine learning setting where many devices collaboratively train a machine learning model.
In most current training schemes, the central model is refined by averaging the server model's parameters with the updated parameters from the clients.
We propose ensemble distillation for model fusion, i.e., training the central classifier on unlabeled data using the outputs of the clients' models; a minimal sketch of this fusion step is given after this list.
arXiv Detail & Related papers (2020-06-12T14:49:47Z)
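As a companion to the ensemble-distillation entry above, the following is a minimal sketch of the server-side fusion step, assuming the server holds a small unlabeled distillation set; the toy architecture, softmax temperature, and optimizer settings are illustrative assumptions rather than that paper's exact setup.

```python
# Minimal sketch of ensemble distillation for model fusion: the server
# distills the averaged predictions of the received client models into the
# central classifier using only unlabeled data. All hyperparameters and the
# toy architecture below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
NUM_CLIENTS, TEMPERATURE = 4, 2.0


def make_model() -> nn.Module:
    return nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))


client_models = [make_model() for _ in range(NUM_CLIENTS)]  # updates received from clients
server_model = make_model()                                 # central classifier to be fused
optimizer = torch.optim.Adam(server_model.parameters(), lr=1e-3)

unlabeled = torch.randn(256, 32)  # unlabeled distillation data held by the server

for step in range(10):
    x = unlabeled[torch.randint(0, unlabeled.size(0), (64,))]
    with torch.no_grad():
        # Ensemble teacher: average the client models' logits on the unlabeled batch.
        teacher_logits = torch.stack([m(x) for m in client_models]).mean(dim=0)
        teacher_probs = F.softmax(teacher_logits / TEMPERATURE, dim=-1)
    # Student: fit the central classifier to the ensemble's soft predictions.
    student_log_probs = F.log_softmax(server_model(x) / TEMPERATURE, dim=-1)
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```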