FedBug: A Bottom-Up Gradual Unfreezing Framework for Federated Learning
- URL: http://arxiv.org/abs/2307.10317v2
- Date: Mon, 13 Nov 2023 16:57:10 GMT
- Title: FedBug: A Bottom-Up Gradual Unfreezing Framework for Federated Learning
- Authors: Chia-Hsiang Kao, Yu-Chiang Frank Wang
- Abstract summary: Federated Learning (FL) offers a collaborative training framework, allowing multiple clients to contribute to a shared model.
Due to the heterogeneous nature of local datasets, updated client models may overfit and diverge from one another, commonly known as the problem of client drift.
We propose FedBug, a novel FL framework designed to effectively mitigate client drift.
- Score: 36.18217687935658
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Learning (FL) offers a collaborative training framework, allowing
multiple clients to contribute to a shared model without compromising data
privacy. Due to the heterogeneous nature of local datasets, updated client
models may overfit and diverge from one another, commonly known as the problem
of client drift. In this paper, we propose FedBug (Federated Learning with
Bottom-Up Gradual Unfreezing), a novel FL framework designed to effectively
mitigate client drift. FedBug adaptively leverages the client model parameters,
distributed by the server at each global round, as the reference points for
cross-client alignment. Specifically, on the client side, FedBug begins by
freezing the entire model, then gradually unfreezes the layers, from the input
layer to the output layer. This bottom-up approach allows models to train the
newly thawed layers to project data into a latent space, wherein the separating
hyperplanes remain consistent across all clients. We theoretically analyze
FedBug in a novel over-parameterization FL setup, revealing its superior
convergence rate compared to FedAvg. Through comprehensive experiments,
spanning various datasets, training conditions, and network architectures, we
validate the efficacy of FedBug. Our contributions encompass a novel FL
framework, theoretical analysis, and empirical validation, demonstrating the
wide potential and applicability of FedBug.
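To make the client-side procedure concrete, the following is a minimal PyTorch-style sketch of the bottom-up gradual unfreezing the abstract describes. The layer grouping (`model.children()`), the uniform unfreezing schedule, and the function name `fedbug_local_update` are illustrative assumptions, not the authors' reference implementation.

```python
# A minimal sketch of FedBug-style client-side bottom-up gradual unfreezing.
# Layer grouping, schedule, and names are assumptions, not the authors' code.
import copy
import torch
import torch.nn as nn

def fedbug_local_update(global_model, loader, local_epochs=5, lr=0.01):
    """One local round: freeze the whole model, then thaw layer groups input -> output."""
    model = copy.deepcopy(global_model)        # weights distributed by the server this round
    groups = list(model.children())            # assumed ordered from input side to output side
    total_iters = local_epochs * len(loader)
    thaw_every = max(1, total_iters // len(groups))   # even unfreezing schedule (assumption)

    for p in model.parameters():               # FedBug starts with the entire model frozen
        p.requires_grad = False

    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    step, thawed = 0, 0
    for _ in range(local_epochs):
        for x, y in loader:
            # Thaw the next group once its scheduled step is reached; the still-frozen
            # later layers (including the classifier) stay fixed, so the separating
            # hyperplanes remain a shared reference across clients.
            while thawed < len(groups) and step >= thawed * thaw_every:
                for p in groups[thawed].parameters():
                    p.requires_grad = True
                thawed += 1
            optimizer.zero_grad()
            loss = criterion(model(x), y)
            loss.backward()                    # frozen parameters receive no gradients
            optimizer.step()
            step += 1
    return model.state_dict()                  # returned to the server for averaging
```

In a standard FedAvg-style pipeline, the server would distribute the current global weights, run this local update on each selected client, and average the returned state dicts to form the next global model.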
Related papers
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
- FedImpro: Measuring and Improving Client Update in Federated Learning [77.68805026788836]
Federated Learning (FL) models often experience client drift caused by heterogeneous data.
We present an alternative perspective on client drift and aim to mitigate it by generating improved local models.
arXiv Detail & Related papers (2024-02-10T18:14:57Z)
- Rethinking Client Drift in Federated Learning: A Logit Perspective [125.35844582366441]
Federated Learning (FL) enables multiple clients to collaboratively learn in a distributed way, allowing for privacy protection.
We find that the difference in logits between the local and global models increases as the model is continuously updated.
We propose a new algorithm, FedCSD, which applies class prototype similarity distillation within a federated framework to align the local and global models.
arXiv Detail & Related papers (2023-08-20T04:41:01Z)
- Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating the results of local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z)
- Federated Learning for Semantic Parsing: Task Formulation, Evaluation Setup, New Algorithms [29.636944156801327]
Multiple clients collaboratively train one global model without sharing their semantic parsing data.
Lorar adjusts each client's contribution to the global model update based on its training loss reduction during each round.
Clients with smaller datasets enjoy larger performance gains.
arXiv Detail & Related papers (2023-05-26T19:25:49Z)
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraint.
We propose FedFTG, a data-free knowledge distillation method that fine-tunes the global model on the server.
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
- FedNS: Improving Federated Learning for collaborative image classification on mobile clients [22.980223900446997]
Federated Learning (FL) is a paradigm that aims to support loosely connected clients in learning a global model.
We propose a new approach, termed Federated Node Selection (FedNS), for the server's global model aggregation in the FL setting.
We show with experiments from multiple datasets and networks that FedNS can consistently achieve improved performance over FedAvg.
arXiv Detail & Related papers (2021-01-20T06:45:46Z)
- Blockchain Assisted Decentralized Federated Learning (BLADE-FL): Performance Analysis and Resource Allocation [119.19061102064497]
We propose a decentralized FL framework that integrates blockchain into FL, namely blockchain-assisted decentralized federated learning (BLADE-FL).
In a round of the proposed BLADE-FL, each client broadcasts its trained model to other clients, competes to generate a block based on the received models, and then aggregates the models from the generated block before its local training of the next round.
We explore the impact of lazy clients on the learning performance of BLADE-FL, and characterize the relationship among the optimal K, the learning parameters, and the proportion of lazy clients.
arXiv Detail & Related papers (2021-01-18T07:19:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.