Rethinking Normalization Methods in Federated Learning
- URL: http://arxiv.org/abs/2210.03277v1
- Date: Fri, 7 Oct 2022 01:32:24 GMT
- Title: Rethinking Normalization Methods in Federated Learning
- Authors: Zhixu Du, Jingwei Sun, Ang Li, Pin-Yu Chen, Jianyi Zhang, Hai "Helen"
Li, Yiran Chen
- Abstract summary: Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data.
We show that external covariate shifts will lead to the obliteration of some devices' contributions to the global model.
- Score: 92.25845185724424
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a popular distributed learning framework that can
reduce privacy risks by not explicitly sharing private data. In this work, we
explicitly uncover the external covariate shift problem in FL, which is caused by
the independent local training processes on different devices. We demonstrate
that external covariate shifts will lead to the obliteration of some devices'
contributions to the global model. Further, we show that normalization layers
are indispensable in FL since their inherited properties can alleviate the
problem of obliterating some devices' contributions. However, recent works have
shown that batch normalization, which is one of the standard components in many
deep neural networks, incurs an accuracy drop in the global model in FL. The
essential reason for the failure of batch normalization in FL is poorly
studied. We unveil that external covariate shift is the key reason why batch
normalization is ineffective in FL. We also show that layer normalization is a
better choice for FL, as it can mitigate external covariate shift and improve
the performance of the global model. We conduct experiments on CIFAR-10 under
non-IID settings. The results demonstrate that models with layer normalization
converge the fastest and achieve the best or comparable accuracy across three
different model architectures.
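A minimal FedAvg sketch (not the paper's code) of where external covariate shift enters aggregation: BatchNorm layers carry running_mean / running_var buffers estimated from each device's local, non-IID batches, so the server ends up averaging statistics that match none of the devices, while layer normalization keeps no batch-dependent state. The function below assumes client state_dicts of PyTorch tensors and an illustrative weighting scheme.

```python
# Sketch only: weighted FedAvg over full state_dicts, which silently averages
# BatchNorm's running statistics (running_mean / running_var) along with the
# weights -- the point at which devices' mismatched local statistics collide.
import copy

def fedavg(client_states, weights):
    """client_states: list of state_dicts of torch tensors;
    weights: per-client fractions summing to 1."""
    avg = copy.deepcopy(client_states[0])
    for key in avg:
        avg[key] = sum(w * s[key].float() for s, w in zip(client_states, weights))
    return avg
```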
Related papers
- Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning? [50.03434441234569]
Federated Learning (FL) has gained significant popularity due to its effectiveness in training machine learning models across diverse sites without requiring direct data sharing.
While various algorithms have shown that FL with local updates is a communication-efficient distributed learning framework, the generalization performance of FL with local updates has received comparatively less attention.
arXiv Detail & Related papers (2024-09-05T19:00:18Z)
- Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
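A hedged sketch of the layer-wise idea (the actual SALF procedure and its weighting may differ): because backpropagation produces gradients from the output layer backward, a straggler that is cut off mid-computation still holds updates for the deepest layers, so the server can average each layer over only the clients that reached it.

```python
import torch

def layerwise_aggregate(global_state, client_updates):
    """client_updates: list of dicts mapping parameter name -> updated tensor;
    a straggler's dict may contain only the deepest layers it reached."""
    new_state = {}
    for name, param in global_state.items():
        contribs = [u[name] for u in client_updates if name in u]
        if contribs:          # average over the clients that updated this layer
            new_state[name] = torch.stack(contribs).mean(dim=0)
        else:                 # no client reached it this round: keep the global value
            new_state[name] = param.clone()
    return new_state
```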
arXiv Detail & Related papers (2024-03-27T09:14:36Z)
- FedNAR: Federated Optimization with Normalized Annealing Regularization [54.42032094044368]
We explore the choices of weight decay and identify that the weight decay value appreciably influences the convergence of existing FL algorithms.
We develop Federated optimization with Normalized Annealing Regularization (FedNAR), a plug-in that can be seamlessly integrated into any existing FL algorithm.
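One plausible, simplified reading of a normalized and annealed weight-decay update is sketched below; the exact FedNAR rule, clipping bound, and annealing schedule are assumptions, not taken from the paper.

```python
# Hedged sketch of a local step: the gradient and the weight-decay term are
# combined and jointly clipped, and the decay coefficient is annealed over
# communication rounds (schedule and bound are illustrative).
def local_step(param, grad, lr=0.01, wd0=1e-3, round_idx=0, total_rounds=100, max_norm=1.0):
    wd = wd0 * (1 - round_idx / total_rounds)   # annealed weight decay (assumption)
    direction = grad + wd * param               # gradient plus regularization term
    norm = direction.norm()
    if norm > max_norm:                         # normalize (clip) the joint update
        direction = direction * (max_norm / norm)
    return param - lr * direction
```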
arXiv Detail & Related papers (2023-10-04T21:11:40Z)
- Understanding the Role of Layer Normalization in Label-Skewed Federated Learning [15.19762600396105]
Layer normalization (LN) is a widely adopted deep learning technique especially in the era of foundation models.
In this work, we reveal the profound connection between layer normalization and the label shift problem in federated learning.
Our results verify that feature normalization (FN) is an essential ingredient inside LN for significantly improving the convergence of FL while remaining robust to learning rate choices.
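A small sketch contrasting full LayerNorm with a bare feature-normalization step, assuming here that "FN" denotes rescaling each sample's feature vector to unit norm with no centering or affine parameters; the paper's exact definition may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureNorm(nn.Module):
    def forward(self, x):                  # x: (batch, features)
        return F.normalize(x, p=2, dim=1)  # project each feature vector onto the unit sphere

features = torch.randn(8, 128)
ln_out = nn.LayerNorm(128)(features)       # centre, rescale, and affine-transform
fn_out = FeatureNorm()(features)           # only rescale to unit norm
```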
arXiv Detail & Related papers (2023-08-18T13:57:04Z)
- Experimenting with Normalization Layers in Federated Learning on non-IID scenarios [1.2599533416395765]
Federated Learning (FL) has emerged as a method for privacy-preserving pooling of datasets through collaborative training across different institutions.
One critical performance challenge of FL is operating on datasets not independently and identically distributed (non-IID) among the federation participants.
We benchmark five different normalization layers for training Neural Networks (NNs), across two families of non-IID data skew and two datasets.
Results show that Batch Normalization, widely employed for centralized DL, is not the best choice for FL, whereas Group and Layer Normalization consistently outperform Batch Normalization.
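A hedged sketch of how such a benchmark can swap normalization layers inside one CNN block; the factory below and the group count are illustrative, not the paper's exact configuration.

```python
import torch.nn as nn

def norm_layer(kind, channels):
    if kind == "batch":
        return nn.BatchNorm2d(channels)
    if kind == "group":
        return nn.GroupNorm(4, channels)   # 4 groups chosen arbitrarily here
    if kind == "layer":
        return nn.GroupNorm(1, channels)   # LayerNorm-style normalization for conv features
    return nn.Identity()                   # "no normalization" baseline

def conv_block(in_ch, out_ch, kind):
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         norm_layer(kind, out_ch),
                         nn.ReLU(inplace=True))
```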
arXiv Detail & Related papers (2023-03-19T11:20:43Z)
- Federated Learning on Heterogeneous and Long-Tailed Data via Classifier Re-Training with Federated Features [24.679535905451758]
Federated learning (FL) provides a privacy-preserving solution for distributed machine learning tasks.
One challenging problem that severely damages the performance of FL models is the co-occurrence of data heterogeneity and long-tail distribution.
We propose a novel privacy-preserving FL method for heterogeneous and long-tailed data via Classifier Re-training with Federated Features (CReFF).
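A simplified sketch of the classifier re-training step: the aggregated feature extractor stays frozen and only the final linear head is re-fit on a balanced set of server-side features. How CReFF actually synthesizes those federated features (via gradient matching) is omitted, and the hyperparameters are placeholders.

```python
import torch
import torch.nn as nn

def retrain_classifier(classifier, fed_features, fed_labels, epochs=10, lr=0.1):
    """classifier: nn.Linear head of the global model.
    fed_features: (N, d) balanced feature tensor kept on the server."""
    opt = torch.optim.SGD(classifier.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(classifier(fed_features), fed_labels)
        loss.backward()
        opt.step()
    return classifier
```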
arXiv Detail & Related papers (2022-04-28T10:35:11Z)
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraints.
We propose a data-free knowledge distillation method to fine-tune the global model on the server (FedFTG).
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
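A heavily simplified, hedged sketch of data-free distillation on the server: a generator maps noise to pseudo-inputs and the global model is fine-tuned toward the averaged predictions of the client models. FedFTG's actual objectives (including how the generator itself is trained) are not reproduced here.

```python
import torch
import torch.nn.functional as F

def distill_round(global_model, client_models, generator, steps=100, lr=1e-3, z_dim=100):
    opt = torch.optim.Adam(global_model.parameters(), lr=lr)
    for _ in range(steps):
        z = torch.randn(64, z_dim)
        pseudo_x = generator(z)                      # synthetic inputs from noise
        with torch.no_grad():                        # teacher: ensemble of client models
            teacher = torch.stack([m(pseudo_x) for m in client_models]).mean(0)
        student = global_model(pseudo_x)
        loss = F.kl_div(F.log_softmax(student, dim=1),
                        F.softmax(teacher, dim=1), reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
    return global_model
```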
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
- Federated Dropout -- A Simple Approach for Enabling Federated Learning on Resource Constrained Devices [40.69663094185572]
Federated learning (FL) is a popular framework for training an AI model using distributed mobile data in a wireless network.
One main challenge confronting practical FL is that resource-constrained devices struggle with the computation-intensive task of updating a deep neural network model.
To tackle this challenge, a federated dropout (FedDrop) scheme is proposed, building on the classic dropout scheme for random model pruning.
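A hedged sketch of the per-device random-subnet idea, shown at the level of flat weight vectors; the real FedDrop scheme operates on network layers and adapts to wireless and compute constraints not modelled here.

```python
import torch

def make_masks(num_params, num_devices, keep_prob=0.5):
    """One independent random keep-mask per device and round."""
    return [torch.rand(num_params) < keep_prob for _ in range(num_devices)]

def aggregate_subnets(global_w, device_ws, masks):
    """Average each coordinate over the devices whose mask retained it."""
    stacked = torch.stack(device_ws)                # (devices, params)
    mask = torch.stack(masks).float()
    counts = mask.sum(0).clamp(min=1.0)
    averaged = (stacked * mask).sum(0) / counts
    kept_anywhere = mask.sum(0) > 0
    return torch.where(kept_anywhere, averaged, global_w)
```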
arXiv Detail & Related papers (2021-09-30T16:52:13Z)
- UVeQFed: Universal Vector Quantization for Federated Learning [179.06583469293386]
Federated learning (FL) is an emerging approach for training learning models without requiring users to share their possibly private labeled data.
In FL, each user trains its copy of the learning model locally. The server then collects the individual updates and aggregates them into a global model.
We show that combining universal vector quantization methods with FL yields a decentralized training system in which the compression of the trained models induces only a minimum distortion.
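A hedged, scalar stand-in for the universal vector quantizer: subtractive dithered uniform quantization of a model update, with the dither reproduced on the server from a shared seed. UVeQFed's actual lattice/vector quantizer and coding stage are not shown.

```python
import torch

def dithered_quantize(update, step=0.01, seed=0):
    g = torch.Generator().manual_seed(seed)
    dither = (torch.rand(update.shape, generator=g) - 0.5) * step
    return torch.round((update + dither) / step)    # quantization indices to transmit

def dithered_dequantize(indices, step=0.01, seed=0):
    g = torch.Generator().manual_seed(seed)
    dither = (torch.rand(indices.shape, generator=g) - 0.5) * step
    return indices * step - dither                  # subtract the shared dither on the server
```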
arXiv Detail & Related papers (2020-06-05T07:10:22Z)
- Federated learning with hierarchical clustering of local updates to improve training on non-IID data [3.3517146652431378]
We show that learning a single joint model is often not optimal in the presence of certain types of non-IID data.
We present a modification to FL by introducing a hierarchical clustering step (FL+HC).
We show how FL+HC allows model training to converge in fewer communication rounds compared to FL without clustering.
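A hedged sketch of the clustering step only (the linkage method and cluster count here are illustrative): flattened client updates are clustered hierarchically, after which each cluster would continue with its own FedAvg model.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_clients(client_updates, num_clusters=3):
    """client_updates: list of 1-D numpy arrays (flattened model deltas)."""
    X = np.stack(client_updates)
    Z = linkage(X, method="ward")                          # agglomerative clustering
    labels = fcluster(Z, t=num_clusters, criterion="maxclust")
    clusters = {}
    for client_id, label in enumerate(labels):
        clusters.setdefault(label, []).append(client_id)   # cluster label -> client ids
    return clusters
```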
arXiv Detail & Related papers (2020-04-24T15:16:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.