Experimenting with Normalization Layers in Federated Learning on non-IID
scenarios
- URL: http://arxiv.org/abs/2303.10630v1
- Date: Sun, 19 Mar 2023 11:20:43 GMT
- Title: Experimenting with Normalization Layers in Federated Learning on non-IID
scenarios
- Authors: Bruno Casella, Roberto Esposito, Antonio Sciarappa, Carlo Cavazzoni,
Marco Aldinucci
- Abstract summary: Federated Learning (FL) has emerged as a method for privacy-preserving pooling of datasets, employing collaborative training across different institutions.
One critical performance challenge of FL is operating on datasets not independently and identically distributed (non-IID) among the federation participants.
We benchmark five different normalization layers for training Neural Networks (NNs), two families of non-IID data skew, and two datasets.
Results show that Batch Normalization, widely employed for centralized DL, is not the best choice for FL, whereas Group and Layer Normalization consistently outperform Batch Normalization.
- Score: 1.2599533416395765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training Deep Learning (DL) models requires large, high-quality
datasets, often assembled with data from different institutions. Federated
Learning (FL) has emerged as a method for privacy-preserving pooling of
datasets, employing collaborative training across different institutions and
iteratively aggregating locally trained models into a global one. One critical performance challenge
of FL is operating on datasets not independently and identically distributed
(non-IID) among the federation participants. Even though this fragility cannot
be eliminated, it can be mitigated by a suitable optimization of two
hyper-parameters: layer normalization methods and collaboration frequency
selection. In this work, we benchmark five different normalization layers for
training Neural Networks (NNs), two families of non-IID data skew, and two
datasets. Results show that Batch Normalization, widely employed for
centralized DL, is not the best choice for FL, whereas Group and Layer
Normalization consistently outperform Batch Normalization. Similarly, frequent
model aggregation decreases convergence speed and model quality.
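As an illustration of the kind of comparison the abstract describes, the sketch below builds a small CNN whose normalization layer is a constructor argument, so the same network can be trained with Batch, Group, or Layer Normalization in a federated benchmark. PyTorch, the architecture, channel sizes, and the number of groups are illustrative assumptions, not the paper's exact models or its full set of five normalization layers.

```python
# Minimal sketch (assumptions noted above): selectable normalization layer
# in a small CNN, so different normalizations can be compared under FL.
import torch
import torch.nn as nn

def make_norm(kind: str, channels: int) -> nn.Module:
    if kind == "batch":
        return nn.BatchNorm2d(channels)                            # statistics depend on local batches
    if kind == "group":
        return nn.GroupNorm(num_groups=2, num_channels=channels)   # batch-size independent
    if kind == "layer":
        return nn.GroupNorm(num_groups=1, num_channels=channels)   # LayerNorm over C, H, W
    raise ValueError(f"unknown normalization: {kind}")

class SmallCNN(nn.Module):
    def __init__(self, norm: str = "group", num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), make_norm(norm, 32), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), make_norm(norm, 64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x).flatten(1))

model = SmallCNN(norm="group")                         # swap to "batch" or "layer" to compare
print(model(torch.randn(4, 3, 32, 32)).shape)          # torch.Size([4, 10])
```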
Related papers
- Multi-level Personalized Federated Learning on Heterogeneous and Long-Tailed Data [10.64629029156029]
We introduce an innovative personalized Federated Learning framework, Multi-level Personalized Federated Learning (MuPFL).
MuPFL integrates three pivotal modules: Biased Activation Value Dropout (BAVD), Adaptive Cluster-based Model Update (ACMU), and Prior Knowledge-assisted Classifier Fine-tuning (PKCF).
Experiments on diverse real-world datasets show that MuPFL consistently outperforms state-of-the-art baselines, even under extreme non-i.i.d. and long-tail conditions.
arXiv Detail & Related papers (2024-05-10T11:52:53Z)
- FedUV: Uniformity and Variance for Heterogeneous Federated Learning [5.9330433627374815]
Federated learning is a promising framework to train neural networks with widely distributed data.
Recent work has shown that the performance degradation observed with heterogeneous data is largely due to the final layer of the network being most prone to local bias.
We investigate the training dynamics of the classifier by applying SVD to its weights, motivated by the observation that freezing the weights results in constant singular values; a small diagnostic sketch follows below.
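The snippet below is a hedged illustration of this kind of diagnostic (not the FedUV method itself): computing the singular values of a classifier's final-layer weight matrix. The 64-feature, 10-class layer shape is an assumption.

```python
# Illustrative only: singular values of a final-layer weight matrix.
import torch

classifier = torch.nn.Linear(64, 10, bias=False)
singular_values = torch.linalg.svdvals(classifier.weight)   # shape (10,)
print(singular_values)
# A frozen classifier would keep these values constant across federated
# rounds; locally biased updates change their spread instead.
```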
arXiv Detail & Related papers (2024-02-27T15:53:15Z)
- Federated Two Stage Decoupling With Adaptive Personalization Layers [5.69361786082969]
Federated learning has gained significant attention due to its ability to enable distributed learning while maintaining privacy constraints.
However, when client data are heterogeneous, it inherently experiences significant learning degradation and slow convergence.
It is therefore natural to cluster homogeneous clients into the same group and aggregate model weights only within each group, as in the sketch below.
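A minimal sketch of such group-wise aggregation, assuming the client grouping is already given (the paper's actual clustering and personalization layers are not reproduced here):

```python
# Minimal sketch: FedAvg-style averaging restricted to client groups.
# `groups` (group_id -> client indices) is assumed to come from some
# clustering of homogeneous clients; the clustering itself is not shown.
import torch

def average_state_dicts(state_dicts):
    """Uniform average of a list of model state_dicts."""
    return {k: torch.stack([sd[k].float() for sd in state_dicts]).mean(dim=0)
            for k in state_dicts[0]}

def group_wise_aggregate(client_state_dicts, groups):
    """Aggregate model weights only within each group."""
    return {gid: average_state_dicts([client_state_dicts[i] for i in idxs])
            for gid, idxs in groups.items()}
```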
arXiv Detail & Related papers (2023-08-30T07:46:32Z)
- Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively.
Data samples usually follow a long-tailed distribution in the real world, and FL on such decentralized, long-tailed data yields a poorly behaved global model.
In this work, we integrate local real data with global gradient prototypes to form locally balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z)
- Rethinking Normalization Methods in Federated Learning [92.25845185724424]
Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data.
We show that external covariate shifts will lead to the obliteration of some devices' contributions to the global model.
arXiv Detail & Related papers (2022-10-07T01:32:24Z)
- Heterogeneous Federated Learning via Grouped Sequential-to-Parallel Training [60.892342868936865]
Federated learning (FL) is a rapidly growing privacy-preserving collaborative machine learning paradigm.
We propose a data heterogeneity-robust FL approach, FedGSP, to address this challenge.
We show that FedGSP improves the accuracy by 3.7% on average compared with seven state-of-the-art approaches.
arXiv Detail & Related papers (2022-01-31T03:15:28Z)
- Multi-Center Federated Learning [62.32725938999433]
Federated learning (FL) can protect data privacy in distributed learning.
It merely collects local gradients from users without access to their data.
We propose a novel multi-center aggregation mechanism.
arXiv Detail & Related papers (2021-08-19T12:20:31Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10; a rough sketch of the calibration idea follows below.
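The snippet below is a rough sketch of the calibration idea, not the authors' CCVR code: it samples balanced virtual features from per-class (diagonal) Gaussians and fine-tunes only a linear classifier on them. The feature dimension, class count, and the way the class statistics `class_means` / `class_stds` are obtained are all assumptions.

```python
# Rough sketch: calibrate a linear classifier on virtual representations
# sampled from per-class Gaussians. Class statistics are assumed given,
# e.g. aggregated feature statistics; dimensions are illustrative.
import torch
import torch.nn as nn

def calibrate_classifier(class_means, class_stds, feat_dim=64, num_classes=10,
                         samples_per_class=200, epochs=5, lr=0.1):
    classifier = nn.Linear(feat_dim, num_classes)
    opt = torch.optim.SGD(classifier.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    # Sample an equal number of virtual features for every class.
    feats, labels = [], []
    for c in range(num_classes):
        feats.append(class_means[c] + class_stds[c] * torch.randn(samples_per_class, feat_dim))
        labels.append(torch.full((samples_per_class,), c, dtype=torch.long))
    feats, labels = torch.cat(feats), torch.cat(labels)
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(classifier(feats), labels).backward()
        opt.step()
    return classifier
```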
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- FedSemi: An Adaptive Federated Semi-Supervised Learning Framework [23.90642104477983]
Federated learning (FL) has emerged as an effective technique to co-train machine learning models without actually sharing data or leaking privacy.
Most existing FL methods focus on the supervised setting and ignore the utilization of unlabeled data.
We propose FedSemi, a novel, adaptive, and general framework, which is the first to introduce consistency regularization into FL via a teacher-student model; a minimal sketch of such a consistency term follows below.
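A minimal sketch of a teacher-student consistency term of this kind (illustrative, not the FedSemi implementation): the teacher is an exponential-moving-average copy of the student, and the loss penalizes disagreement on unlabeled inputs.

```python
# Illustrative consistency regularization with an EMA teacher.
# Typical wiring (assumed): teacher = copy.deepcopy(student); after each
# student optimizer step, call ema_update(teacher, student).
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, decay=0.99):
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)

def consistency_loss(student, teacher, unlabeled_x):
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(unlabeled_x), dim=1)
    student_log_probs = F.log_softmax(student(unlabeled_x), dim=1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
```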
arXiv Detail & Related papers (2020-12-06T15:46:04Z)
- Multi-Center Federated Learning [62.57229809407692]
This paper proposes a novel multi-center aggregation mechanism for federated learning.
It learns multiple global models from the non-IID user data and simultaneously derives the optimal matching between users and centers.
Our experimental results on benchmark datasets show that our method outperforms several popular federated learning methods.
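As a hedged sketch of the general multi-center idea (not the paper's exact optimization), one round could assign each client's flattened model to the nearest of K center models and update each center as the mean of its assigned clients; K and the flattened-weight representation are assumptions.

```python
# Hedged sketch: k-means-style assignment of clients to K center models.
import numpy as np

def multi_center_round(client_weights, centers):
    """client_weights: (n_clients, d) flattened client models;
    centers: (K, d) center models. Returns updated centers and assignments."""
    dists = np.linalg.norm(client_weights[:, None, :] - centers[None, :, :], axis=2)
    assign = dists.argmin(axis=1)                      # nearest center per client
    new_centers = centers.copy()
    for k in range(centers.shape[0]):
        members = client_weights[assign == k]
        if len(members) > 0:                           # keep old center if empty
            new_centers[k] = members.mean(axis=0)
    return new_centers, assign
```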
arXiv Detail & Related papers (2020-05-03T09:14:31Z)
- Federated learning with hierarchical clustering of local updates to improve training on non-IID data [3.3517146652431378]
We show that learning a single joint model is often not optimal in the presence of certain types of non-IID data.
We present a modification to FL by introducing a hierarchical clustering step (FL+HC).
We show how FL+HC allows model training to converge in fewer communication rounds compared to FL without clustering; a rough sketch of such a clustering step appears below this entry.
arXiv Detail & Related papers (2020-04-24T15:16:01Z)
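The snippet below is a rough sketch of the clustering step only, under the assumption that each client's local update can be flattened into a vector; the linkage method and distance threshold are illustrative, not the paper's settings.

```python
# Rough sketch: agglomerative clustering of flattened client updates.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_clients(client_updates, distance_threshold=1.0):
    """client_updates: list of 1-D numpy arrays (flattened weight deltas).
    Returns a cluster id per client; each cluster then trains its own model."""
    X = np.stack(client_updates)
    Z = linkage(X, method="ward", metric="euclidean")
    return fcluster(Z, t=distance_threshold, criterion="distance")
```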