Characterizing Impacts of Heterogeneity in Federated Learning upon
Large-Scale Smartphone Data
- URL: http://arxiv.org/abs/2006.06983v4
- Date: Fri, 12 Mar 2021 08:45:11 GMT
- Title: Characterizing Impacts of Heterogeneity in Federated Learning upon
Large-Scale Smartphone Data
- Authors: Chengxu Yang, Qipeng Wang, Mengwei Xu, Zhenpeng Chen, Kaigui Bian,
Yunxin Liu, Xuanzhe Liu
- Abstract summary: Federated learning (FL) is an emerging, privacy-preserving machine learning paradigm, drawing tremendous attention in academia and industry.
A unique characteristic of FL is heterogeneity, which resides in the various hardware specifications and dynamic states across the participating devices.
We conduct extensive experiments to compare the performance of state-of-the-art FL algorithms under heterogeneity-aware and heterogeneity-unaware settings.
- Score: 23.67491703843822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is an emerging, privacy-preserving machine learning
paradigm, drawing tremendous attention in both academia and industry. A unique
characteristic of FL is heterogeneity, which resides in the various hardware
specifications and dynamic states across the participating devices.
Theoretically, heterogeneity can exert a huge influence on the FL training
process, e.g., by rendering a device unavailable for training or unable to
upload its model updates. Unfortunately, these impacts have never been
systematically studied and quantified in the existing FL literature.
In this paper, we carry out the first empirical study to characterize the
impacts of heterogeneity in FL. We collect large-scale data from 136k
smartphones that can faithfully reflect heterogeneity in real-world settings.
We also build a heterogeneity-aware FL platform that complies with the standard
FL protocol but with heterogeneity in consideration. Based on the data and the
platform, we conduct extensive experiments to compare the performance of
state-of-the-art FL algorithms under heterogeneity-aware and
heterogeneity-unaware settings. Results show that heterogeneity causes
non-trivial performance degradation in FL, including up to a 9.2% accuracy
drop, 2.32x longer training time, and undermined fairness. Furthermore, we
analyze potential impact factors and find that device failure and participant
bias are two key contributors to this degradation. Our study provides
insightful implications for FL practitioners. On the one hand, our findings
suggest that FL algorithm designers should account for heterogeneity during
evaluation. On the other hand, they urge system providers to design specific
mechanisms to mitigate the impacts of heterogeneity.
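The abstract's central mechanism, device failure shrinking the pool of model updates that actually arrive each round, can be illustrated with a minimal simulation. This is a hedged sketch, not the authors' platform: the failure model (a single per-selection dropout probability) and all function names are illustrative assumptions.

```python
import random


def simulate_round(num_devices, sample_size, failure_prob, rng):
    """Select clients for one FL round and drop those that fail mid-training.

    failure_prob is an assumed stand-in for device heterogeneity: in a
    heterogeneity-unaware simulation it is 0, so every selected client
    returns an update.
    """
    selected = rng.sample(range(num_devices), sample_size)
    completed = [d for d in selected if rng.random() >= failure_prob]
    return completed


def participation_rate(rounds, num_devices, sample_size, failure_prob, seed=0):
    """Fraction of selected clients that actually deliver model updates."""
    rng = random.Random(seed)
    delivered = sum(
        len(simulate_round(num_devices, sample_size, failure_prob, rng))
        for _ in range(rounds)
    )
    return delivered / (rounds * sample_size)
```

Under this toy model, a heterogeneity-unaware run (`failure_prob=0.0`) reports full participation, while even a modest dropout probability visibly cuts the per-round update count, which is one plausible route to the slower convergence and participant bias the paper measures.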
Related papers
- Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning? [50.03434441234569]
Federated Learning (FL) has gained significant popularity due to its effectiveness in training machine learning models across diverse sites without requiring direct data sharing.
While various algorithms have shown that FL with local updates is a communication-efficient distributed learning framework, the generalization performance of FL with local updates has received comparatively less attention.
arXiv Detail & Related papers (2024-09-05T19:00:18Z)
- HeteroSwitch: Characterizing and Taming System-Induced Data Heterogeneity in Federated Learning [36.00729012296371]
Federated Learning (FL) is a practical approach to train deep learning models collaboratively across user-end devices.
In FL, participating user-end devices are highly fragmented in terms of hardware and software configurations.
We propose HeteroSwitch, which adaptively applies generalization techniques depending on the level of bias caused by varying hardware and software configurations.
arXiv Detail & Related papers (2024-03-07T04:23:07Z)
- Fake It Till Make It: Federated Learning with Consensus-Oriented Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG).
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T18:49:59Z)
- Handling Data Heterogeneity via Architectural Design for Federated Visual Recognition [16.50490537786593]
We study 19 visual recognition models from five different architectural families on four challenging FL datasets.
Our findings emphasize the importance of architectural design for computer vision tasks in practical scenarios.
arXiv Detail & Related papers (2023-10-23T17:59:16Z)
- Filling the Missing: Exploring Generative AI for Enhanced Federated Learning over Heterogeneous Mobile Edge Devices [72.61177465035031]
We propose a generative AI-empowered federated learning to address these challenges by leveraging the idea of FIlling the MIssing (FIMI) portion of local data.
Experiment results demonstrate that FIMI can save up to 50% of the device-side energy to achieve the target global test accuracy.
arXiv Detail & Related papers (2023-10-21T12:07:04Z)
- FedConv: Enhancing Convolutional Neural Networks for Handling Data Heterogeneity in Federated Learning [34.37155882617201]
Federated learning (FL) is an emerging paradigm in machine learning, where a shared model is collaboratively learned using data from multiple devices.
We systematically investigate the impact of different architectural elements, such as activation functions and normalization layers, on the performance within heterogeneous FL.
Our findings indicate that with strategic architectural modifications, pure CNNs can achieve a level of robustness that either matches or even exceeds that of ViTs.
arXiv Detail & Related papers (2023-10-06T17:57:50Z)
- FS-Real: Towards Real-World Cross-Device Federated Learning [60.91678132132229]
Federated Learning (FL) aims to train high-quality models in collaboration with distributed clients while not uploading their local data.
There is still a considerable gap between the flourishing FL research and real-world scenarios, mainly caused by the characteristics of heterogeneous devices and their scales.
We propose an efficient and scalable prototyping system for real-world cross-device FL, FS-Real.
arXiv Detail & Related papers (2023-03-23T15:37:17Z)
- FedHiSyn: A Hierarchical Synchronous Federated Learning Framework for Resource and Data Heterogeneity [56.82825745165945]
Federated Learning (FL) enables training a global model without sharing the decentralized raw data stored on multiple devices to protect data privacy.
We propose a hierarchical synchronous FL framework, i.e., FedHiSyn, to tackle the problems of straggler effects and outdated models.
We evaluate the proposed framework based on MNIST, EMNIST, CIFAR10 and CIFAR100 datasets and diverse heterogeneous settings of devices.
arXiv Detail & Related papers (2022-06-21T17:23:06Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- On the Impact of Device and Behavioral Heterogeneity in Federated Learning [5.038980064083677]
Federated learning (FL) is becoming a popular paradigm for collaborative learning over distributed, private datasets owned by non-trusting entities.
This paper describes the challenge of performing training over largely heterogeneous datasets, devices, and networks.
We conduct an empirical study spanning close to 1.5K unique configurations on five popular FL benchmarks.
arXiv Detail & Related papers (2021-02-15T12:04:38Z)