A Comprehensive Empirical Study of Bugs in Open-Source Federated
Learning Frameworks
- URL: http://arxiv.org/abs/2308.05014v2
- Date: Fri, 6 Oct 2023 09:04:19 GMT
- Title: A Comprehensive Empirical Study of Bugs in Open-Source Federated
Learning Frameworks
- Authors: Weijie Shao and Yuyang Gao and Fu Song and Sen Chen and Lingling Fan
and Jingzhu He
- Abstract summary: Federated learning (FL) is a distributed machine learning (ML) paradigm that allows multiple clients to collaboratively train ML models without exposing their private data.
To foster the application of FL, a variety of FL frameworks have been proposed, allowing non-experts to easily train ML models.
We conduct the first empirical study to comprehensively collect, taxonomize, and characterize bugs in FL frameworks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a distributed machine learning (ML) paradigm
that allows multiple clients to collaboratively train shared ML models without
exposing their private data. It has gained substantial
popularity in recent years, especially since the enforcement of data protection
laws and regulations in many countries. To foster the application of FL, a
variety of FL frameworks have been proposed, allowing non-experts to easily
train ML models. As a result, understanding bugs in FL frameworks is critical
for facilitating the development of better FL frameworks and potentially
encouraging the development of bug detection, localization and repair tools.
Thus, we conduct the first empirical study to comprehensively collect,
taxonomize, and characterize bugs in FL frameworks. Specifically, we manually
collect and classify 1,119 bugs from all 676 closed issues and 514 merged
pull requests in 17 popular and representative open-source FL frameworks on
GitHub. We propose a classification of those bugs into 12 bug symptoms, 12 root
causes, and 18 fix patterns. We also study their correlations and distributions
on 23 functionalities. We identify nine major findings from our study, and
discuss their implications and future research directions.
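As context for the FL paradigm the paper studies, here is a minimal sketch of one round of federated averaging (FedAvg, the canonical aggregation scheme most FL frameworks implement). The linear model, client data, and function names are illustrative and not taken from any of the 17 studied frameworks:

```python
import numpy as np

def local_update(weights, data, lr=0.1, epochs=1):
    """One client's local training (illustrative: linear model, squared loss)."""
    w = weights.copy()
    X, y = data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def fedavg_round(global_w, clients):
    """One FedAvg round: each client trains locally on its private data;
    the server averages the returned weights, weighted by dataset size.
    Raw data never leaves the clients -- only model weights are exchanged."""
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    local_ws = [local_update(global_w, c) for c in clients]
    return np.average(local_ws, axis=0, weights=sizes)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Three clients with private local datasets of different sizes.
clients = []
for n in (20, 50, 30):
    X = rng.normal(size=(n, 2))
    y = X @ true_w + 0.01 * rng.normal(size=n)
    clients.append((X, y))

w = np.zeros(2)
for _ in range(200):
    w = fedavg_round(w, clients)
print(np.round(w, 2))  # converges toward the shared optimum near [2, -1]
```

In a real framework, `local_update` runs on each client device and only the weight vectors cross the network, which is exactly the code path (serialization, aggregation, scheduling) where the bugs this paper taxonomizes tend to live.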
Related papers
- BLAZE: Cross-Language and Cross-Project Bug Localization via Dynamic Chunking and Hard Example Learning [1.9854146581797698]
BLAZE is an approach that employs dynamic chunking and hard example learning.
It fine-tunes a GPT-based model using challenging bug cases to enhance cross-project and cross-language bug localization.
BLAZE achieves improvements of up to 120% in Top-1 accuracy, 144% in Mean Average Precision (MAP), and 100% in Mean Reciprocal Rank (MRR).
arXiv Detail & Related papers (2024-07-24T20:44:36Z) - The Fact Selection Problem in LLM-Based Program Repair [3.7005619077967133]
We show that each fact, ranging from simple syntactic details like code context to semantic information previously unexplored in the context of Python projects, is beneficial.
Importantly, we discovered that the effectiveness of program repair prompts is non-monotonic in the number of facts used.
We develop a basic statistical model, named Maniple, which selects facts specific to a given bug to include in the prompt.
arXiv Detail & Related papers (2024-04-08T13:41:32Z) - A Survey on Efficient Federated Learning Methods for Foundation Model
Training [66.19763977571114]
Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients.
In the wake of Foundation Models (FMs), the reality is different for many deep learning applications.
We discuss the benefits and drawbacks of parameter-efficient fine-tuning (PEFT) for FL applications.
arXiv Detail & Related papers (2024-01-09T10:22:23Z) - Deep Equilibrium Models Meet Federated Learning [71.57324258813675]
This study explores Federated Learning (FL) by utilizing Deep Equilibrium (DEQ) models instead of conventional deep learning networks.
We claim that incorporating DEQ models into the federated learning framework naturally addresses several open problems in FL.
To the best of our knowledge, this study is the first to establish a connection between DEQ models and federated learning.
arXiv Detail & Related papers (2023-05-29T22:51:40Z) - FL Games: A Federated Learning Framework for Distribution Shifts [71.98708418753786]
Federated learning aims to train predictive models for data that is distributed across clients, under the orchestration of a server.
We propose FL GAMES, a game-theoretic framework for federated learning that learns causal features that are invariant across clients.
arXiv Detail & Related papers (2022-10-31T22:59:03Z) - Rethinking Normalization Methods in Federated Learning [92.25845185724424]
Federated learning (FL) is a popular distributed learning framework that can reduce privacy risks by not explicitly sharing private data.
We show that external covariate shifts will lead to the obliteration of some devices' contributions to the global model.
arXiv Detail & Related papers (2022-10-07T01:32:24Z) - UniFed: All-In-One Federated Learning Platform to Unify Open-Source
Frameworks [53.20176108643942]
We present UniFed, the first unified platform for standardizing open-source Federated Learning (FL) frameworks.
UniFed streamlines the end-to-end workflow for distributed experimentation and deployment, encompassing 11 popular open-source FL frameworks.
We evaluate and compare 11 popular FL frameworks from the perspectives of functionality, privacy protection, and performance.
arXiv Detail & Related papers (2022-07-21T05:03:04Z) - Federated Learning: Applications, Challenges and Future Scopes [1.3190581566723918]
Federated learning (FL) is a system in which a central aggregator coordinates the efforts of multiple clients to solve machine learning problems.
FL has applications in wireless communication, service recommendation, intelligent medical diagnosis systems, and healthcare.
arXiv Detail & Related papers (2022-05-18T10:47:09Z) - Federated Learning from Only Unlabeled Data with
Class-Conditional-Sharing Clients [98.22390453672499]
Supervised federated learning (FL) enables multiple clients to share the trained model without sharing their labeled data.
We propose federation of unsupervised learning (FedUL), where the unlabeled data are transformed into surrogate labeled data for each of the clients.
arXiv Detail & Related papers (2022-04-07T09:12:00Z) - Silent Bugs in Deep Learning Frameworks: An Empirical Study of Keras and
TensorFlow [13.260758930014154]
Deep Learning (DL) frameworks are now widely used, simplifying the creation of complex models and their integration into various applications, even for non-DL experts.
This paper deals with the subcategory of bugs known as silent bugs: they lead to wrong behavior but do not cause system crashes or hangs, nor show an error message to the user.
This paper presents the first empirical study of Keras and silent bugs, and their impact on users' programs.
arXiv Detail & Related papers (2021-12-26T04:18:57Z) - A Systematic Literature Review on Federated Learning: From A Model
Quality Perspective [10.725466627592732]
Federated Learning (FL) can jointly train a global model with the data remaining locally.
This paper systematically reviews and objectively analyzes the approaches to improving the quality of FL models.
arXiv Detail & Related papers (2020-12-01T05:48:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.