Related papers: Mitigating Bias in Federated Learning

Mitigating Bias in Federated Learning

URL: http://arxiv.org/abs/2012.02447v1
Date: Fri, 4 Dec 2020 08:04:12 GMT
Title: Mitigating Bias in Federated Learning
Authors: Annie Abay, Yi Zhou, Nathalie Baracaldo, Shashank Rajamoni, Ebube Chuba, Heiko Ludwig
Abstract summary: In this paper, we discuss causes of bias in federated learning (FL) We propose three pre-processing and in-processing methods to mitigate bias, without compromising data privacy. We conduct experiments over several data distributions to analyze their effects on model performance, fairness metrics, and bias learning patterns.
Score: 9.295028968787351
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As methods to create discrimination-aware models develop, they focus on centralized ML, leaving federated learning (FL) unexplored. FL is a rising approach for collaborative ML, in which an aggregator orchestrates multiple parties to train a global model without sharing their training data. In this paper, we discuss causes of bias in FL and propose three pre-processing and in-processing methods to mitigate bias, without compromising data privacy, a key FL requirement. As data heterogeneity among parties is one of the challenging characteristics of FL, we conduct experiments over several data distributions to analyze their effects on model performance, fairness metrics, and bias learning patterns. We conduct a comprehensive analysis of our proposed techniques, the results demonstrating that these methods are effective even when parties have skewed data distributions or as little as 20% of parties employ the methods.

Related papers

Harmonizing Generalization and Personalization in Ring-topology Decentralized Federated Learning [41.4210010333948]
We introduce Ring-topology Decentralized Federated Learning (RDFL) for distributed model training, aiming to avoid the inherent risks of centralized failure in server-based FL.<n>RDFL faces the challenge of low information-sharing efficiency due to the point-to-point communication manner when handling inherent data heterogeneity.<n>We propose a Divide-and-conquer RDFL framework (DRDFL) that uses a feature generation model to extract personalized information and invariant shared knowledge from the underlying data distribution.
arXiv Detail & Related papers (2025-04-27T04:38:49Z)
Distributionally Robust Federated Learning: An ADMM Algorithm [5.65425489838679]
Federated learning (FL) aims to train machine learning (ML) models collaboratively using decentralized data. Standard FL models often assume that all data come from the same unknown distribution. We propose a novel FL model, Distributionally Robust Federated Learning (DRFL), that applies distributionally robust optimization to overcome the challenges posed by data heterogeneity and distributional ambiguity.
arXiv Detail & Related papers (2025-03-24T08:35:38Z)
Client Contribution Normalization for Enhanced Federated Learning [4.726250115737579]
Mobile devices, including smartphones and laptops, generate decentralized and heterogeneous data. Federated Learning (FL) offers a promising alternative by enabling collaborative training of a global model across decentralized devices without data sharing. This paper focuses on data-dependent heterogeneity in FL and proposes a novel approach leveraging mean latent representations extracted from locally trained models.
arXiv Detail & Related papers (2024-11-10T04:03:09Z)
Algorithms for Collaborative Machine Learning under Statistical Heterogeneity [1.8130068086063336]
Federated learning is currently the de facto standard of training a machine learning model across heterogeneous data owners. In this dissertation, three major factors can be considered as starting points -- textit parameter, textitmixing coefficient, and textitlocal data distributions.
arXiv Detail & Related papers (2024-07-31T16:32:34Z)
FedSym: Unleashing the Power of Entropy for Benchmarking the Algorithms for Federated Learning [1.4656078321003647]
Federated learning (FL) is a decentralized machine learning approach where independent learners process data privately. We study the currently popular data partitioning techniques and visualize their main disadvantages. We propose a method that leverages entropy and symmetry to construct 'the most challenging' and controllable data distributions.
arXiv Detail & Related papers (2023-10-11T18:39:08Z)
Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively. The data samples usually follow a long-tailed distribution in the real world, and FL on the decentralized and long-tailed data yields a poorly-behaved global model. In this work, we integrate the local real data with the global gradient prototypes to form the local balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z)
Depersonalized Federated Learning: Tackling Statistical Heterogeneity by Alternating Stochastic Gradient Descent [6.394263208820851]
Federated learning (FL) enables devices to train a common machine learning (ML) model for intelligent inference without data sharing. Raw data held by various cooperativelyicipators are always non-identically distributedly. We propose a new FL that can significantly statistical optimize by the de-speed of this process.
arXiv Detail & Related papers (2022-10-07T10:30:39Z)
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning(FL) has recently attracted increasing attention from academia and industry. We propose FedDM to build the global training objective from multiple local surrogate functions. In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
DRFLM: Distributionally Robust Federated Learning with Inter-client Noise via Local Mixup [58.894901088797376]
federated learning has emerged as a promising approach for training a global model using data from multiple organizations without leaking their raw data. We propose a general framework to solve the above two challenges simultaneously. We provide comprehensive theoretical analysis including robustness analysis, convergence analysis, and generalization ability.
arXiv Detail & Related papers (2022-04-16T08:08:29Z)
Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices)
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task. The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator. We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
Privacy-Preserving Self-Taught Federated Learning for Heterogeneous Data [6.545317180430584]
Federated learning (FL) was proposed to enable joint training of a deep learning model using the local data in each party without revealing the data to others. In this work, we propose an FL method called self-taught federated learning to address the aforementioned issues. In this method, only latent variables are transmitted to other parties for model training, while privacy is preserved by storing the data and parameters of activations, weights, and biases locally.
arXiv Detail & Related papers (2021-02-11T08:07:51Z)
FedH2L: Federated Learning with Model and Statistical Heterogeneity [75.61234545520611]
Federated learning (FL) enables distributed participants to collectively learn a strong global model without sacrificing their individual data privacy. We introduce FedH2L, which is agnostic to both the model architecture and robust to different data distributions across participants. In contrast to approaches sharing parameters or gradients, FedH2L relies on mutual distillation, exchanging only posteriors on a shared seed set between participants in a decentralized manner.
arXiv Detail & Related papers (2021-01-27T10:10:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.