Using Synthetic Data to Mitigate Unfairness and Preserve Privacy through Single-Shot Federated Learning
- URL: http://arxiv.org/abs/2409.09532v1
- Date: Sat, 14 Sep 2024 21:04:11 GMT
- Title: Using Synthetic Data to Mitigate Unfairness and Preserve Privacy through Single-Shot Federated Learning
- Authors: Chia-Yuan Wu, Frank E. Curtis, Daniel P. Robinson
- Abstract summary: We propose a strategy that promotes fair predictions across clients without the need to pass information between the clients and server.
We then pass each client's synthetic dataset to the server, the collection of which is used to train the server model.
- Score: 6.516872951510096
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To address unfairness issues in federated learning (FL), contemporary approaches typically use frequent model parameter updates and transmissions between the clients and server. In such a process, client-specific information (e.g., local dataset size or data-related fairness metrics) must be sent to the server to compute, e.g., aggregation weights. All of this results in high transmission costs and the potential leakage of client information. As an alternative, we propose a strategy that promotes fair predictions across clients without the need to pass information between the clients and server iteratively, and that prevents client data leakage. For each client, we first use their local dataset to obtain a synthetic dataset by solving a bilevel optimization problem that addresses unfairness concerns during the learning process. We then pass each client's synthetic dataset to the server, the collection of which is used to train the server model using conventional machine learning techniques (that do not take fairness metrics into account). Thus, we eliminate the need to handle fairness-specific aggregation weights while preserving client privacy. Our approach requires only a single communication between the clients and the server, thus making it computationally cost-effective, able to maintain privacy, and able to ensure fairness. We present empirical evidence to demonstrate the advantages of our approach. The results illustrate that our method effectively uses synthetic data as a means to mitigate unfairness and preserve client privacy.
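Below is a minimal sketch of the single-shot protocol described in the abstract: each client derives a synthetic dataset locally, uploads it once, and the server trains an ordinary model on the pooled synthetic data. The bilevel optimization that makes the synthetic data fairness-aware is abstracted behind a hypothetical helper (`generate_fair_synthetic_data`); the function name, the subsampling stand-in, and the choice of logistic regression are all illustrative assumptions, not the paper's implementation.

```python
# Sketch of single-shot federated training on client-generated synthetic data.
import numpy as np
from sklearn.linear_model import LogisticRegression

def generate_fair_synthetic_data(X, y, n_synthetic=100, seed=None):
    """Hypothetical stand-in for the paper's bilevel optimization step.

    Here we simply subsample the local data; the actual method optimizes the
    synthetic points so that a model trained on them yields fair predictions.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(n_synthetic, len(X)), replace=False)
    return X[idx], y[idx]

def single_shot_federated_training(client_datasets):
    # Each client runs its local optimization exactly once and uploads only
    # the resulting synthetic dataset (a single round of communication).
    synthetic_parts = [generate_fair_synthetic_data(X, y, seed=i)
                       for i, (X, y) in enumerate(client_datasets)]
    X_syn = np.vstack([Xs for Xs, _ in synthetic_parts])
    y_syn = np.concatenate([ys for _, ys in synthetic_parts])
    # The server trains a conventional model; no fairness-specific
    # aggregation weights are needed.
    return LogisticRegression(max_iter=1000).fit(X_syn, y_syn)

# Toy usage with three synthetic clients.
rng = np.random.default_rng(0)
clients = [(rng.normal(size=(200, 5)), rng.integers(0, 2, size=200)) for _ in range(3)]
server_model = single_shot_federated_training(clients)
```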
Related papers
- Fair Federated Data Clustering through Personalization: Bridging the Gap between Diverse Data Distributions [2.7905216619150344]
We introduce the idea of personalization in federated clustering. The goal is to balance achieving a lower clustering cost with achieving a uniform cost across clients.
We propose p-FClus, which addresses these goals in a single round of communication between the server and clients.
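The snippet above does not spell out the algorithm, so the following is only an illustrative single-round pattern for personalized federated clustering, not the actual p-FClus procedure: clients cluster locally and upload centroids, the server merges them, and each client blends the global result toward its own data. The `personalize` blending rule is an assumption made for illustration.

```python
# Illustrative single-round federated clustering sketch (not p-FClus itself).
import numpy as np
from sklearn.cluster import KMeans

def client_local_clustering(X, k):
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).cluster_centers_

def server_merge(all_centroids, k):
    # Cluster the clients' centroids to obtain k global centroids.
    stacked = np.vstack(all_centroids)
    return KMeans(n_clusters=k, n_init=10, random_state=0).fit(stacked).cluster_centers_

def personalize(X, global_centroids, weight=0.5):
    # Blend global centroids toward the local data mean to trade off uniform
    # (global) cost against each client's own clustering cost.
    return (1 - weight) * global_centroids + weight * X.mean(axis=0)

rng = np.random.default_rng(1)
clients = [rng.normal(loc=i, size=(150, 4)) for i in range(3)]
uploads = [client_local_clustering(X, k=2) for X in clients]   # one round up
global_c = server_merge(uploads, k=2)                          # server step
personal = [personalize(X, global_c) for X in clients]         # local step
```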
arXiv Detail & Related papers (2024-07-05T07:10:26Z) - Personalized federated learning based on feature fusion [2.943623084019036]
Federated learning enables distributed clients to collaborate on training while storing their data locally to protect client privacy.
We propose a personalized federated learning approach called pFedPM.
In our process, we replace traditional gradient uploading with feature uploading, which helps reduce communication costs and allows for heterogeneous client models.
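As a rough sketch of the feature-uploading idea, clients might send compact feature summaries (here, per-class mean feature vectors) instead of gradients, which the server then fuses. The summary statistic and fusion rule below are illustrative assumptions, not the exact pFedPM procedure.

```python
# Sketch: upload per-class feature prototypes instead of gradients.
import numpy as np

def local_feature_summary(features, labels, n_classes):
    # features: (n_samples, d) activations from the client's own model; client
    # models may be heterogeneous as long as they share feature dimension d.
    rows = []
    for c in range(n_classes):
        mask = labels == c
        rows.append(features[mask].mean(axis=0) if mask.any()
                    else np.zeros(features.shape[1]))
    return np.stack(rows)

def server_fuse(summaries, client_sizes):
    # Weighted average of the clients' per-class feature prototypes.
    w = np.asarray(client_sizes, dtype=float)
    w /= w.sum()
    return np.tensordot(w, np.stack(summaries), axes=1)

rng = np.random.default_rng(2)
n_classes, d = 3, 16
summaries, sizes = [], []
for _ in range(4):
    n = int(rng.integers(50, 100))
    feats = rng.normal(size=(n, d))
    labels = rng.integers(0, n_classes, size=n)
    summaries.append(local_feature_summary(feats, labels, n_classes))
    sizes.append(n)
global_prototypes = server_fuse(summaries, sizes)   # shape (n_classes, d)
```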
arXiv Detail & Related papers (2024-06-24T12:16:51Z) - Efficient Cross-Domain Federated Learning by MixStyle Approximation [0.3277163122167433]
We introduce a privacy-preserving, resource-efficient Federated Learning concept for client adaptation in hardware-constrained environments.
Our approach includes server model pre-training on source data and subsequent fine-tuning on target data via low-end clients.
Preliminary results indicate that our method reduces computational and transmission costs while maintaining competitive performance on downstream tasks.
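A minimal sketch of the pre-train-then-fine-tune pattern described above: the server pre-trains on source-domain data, and a resource-constrained client cheaply adapts the model on its target data. The MixStyle approximation itself is omitted, and the linear model and helper names are assumptions for illustration.

```python
# Sketch: server-side pre-training followed by lightweight client fine-tuning.
import numpy as np
from sklearn.linear_model import SGDClassifier

def server_pretrain(X_source, y_source):
    clf = SGDClassifier(loss="log_loss", random_state=0)
    clf.fit(X_source, y_source)
    return clf

def client_finetune(clf, X_target, y_target, epochs=3):
    # partial_fit lets a low-end client refine the server model with a few
    # cheap passes over its local target-domain data.
    for _ in range(epochs):
        clf.partial_fit(X_target, y_target)
    return clf

rng = np.random.default_rng(3)
X_src, y_src = rng.normal(size=(500, 8)), rng.integers(0, 2, size=500)
X_tgt, y_tgt = rng.normal(loc=0.5, size=(80, 8)), rng.integers(0, 2, size=80)
model = client_finetune(server_pretrain(X_src, y_src), X_tgt, y_tgt)
```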
arXiv Detail & Related papers (2023-12-12T08:33:34Z) - Utilizing Free Clients in Federated Learning for Focused Model Enhancement [9.370655190768163]
Federated Learning (FL) is a distributed machine learning approach to learn models on decentralized heterogeneous data.
We present FedALIGN (Federated Adaptive Learning with Inclusion of Global Needs) to address this challenge.
arXiv Detail & Related papers (2023-10-06T18:23:40Z) - DYNAFED: Tackling Client Data Heterogeneity with Global Dynamics [60.60173139258481]
Local training on non-IID distributed data results in deflected local optima.
A natural solution is to gather all client data onto the server, such that the server has a global view of the entire data distribution.
In this paper, we put forth an idea to collect and leverage global knowledge on the server without hindering data privacy.
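The snippet above does not say how that global knowledge is collected, so the following is only one heavily simplified way the idea could be realized: synthesize a small pseudo dataset on the server that the current global model fits well, using only the model and never any client data. This is an illustrative assumption, not DYNAFED's actual procedure.

```python
# Illustrative, data-free synthesis of "global knowledge" on the server.
import numpy as np

def synthesize_pseudo_data(global_w, n_points, d, steps=100, lr=0.05, seed=None):
    """Optimize pseudo inputs so that a linear model with weights global_w maps
    them to fixed pseudo targets; only the global model is used, no client data."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_points, d))
    y = np.sign(rng.normal(size=n_points))            # fixed pseudo labels in {-1, +1}
    for _ in range(steps):
        grad = np.outer(X @ global_w - y, global_w)   # d/dX of 0.5 * ||X w - y||^2
        X -= lr * grad
    return X, y

rng = np.random.default_rng(9)
global_w = rng.normal(size=6)
X_pseudo, y_pseudo = synthesize_pseudo_data(global_w, n_points=32, d=6, seed=0)
# The server can then use (X_pseudo, y_pseudo) as a data-free proxy of global
# knowledge, e.g. to regularize or debias the aggregated model.
```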
arXiv Detail & Related papers (2022-11-20T06:13:06Z) - Optimizing Server-side Aggregation For Robust Federated Learning via Subspace Training [80.03567604524268]
Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning systems.
We propose SmartFL, a generic approach that optimizes the server-side aggregation process.
We provide theoretical analyses of the convergence and generalization capacity for SmartFL.
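To illustrate the spirit of optimizing server-side aggregation, the sketch below treats the aggregation weights over client models as trainable parameters and fits them on a small proxy dataset held by the server. The subspace construction and the proxy-data assumption are simplifications made here, not SmartFL's exact method.

```python
# Sketch: learn aggregation weights over client models on a server proxy set.
import numpy as np

def aggregate(client_weights, alphas):
    # Convex combination of client model parameters.
    a = np.exp(alphas); a /= a.sum()            # softmax keeps weights positive
    return np.tensordot(a, np.stack(client_weights), axes=1)

def server_loss(w, X_val, y_val):
    # Squared error of a linear model on the server's proxy/validation data.
    return np.mean((X_val @ w - y_val) ** 2)

def optimize_aggregation(client_weights, X_val, y_val, steps=200, lr=0.1, eps=1e-4):
    alphas = np.zeros(len(client_weights))
    for _ in range(steps):
        # Finite-difference gradient over the few aggregation parameters.
        base = server_loss(aggregate(client_weights, alphas), X_val, y_val)
        grad = np.zeros_like(alphas)
        for i in range(len(alphas)):
            shifted = alphas.copy(); shifted[i] += eps
            grad[i] = (server_loss(aggregate(client_weights, shifted), X_val, y_val) - base) / eps
        alphas -= lr * grad
    return aggregate(client_weights, alphas)

rng = np.random.default_rng(4)
true_w = rng.normal(size=5)
X_val = rng.normal(size=(64, 5)); y_val = X_val @ true_w
clients = [true_w + rng.normal(scale=s, size=5) for s in (0.1, 0.1, 2.0)]  # one noisy/poisoned client
w_agg = optimize_aggregation(clients, X_val, y_val)   # down-weights the bad client
```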
arXiv Detail & Related papers (2022-11-10T13:20:56Z) - Straggler-Resilient Personalized Federated Learning [55.54344312542944]
Federated learning allows training models from samples distributed across a large network of clients while respecting privacy and communication restrictions.
We develop a novel algorithmic procedure with theoretical speedup guarantees that simultaneously handles two of these hurdles.
Our method relies on ideas from representation learning theory to find a global common representation using all clients' data and learn a user-specific set of parameters leading to a personalized solution for each client.
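The following sketch illustrates the shared-representation idea in its simplest linear form: all clients share a common projection matrix while each client keeps its own low-dimensional head, updated in an alternating fashion. It is an illustrative toy, not the paper's algorithm or its speedup analysis.

```python
# Sketch: shared linear representation B plus per-client personalized heads.
import numpy as np

def fit_personal_head(B, X, y):
    # Least-squares head on top of the shared representation X @ B.
    Z = X @ B
    return np.linalg.lstsq(Z, y, rcond=None)[0]

def update_shared_representation(B, clients, heads, lr=0.01):
    # One gradient step on the shared representation using all clients' data.
    grad = np.zeros_like(B)
    for (X, y), h in zip(clients, heads):
        residual = X @ B @ h - y
        grad += X.T @ np.outer(residual, h) / len(y)
    return B - lr * grad / len(clients)

rng = np.random.default_rng(5)
d, k = 10, 3
clients = [(rng.normal(size=(120, d)), rng.normal(size=120)) for _ in range(4)]
B = rng.normal(size=(d, k))
for _ in range(50):                                             # federated rounds
    heads = [fit_personal_head(B, X, y) for X, y in clients]    # local, personal
    B = update_shared_representation(B, clients, heads)         # shared, global
```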
arXiv Detail & Related papers (2022-06-05T01:14:46Z) - Federated Multi-Target Domain Adaptation [99.93375364579484]
Federated learning methods enable us to train machine learning models on distributed user data while preserving its privacy.
We consider a more practical scenario where the distributed client data is unlabeled, and a centralized labeled dataset is available on the server.
We propose an effective DualAdapt method to address the new challenges.
arXiv Detail & Related papers (2021-08-17T17:53:05Z) - Towards Fair Federated Learning with Zero-Shot Data Augmentation [123.37082242750866]
Federated learning has emerged as an important distributed learning paradigm, where a server aggregates a global model from many client-trained models while having no access to the client data.
We propose a novel federated learning system that employs zero-shot data augmentation on under-represented data to mitigate statistical heterogeneity and encourage more uniform accuracy performance across clients in federated networks.
We study two variants of this scheme, Fed-ZDAC (federated learning with zero-shot data augmentation at the clients) and Fed-ZDAS (federated learning with zero-shot data augmentation at the server).
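The sketch below contrasts the two placements named above: augmenting under-represented data at the clients (Fed-ZDAC) versus at the server (Fed-ZDAS). The zero-shot augmentation itself is abstracted into a hypothetical helper, and the toy "training" step is an assumption for illustration only.

```python
# Sketch: where zero-shot augmentation happens (clients vs. server) in FedAvg.
import numpy as np

def zero_shot_augment(model_weights, n_samples, d, rng):
    """Hypothetical stand-in: synthesize pseudo-samples from a model alone
    (no real data), e.g. by inverting its statistics."""
    return rng.normal(size=(n_samples, d))

def local_update(weights, X, lr=0.1):
    # Toy 'training': nudge the weights toward the local data mean.
    return weights + lr * (X.mean(axis=0) - weights)

def fedavg(updates, sizes):
    w = np.asarray(sizes, float); w /= w.sum()
    return np.tensordot(w, np.stack(updates), axes=1)

rng = np.random.default_rng(6)
d, global_w = 8, np.zeros(8)
clients = [rng.normal(loc=i, size=(30 * (i + 1), d)) for i in range(3)]

# Fed-ZDAC: each client augments its under-represented data before training.
updates = [local_update(global_w, np.vstack([X, zero_shot_augment(global_w, 20, d, rng)]))
           for X in clients]
w_zdac = fedavg(updates, [len(X) + 20 for X in clients])

# Fed-ZDAS: the server augments after aggregation, using only the model.
w_zdas = local_update(fedavg([local_update(global_w, X) for X in clients],
                             [len(X) for X in clients]),
                      zero_shot_augment(global_w, 60, d, rng))
```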
arXiv Detail & Related papers (2021-04-27T18:23:54Z) - Toward Understanding the Influence of Individual Clients in Federated Learning [52.07734799278535]
Federated learning allows clients to jointly train a global model without sending their private data to a central server.
We define a new notion called Influence, quantify this influence over the model parameters, and propose an effective and efficient method to estimate this metric.
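As a simple illustrative proxy for such a metric (not the paper's estimator), one can measure how much the aggregated parameters change when a client's update is left out:

```python
# Leave-one-out proxy for a client's influence on the aggregated model.
import numpy as np

def fedavg(updates, sizes):
    w = np.asarray(sizes, float); w /= w.sum()
    return np.tensordot(w, np.stack(updates), axes=1)

def client_influence(updates, sizes):
    full = fedavg(updates, sizes)
    scores = []
    for i in range(len(updates)):
        rest_u = updates[:i] + updates[i + 1:]
        rest_s = sizes[:i] + sizes[i + 1:]
        scores.append(np.linalg.norm(full - fedavg(rest_u, rest_s)))
    return scores

rng = np.random.default_rng(7)
updates = [rng.normal(size=20) for _ in range(5)]
sizes = [100, 100, 100, 100, 1000]      # the last client dominates the average
print(client_influence(updates, sizes)) # largest score for the dominant client
```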
arXiv Detail & Related papers (2020-12-20T14:34:36Z) - Shuffled Model of Federated Learning: Privacy, Communication and Accuracy Trade-offs [30.58690911428577]
We consider a distributed empirical risk minimization (ERM) optimization problem with communication efficiency and privacy requirements.
We develop (optimal) communication-efficient schemes for private mean estimation for several $\ell_p$ spaces.
We demonstrate that one can achieve the same privacy and optimization-performance operating point as recent methods that use full-precision communication.
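A minimal sketch of the shuffled model for private mean estimation follows: each client applies a local randomizer, a shuffler removes the link between clients and their reports, and the server simply averages. Plain Gaussian noise stands in for the paper's (optimal) communication-efficient randomizers, so the privacy accounting and quantization are deliberately omitted.

```python
# Sketch: local randomization + shuffling + server-side mean estimation.
import numpy as np

def local_randomizer(x, sigma, rng):
    return x + rng.normal(scale=sigma, size=x.shape)

def shuffle(reports, rng):
    reports = list(reports)
    rng.shuffle(reports)        # server sees reports without client identities
    return reports

def server_estimate(reports):
    return np.mean(reports, axis=0)

rng = np.random.default_rng(8)
data = [rng.uniform(-1, 1, size=4) for _ in range(1000)]
reports = shuffle([local_randomizer(x, sigma=1.0, rng=rng) for x in data], rng)
estimate = server_estimate(reports)     # close to the true mean despite the noise
```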
arXiv Detail & Related papers (2020-08-17T09:41:04Z)