Fed-TGAN: Federated Learning Framework for Synthesizing Tabular Data
- URL: http://arxiv.org/abs/2108.07927v1
- Date: Wed, 18 Aug 2021 01:47:36 GMT
- Title: Fed-TGAN: Federated Learning Framework for Synthesizing Tabular Data
- Authors: Zilong Zhao, Robert Birke, Aditya Kunar, Lydia Y. Chen
- Abstract summary: We propose Fed-TGAN, the first Federated learning framework for Tabular GANs.
To effectively learn a complex GAN on non-identical participants, Fed-TGAN introduces two novel features.
Results show that Fed-TGAN speeds up training per epoch by up to 200%.
- Score: 8.014848609114154
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) are typically trained to synthesize
data, from images and more recently tabular data, under the assumption of
directly accessible training data. Federated learning (FL) has recently emerged as a paradigm that features decentralized learning on clients' local data with privacy-preserving capabilities. While learning GANs to synthesize images on FL systems has just been demonstrated, it is unknown whether GANs for tabular data can be learned from decentralized data sources. Moreover, it
remains unclear which distributed architecture suits them best. Different from
image GANs, state-of-the-art tabular GANs require prior knowledge of the data
distribution of each (discrete and continuous) column to agree on a common
encoding -- risking privacy guarantees. In this paper, we propose Fed-TGAN, the
first Federated learning framework for Tabular GANs. To effectively learn a
complex tabular GAN on non-identical participants, Fed-TGAN designs two novel
features: (i) a privacy-preserving multi-source feature encoding for model
initialization; and (ii) table similarity aware weighting strategies to
aggregate local models for countering data skew. We extensively evaluate the
proposed Fed-TGAN against variants of decentralized learning architectures on
four widely used datasets. Results show that Fed-TGAN speeds up training per epoch by up to 200% compared to the alternative architectures, for both IID
and Non-IID data. Overall, Fed-TGAN not only stabilizes the training loss, but
also achieves better similarity between generated and original data.
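As a concrete illustration of feature (ii), below is a minimal sketch of how a table-similarity-aware aggregation weight could be computed: per-column divergences between each client's table and a merged global view, inverted and normalized so that clients whose tables look more representative contribute more to the aggregated model. The specific divergence choices (Jensen-Shannon for categorical columns, Wasserstein for continuous ones) and all function names are illustrative assumptions, not details confirmed by the abstract.

```python
# Illustrative sketch (not Fed-TGAN's confirmed algorithm): weight clients
# by how similar their tables look to a merged, privacy-preserving global view.
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import wasserstein_distance

def table_divergence(client_table, global_table, col_types):
    """Average per-column divergence between a client table and the global view.

    Categorical columns are dicts mapping category -> frequency;
    continuous columns are 1-D arrays of samples.
    """
    divs = []
    for name, kind in col_types.items():
        if kind == "categorical":
            # Align category frequencies on a shared support.
            support = sorted(set(client_table[name]) | set(global_table[name]))
            p = np.array([client_table[name].get(c, 0.0) for c in support])
            q = np.array([global_table[name].get(c, 0.0) for c in support])
            divs.append(jensenshannon(p, q))  # scipy normalizes p, q internally
        else:  # continuous column: compare the two sample sets directly
            divs.append(wasserstein_distance(client_table[name], global_table[name]))
    return float(np.mean(divs))

def aggregation_weights(divergences, eps=1e-8):
    """Smaller divergence -> larger weight; weights sum to 1."""
    inv = 1.0 / (np.asarray(divergences) + eps)
    return inv / inv.sum()

# Usage: weights = aggregation_weights(
#     [table_divergence(t, global_view, col_types) for t in client_tables])
# then aggregate local model parameters with these weights instead of
# plain size-proportional averaging.
```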
Related papers
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on effectively utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round (this classical pattern is sketched below).
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
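For context, the aggregate-then-adapt pattern that FedAF departs from is, in its simplest form, FedAvg-style weighted parameter averaging. A minimal, illustrative sketch of that classical server step (this is not FedAF's own algorithm):

```python
# Minimal FedAvg-style "aggregate-then-adapt" server step (illustrative).
import torch

def fedavg_aggregate(client_states, client_sizes):
    """Average client state_dicts, weighted by local dataset size."""
    total = float(sum(client_sizes))
    avg = {k: torch.zeros_like(v, dtype=torch.float32)
           for k, v in client_states[0].items()}
    for state, n in zip(client_states, client_sizes):
        for k, v in state.items():
            avg[k] += v.float() * (n / total)
    return avg

# Each round, clients load this aggregated state, adapt it on local data,
# and send back updated states for the next round's aggregation.
```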
- FLIGAN: Enhancing Federated Learning with Incomplete Data using GAN [1.5749416770494706]
Federated Learning (FL) provides a privacy-preserving mechanism for distributed training of machine learning models on networked devices.
We propose FLIGAN, a novel approach to address the issue of data incompleteness in FL.
Our methodology adheres to FL's privacy requirements by generating synthetic data in a federated manner without sharing the actual data in the process.
arXiv Detail & Related papers (2024-03-25T16:49:38Z)
- Fake It Till Make It: Federated Learning with Consensus-Oriented Generation [52.82176415223988]
We propose federated learning with consensus-oriented generation (FedCOG).
FedCOG consists of two key components at the client side: complementary data generation and knowledge-distillation-based model training.
Experiments on classical and real-world FL datasets show that FedCOG consistently outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-12-10T18:49:59Z)
- Federated Learning Empowered by Generative Content [55.576885852501775]
Federated learning (FL) enables leveraging distributed private data for model training in a privacy-preserving way.
We propose a novel FL framework termed FedGC, designed to mitigate data heterogeneity issues by diversifying private data with generative content.
We conduct a systematic empirical study on FedGC, covering diverse baselines, datasets, scenarios, and modalities.
arXiv Detail & Related papers (2023-12-10T07:38:56Z)
- PS-FedGAN: An Efficient Federated Learning Framework Based on Partially Shared Generative Adversarial Networks For Data Privacy [56.347786940414935]
Federated Learning (FL) has emerged as an effective learning paradigm for distributed computation.
This work proposes a novel FL framework that requires only partial GAN model sharing.
Named PS-FedGAN, this new framework enhances the GAN releasing and training mechanism to address heterogeneous data distributions.
arXiv Detail & Related papers (2023-05-19T05:39:40Z)
- GTV: Generating Tabular Data via Vertical Federated Learning [20.683314367860532]
We propose GTV, a vertical federated learning (VFL) framework for Generative Adversarial Networks (GANs).
GTV introduces a unique distributed training architecture that lets the generator and discriminator access training data in a privacy-preserving manner.
Results show that GTV can consistently generate high-fidelity synthetic data of quality comparable to that generated by a centralized GAN algorithm.
arXiv Detail & Related papers (2023-02-03T13:04:12Z)
- Rethinking Data Heterogeneity in Federated Learning: Introducing a New Notion and Standard Benchmarks [65.34113135080105]
We show that data heterogeneity in current setups is not necessarily a problem and can in fact be beneficial for the FL participants.
Our observations are intuitive.
Our code is available at https://github.com/MMorafah/FL-SC-NIID.
arXiv Detail & Related papers (2022-09-30T17:15:19Z)
- Federated Learning with GAN-based Data Synthesis for Non-IID Clients [8.304185807036783]
Federated learning (FL) has recently emerged as a popular privacy-preserving collaborative learning paradigm.
We propose a novel framework, named Synthetic Data Aided Federated Learning (SDA-FL), to resolve this non-IID challenge by sharing synthetic data.
arXiv Detail & Related papers (2022-06-11T11:43:25Z)
- Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning [22.310090483499035]
Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server.
Most existing FL algorithms require models of identical architecture to be deployed across the clients and server.
We propose a novel ensemble knowledge transfer method named Fed-ET in which small models are trained on clients and then used to train a larger model at the server (a toy version of this transfer is sketched below).
arXiv Detail & Related papers (2022-04-27T05:18:32Z)
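The core transfer direction in Fed-ET, distilling an ensemble of small client models into a larger server model, can be sketched as a plain ensemble-distillation loop over unlabeled proxy data. Fed-ET's actual weighted consensus and regularization are richer; the consensus here is a simple average, which is an assumption.

```python
# Toy ensemble-distillation loop: small client models teach a larger
# server model on unlabeled proxy data (simplified vs. Fed-ET itself).
import torch
import torch.nn.functional as F

def distill_ensemble(server_model, client_models, proxy_loader, optimizer, T=2.0):
    server_model.train()
    for x in proxy_loader:
        with torch.no_grad():
            # Ensemble consensus: average the small models' soft predictions.
            teacher = torch.stack(
                [F.softmax(m(x) / T, dim=-1) for m in client_models]
            ).mean(dim=0)
        student = F.log_softmax(server_model(x) / T, dim=-1)
        loss = F.kl_div(student, teacher, reduction="batchmean") * (T * T)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```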
- FedH2L: Federated Learning with Model and Statistical Heterogeneity [75.61234545520611]
Federated learning (FL) enables distributed participants to collectively learn a strong global model without sacrificing their individual data privacy.
We introduce FedH2L, which is agnostic to the model architecture and robust to different data distributions across participants.
In contrast to approaches sharing parameters or gradients, FedH2L relies on mutual distillation, exchanging only posteriors on a shared seed set between participants in a decentralized manner (see the sketch below).
arXiv Detail & Related papers (2021-01-27T10:10:18Z)
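A minimal sketch of mutual distillation over a shared seed set, in the spirit of FedH2L: each participant updates its own (arbitrary-architecture) model against peers' posteriors on the seeds, and only posteriors ever leave a participant. The exact loss form and temperature are assumptions; FedH2L's formulation may differ.

```python
# Sketch of one participant's mutual-distillation update: only posteriors
# on a shared seed set are exchanged, never parameters or gradients.
import torch
import torch.nn.functional as F

def local_distill_step(model, seed_x, peer_posteriors, optimizer, T=1.0):
    """Update one participant's model toward peers' posteriors on the seeds."""
    log_p = F.log_softmax(model(seed_x) / T, dim=-1)
    loss = torch.stack(
        [F.kl_div(log_p, q, reduction="batchmean") for q in peer_posteriors]
    ).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Shared back to peers: F.softmax(model(seed_x), dim=-1).detach()
    return loss.item()
```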