Differentially Private Vertical Federated Learning
- URL: http://arxiv.org/abs/2211.06782v1
- Date: Sun, 13 Nov 2022 01:24:38 GMT
- Title: Differentially Private Vertical Federated Learning
- Authors: Thilina Ranbaduge and Ming Ding
- Abstract summary: In this paper, we aim to explore how to protect the privacy of individual organisations' data in a differential privacy (DP) setting.
Our results show that a trade-off point needs to be found to balance vertical FL performance against privacy protection.
- Score: 14.690310701654827
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A successful machine learning (ML) algorithm often relies on a large amount of high-quality data to train well-performing models. Supervised learning approaches, such as deep learning techniques, can produce high-quality ML models for real-life applications, but at a large cost in the human effort needed to label training data. Recent advancements in federated learning (FL) allow multiple data owners or organisations to collaboratively train a machine learning model without sharing raw data. In this light, vertical FL allows organisations to build a global model when the participating organisations hold vertically partitioned data. Further, in the vertical FL setting the participating organisations generally require fewer resources than direct data sharing would, enabling lightweight and scalable distributed training solutions. However, privacy protection in vertical FL is challenging due to the communication of intermediate outputs and the gradients of model updates, which invites adversarial entities to infer other organisations' underlying data. Thus, in this paper, we aim to explore how to protect the privacy of individual organisations' data in a differential privacy (DP) setting. We run experiments with different real-world datasets and DP budgets. Our experimental results show that a trade-off point needs to be found to balance vertical FL performance against privacy protection in terms of the amount of perturbation noise.
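The following is a minimal Python sketch, not the authors' implementation, of the mechanism the abstract describes: an organisation clips its per-record intermediate outputs and perturbs them with Gaussian noise calibrated to an (ε, δ) budget before communicating them. The `LocalEncoder` class, the `gaussian_sigma` helper, and all parameter values are illustrative assumptions.

```python
# A sketch of differentially private vertical FL: each organisation holds a
# disjoint feature subset, computes an intermediate embedding with a local
# sub-model, and perturbs it with Gaussian noise before sending it onward.
import numpy as np


def gaussian_sigma(epsilon: float, delta: float, sensitivity: float) -> float:
    """Classic Gaussian-mechanism noise scale for (epsilon, delta)-DP.

    Valid for epsilon in (0, 1); see Dwork & Roth (2014), Theorem 3.22.
    """
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon


class LocalEncoder:
    """One organisation's private sub-model: a single linear layer."""

    def __init__(self, in_dim: int, out_dim: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(0.0, 0.1, size=(in_dim, out_dim))

    def forward(self, x: np.ndarray) -> np.ndarray:
        return x @ self.W


def dp_embedding(encoder, x, clip_norm, epsilon, delta, rng):
    """Clip each per-record embedding to bound sensitivity, then add noise."""
    z = encoder.forward(x)
    norms = np.linalg.norm(z, axis=1, keepdims=True)
    z = z / np.maximum(1.0, norms / clip_norm)       # per-record L2 clipping
    sigma = gaussian_sigma(epsilon, delta, sensitivity=clip_norm)
    return z + rng.normal(0.0, sigma, size=z.shape)  # perturbation noise


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x_party_a = rng.normal(size=(8, 5))  # party A's vertical feature slice
    enc = LocalEncoder(in_dim=5, out_dim=3)
    # Smaller epsilon -> more noise -> stronger privacy, lower utility:
    for eps in (0.1, 0.5, 0.9):
        z_noisy = dp_embedding(enc, x_party_a, clip_norm=1.0,
                               epsilon=eps, delta=1e-5, rng=rng)
        print(f"eps={eps}: embedding std = {z_noisy.std():.2f}")
```

Clipping bounds each record's L2 sensitivity so the noise scale can be calibrated from the DP budget; sweeping ε in the final loop illustrates the trade-off the abstract reports, since smaller budgets require more perturbation noise at the cost of downstream model utility.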
Related papers
- CDFL: Efficient Federated Human Activity Recognition using Contrastive Learning and Deep Clustering [12.472038137777474]
Human Activity Recognition (HAR) is vital for the automation and intelligent identification of human actions through data from diverse sensors.
Traditional machine learning approaches, which aggregate data on a central server for centralized processing, are memory-intensive and raise privacy concerns.
This work proposes CDFL, an efficient federated learning framework for image-based HAR.
arXiv Detail & Related papers (2024-07-17T03:17:53Z)
- Vertical Federated Learning Hybrid Local Pre-training [4.31644387824845]
We propose a novel VFL Hybrid Local Pre-training (VFLHLP) approach for Vertical Federated Learning (VFL).
VFLHLP first pre-trains local networks on the local data of participating parties.
Then it utilizes these pre-trained networks to adjust the sub-model for the labeled party or enhance representation learning for other parties during downstream federated learning on aligned data.
arXiv Detail & Related papers (2024-05-20T08:57:39Z)
- A Survey on Efficient Federated Learning Methods for Foundation Model Training [62.473245910234304]
Federated Learning (FL) has become an established technique to facilitate privacy-preserving collaborative training across a multitude of clients.
In the wake of Foundation Models (FM), however, the reality is different for many deep learning applications.
We discuss the benefits and drawbacks of parameter-efficient fine-tuning (PEFT) for FL applications.
arXiv Detail & Related papers (2024-01-09T10:22:23Z)
- Tunable Soft Prompts are Messengers in Federated Learning [55.924749085481544]
Federated learning (FL) enables multiple participants to collaboratively train machine learning models using decentralized data sources.
The lack of model privacy protection in FL has become a challenge that cannot be neglected.
We propose a novel FL training approach that accomplishes information exchange among participants via tunable soft prompts.
arXiv Detail & Related papers (2023-11-12T11:01:10Z)
- Can Public Large Language Models Help Private Cross-device Federated Learning? [58.05449579773249]
We study (differentially) private federated learning (FL) of language models.
Public data has been used to improve privacy-utility trade-offs for both large and small language models.
We propose a novel distribution matching algorithm with theoretical grounding to sample public data close to the private data distribution.
arXiv Detail & Related papers (2023-05-20T07:55:58Z)
- FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
- CoFED: Cross-silo Heterogeneous Federated Multi-task Learning via Co-training [11.198612582299813]
Federated Learning (FL) is a machine learning technique that enables participants to train high-quality models collaboratively without exchanging their private data.
We propose a communication-efficient FL scheme, CoFED, based on pseudo-labeling unlabeled data like co-training.
Experimental results show that CoFED achieves better performance with a lower communication cost.
arXiv Detail & Related papers (2022-02-17T11:34:20Z)
- Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization [57.98426940386627]
We show that coordinating local learning with private centralized learning yields a generically useful and improved tradeoff between accuracy and privacy.
We illustrate our theoretical results with experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-02-10T20:44:44Z)
- Personalized Semi-Supervised Federated Learning for Human Activity Recognition [1.9014535120129343]
We propose FedHAR, a novel hybrid method for human activity recognition.
FedHAR combines semi-supervised and federated learning.
We show that FedHAR reaches recognition rates and personalization capabilities similar to state-of-the-art FL supervised approaches.
arXiv Detail & Related papers (2021-04-15T10:24:18Z)
- Privacy-Preserving Self-Taught Federated Learning for Heterogeneous Data [6.545317180430584]
Federated learning (FL) was proposed to enable joint training of a deep learning model using the local data in each party without revealing the data to others.
In this work, we propose an FL method called self-taught federated learning to address the aforementioned issues.
In this method, only latent variables are transmitted to other parties for model training; privacy is preserved by keeping the data and the parameters of activations, weights, and biases local.
arXiv Detail & Related papers (2021-02-11T08:07:51Z)
- WAFFLe: Weight Anonymized Factorization for Federated Learning [88.44939168851721]
In domains where data are sensitive or private, there is great value in methods that can learn in a distributed manner without the data ever leaving the local devices.
We propose Weight Anonymized Factorization for Federated Learning (WAFFLe), an approach that combines the Indian Buffet Process with a shared dictionary of weight factors for neural networks.
arXiv Detail & Related papers (2020-08-13T04:26:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.