How Much Privacy Does Federated Learning with Secure Aggregation Guarantee?
- URL: http://arxiv.org/abs/2208.02304v1
- Date: Wed, 3 Aug 2022 18:44:17 GMT
- Title: How Much Privacy Does Federated Learning with Secure Aggregation Guarantee?
- Authors: Ahmed Roushdy Elkordy, Jiang Zhang, Yahya H. Ezzeldin, Konstantinos Psounis, Salman Avestimehr
- Abstract summary: Federated learning (FL) has attracted growing interest for enabling privacy-preserving machine learning on data stored at multiple users.
While data never leaves users' devices, privacy still cannot be guaranteed since significant computations on users' training data are shared in the form of trained local models.
Secure Aggregation (SA) has been developed as a framework to preserve privacy in FL.
- Score: 22.7443077369789
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning (FL) has attracted growing interest for enabling
privacy-preserving machine learning on data stored at multiple users while
avoiding moving the data off-device. However, while data never leaves users'
devices, privacy still cannot be guaranteed since significant computations on
users' training data are shared in the form of trained local models. These
local models have recently been shown to pose a substantial privacy threat
through different privacy attacks such as model inversion attacks. As a remedy,
Secure Aggregation (SA) has been developed as a framework to preserve privacy
in FL, by guaranteeing the server can only learn the global aggregated model
update but not the individual model updates. While SA ensures no additional
information is leaked about the individual model update beyond the aggregated
model update, there are no formal guarantees on how much privacy FL with SA can
actually offer, since information about the individual dataset can still
potentially leak through the aggregated model computed at the server. In this
work, we perform a first analysis of the formal privacy guarantees for FL with
SA. Specifically, we use Mutual Information (MI) as a quantification metric and
derive upper bounds on how much information about each user's dataset can leak
through the aggregated model update. When using the FedSGD aggregation
algorithm, our theoretical bounds show that the amount of privacy leakage
reduces linearly with the number of users participating in FL with SA. To
validate our theoretical bounds, we use an MI Neural Estimator to empirically
evaluate the privacy leakage under different FL setups on both the MNIST and
CIFAR10 datasets. Our experiments verify our theoretical bounds for FedSGD,
which show a reduction in privacy leakage as the number of users and local
batch size grow, and an increase in privacy leakage with the number of training
rounds.
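
To make the analyzed setting concrete, the sketch below illustrates a single FedSGD round with secure aggregation: each user hides its local update behind pairwise random masks that cancel in the sum, so the server recovers only the aggregate and never an individual update. This is a minimal illustrative toy under assumptions (additive-mask construction, toy field size, random stand-in updates); it is not the paper's protocol or code.

    # Toy FedSGD round with additive-mask secure aggregation.
    # Illustrative only: the masking scheme, field size, and update model
    # are assumptions, not the paper's implementation.
    import numpy as np

    FIELD = 2**31 - 1   # toy modulus for masking
    DIM = 4             # toy model dimension

    def local_update(rng, dim=DIM):
        # Stand-in for one user's FedSGD gradient on a local mini-batch.
        return rng.integers(0, 1000, size=dim)

    def pairwise_masks(num_users, dim, seed=0):
        # Pairwise masks r_ij with r_ji = -r_ij, so all masks cancel in the sum.
        rng = np.random.default_rng(seed)
        masks = np.zeros((num_users, dim), dtype=np.int64)
        for i in range(num_users):
            for j in range(i + 1, num_users):
                r = rng.integers(0, FIELD, size=dim)
                masks[i] = (masks[i] + r) % FIELD
                masks[j] = (masks[j] - r) % FIELD
        return masks

    def secure_aggregate(updates, masks):
        # The server only sees masked updates; their sum equals the true sum.
        masked = [(u + m) % FIELD for u, m in zip(updates, masks)]
        return sum(masked) % FIELD

    rng = np.random.default_rng(1)
    num_users = 8
    updates = [local_update(rng) for _ in range(num_users)]
    aggregate = secure_aggregate(updates, pairwise_masks(num_users, DIM))
    assert np.array_equal(aggregate, sum(updates) % FIELD)
    # The paper's question: how much mutual information does this aggregate
    # (the only quantity the server learns) still carry about one user's
    # dataset? For FedSGD, their bound shrinks as the number of users grows.

In the paper, the leakage of this aggregate about an individual dataset is quantified with mutual information and estimated empirically with an MI neural estimator; the sketch only shows why the server's view collapses to the sum of the updates.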
Related papers
- FT-PrivacyScore: Personalized Privacy Scoring Service for Machine Learning Participation [4.772368796656325]
In practice, controlled data access remains a mainstream method for protecting data privacy in many industrial and research environments.
We developed the demo prototype FT-PrivacyScore to show that it's possible to efficiently and quantitatively estimate the privacy risk of participating in a model fine-tuning task.
arXiv Detail & Related papers (2024-10-30T02:41:26Z)
- DMM: Distributed Matrix Mechanism for Differentially-Private Federated Learning using Packed Secret Sharing [51.336015600778396]
Federated Learning (FL) has gained lots of traction recently, both in industry and academia.
In FL, a machine learning model is trained using data from various end-users arranged in committees across several rounds.
Since such data can often be sensitive, a primary challenge in FL is providing privacy while still retaining utility of the model.
arXiv Detail & Related papers (2024-10-21T16:25:14Z)
- PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data.
The transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates.
We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z)
- Differentially Private Federated Learning without Noise Addition: When is it Possible? [16.49898177547646]
Federated Learning with Secure Aggregation (SA) has gained significant attention as a privacy preserving framework for training machine learning models.
Recent research has extended privacy guarantees of FL with SA by bounding the information leakage through the aggregate model over multiple training rounds.
We study the conditions under which FL with SA can provide worst-case differential privacy guarantees.
arXiv Detail & Related papers (2024-05-06T03:19:24Z)
- FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering [2.2194815687410627]
We show how a malicious client can leak the privacy-sensitive data of some other users in FL even without any cooperation from the server.
Our best-performing method improves the membership inference recall by 29% and achieves up to 71% private data reconstruction.
arXiv Detail & Related papers (2023-10-24T19:50:01Z)
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
- Do Gradient Inversion Attacks Make Federated Learning Unsafe? [70.0231254112197]
Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.
Recent works on the inversion of deep neural networks from model gradients raised concerns about the security of FL in preventing the leakage of training data.
In this work, we show that these attacks presented in the literature are impractical in real FL use-cases and provide a new baseline attack.
arXiv Detail & Related papers (2022-02-14T18:33:12Z)
- BEAS: Blockchain Enabled Asynchronous & Secure Federated Machine Learning [0.0]
We present BEAS, the first blockchain-based framework for N-party Federated Learning.
It provides strict privacy guarantees of training data using gradient pruning.
Anomaly detection protocols are used to minimize the risk of data-poisoning attacks.
We also define a novel protocol to prevent premature convergence in heterogeneous learning environments.
arXiv Detail & Related papers (2022-02-06T17:11:14Z)
- Robbing the Fed: Directly Obtaining Private Data in Federated Learning with Modified Models [56.0250919557652]
Federated learning has quickly gained popularity with its promises of increased user privacy and efficiency.
Previous attacks on user privacy have been limited in scope and do not scale to gradient updates aggregated over even a handful of data points.
We introduce a new threat model based on minimal but malicious modifications of the shared model architecture.
arXiv Detail & Related papers (2021-10-25T15:52:06Z)
- LDP-FL: Practical Private Aggregation in Federated Learning with Local Differential Privacy [20.95527613004989]
Federated learning is a popular approach for privacy protection that collects the local gradient information instead of real data.
Previous works do not give a practical solution due to three issues; in particular, the privacy budget explodes due to the high dimensionality of weights in deep learning models.
arXiv Detail & Related papers (2020-07-31T01:08:57Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.