A Quantitative Metric for Privacy Leakage in Federated Learning
- URL: http://arxiv.org/abs/2102.13472v1
- Date: Wed, 24 Feb 2021 02:48:35 GMT
- Title: A Quantitative Metric for Privacy Leakage in Federated Learning
- Authors: Yong Liu, Xinghua Zhu, Jianzong Wang, Jing Xiao
- Abstract summary: We propose a quantitative metric based on mutual information for clients to evaluate the potential risk of information leakage in their gradients.
It is proven that the risk of information leakage is related to the status of the task model, as well as the inherent data distribution.
- Score: 22.968763654455298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the federated learning system, parameter gradients are shared among
participants and the central modulator, while the original data never leave
their protected source domain. However, the gradient itself might carry enough
information for precise inference of the original data. By reporting their
parameter gradients to the central server, client datasets are exposed to
inference attacks from adversaries. In this paper, we propose a quantitative
metric based on mutual information for clients to evaluate the potential risk
of information leakage in their gradients. Mutual information has received
increasing attention in the machine learning and data mining community over the
past few years. However, existing mutual information estimation methods cannot
handle high-dimensional variables. In this paper, we propose a novel method to
approximate the mutual information between the high-dimensional gradients and
batched input data. Experimental results show that the proposed metric reliably
reflects the extent of information leakage in federated learning. In addition,
using the proposed metric, we investigate the influential factors of risk
level. It is proven that the risk of information leakage is related to the
status of the task model, as well as the inherent data distribution.
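The abstract describes the metric only at a high level (a mutual-information score between the shared gradients and the batched input data) and does not give the approximation itself. As a rough sketch of the idea, the snippet below scores leakage with a generic MINE-style Donsker-Varadhan estimator in PyTorch; the names StatisticsNet, dv_lower_bound, and leakage_score are hypothetical, and the pooling/projection remark is an assumption, not the authors' method.

```python
# Illustrative sketch only: the paper's own approximation for high-dimensional
# gradients is not specified in the abstract, so a generic MINE-style
# Donsker-Varadhan lower bound is used as a stand-in. All names are hypothetical.
import math

import torch
import torch.nn as nn


class StatisticsNet(nn.Module):
    """Critic T(x, g) for the Donsker-Varadhan bound on I(inputs; gradients)."""

    def __init__(self, x_dim: int, g_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + g_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, g], dim=-1)).squeeze(-1)


def dv_lower_bound(critic: StatisticsNet, x: torch.Tensor, g: torch.Tensor) -> torch.Tensor:
    """Donsker-Varadhan bound on I(X; G) from N paired samples.

    x: (N, x_dim) flattened input batches; g: (N, g_dim) flattened gradients
    computed from the corresponding batches. Both are very high-dimensional in
    practice, so a pooling or random-projection step is usually applied first
    (an assumption here, not the paper's method).
    """
    joint = critic(x, g).mean()                    # E_{p(x,g)}[T]
    g_perm = g[torch.randperm(g.size(0))]          # break the pairing -> p(x)p(g)
    marginal = critic(x, g_perm)
    log_mean_exp = torch.logsumexp(marginal, dim=0) - math.log(marginal.size(0))
    return joint - log_mean_exp                    # higher value -> more leakage


def leakage_score(critic: StatisticsNet, pairs, steps: int = 300, lr: float = 1e-4) -> float:
    """Train the critic on (input batch, gradient) pairs and report the bound."""
    opt = torch.optim.Adam(critic.parameters(), lr=lr)
    for _, (x, g) in zip(range(steps), pairs):
        loss = -dv_lower_bound(critic, x, g)       # maximize the bound
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        x, g = next(iter(pairs))
        return dv_lower_bound(critic, x, g).item()
```

A higher trained bound suggests that the shared gradients carry more recoverable information about the inputs; a faithful reproduction would substitute the paper's own approximation for the high-dimensional case.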
Related papers
- Defending Against Data Reconstruction Attacks in Federated Learning: An Information Theory Approach [21.03960608358235]
Federated Learning (FL) trains a black-box and high-dimensional model among different clients by exchanging parameters instead of direct data sharing.
FL still suffers from membership inference attacks (MIA) or data reconstruction attacks (DRA).
arXiv Detail & Related papers (2024-03-02T17:12:32Z)
- Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without directly sharing data with a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), which is a novel attack method that can be launched from clients.
arXiv Detail & Related papers (2023-09-14T03:48:27Z)
- Federated Stochastic Gradient Descent Begets Self-Induced Momentum [151.4322255230084]
Federated learning (FL) is an emerging machine learning method that can be applied in mobile edge systems.
We show that running stochastic gradient descent (SGD) in such a setting can be viewed as adding a momentum-like term to the global aggregation process.
arXiv Detail & Related papers (2022-02-17T02:01:37Z)
- Do Gradient Inversion Attacks Make Federated Learning Unsafe? [70.0231254112197]
Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.
Recent works on the inversion of deep neural networks from model gradients raised concerns about the security of FL in preventing the leakage of training data.
In this work, we show that the attacks presented in the literature are impractical in real FL use cases and provide a new baseline attack.
arXiv Detail & Related papers (2022-02-14T18:33:12Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Quantifying Information Leakage from Gradients [8.175697239083474]
Sharing deep neural networks' gradients instead of training data could facilitate data privacy in collaborative learning.
In practice however, gradients can disclose both private latent attributes and original data.
Mathematical metrics are needed to quantify both original and latent information leakages from gradients computed over the training data.
arXiv Detail & Related papers (2021-05-28T15:47:44Z)
- Detecting discriminatory risk through data annotation based on Bayesian inferences [5.017973966200985]
We propose a method of data annotation that aims to warn about the risk of discriminatory results of a given data set.
We empirically test our system on three datasets commonly accessed by the machine learning community.
arXiv Detail & Related papers (2021-01-27T12:43:42Z)
- Layer-wise Characterization of Latent Information Leakage in Federated Learning [9.397152006395174]
Training deep neural networks via federated learning allows clients to share, instead of the original data, only the model trained on their data.
Prior work has demonstrated that in practice a client's private information, unrelated to the main learning task, can be discovered from the model's gradients.
There is still no formal approach for quantifying the leakage of private information via the shared updated model or gradients.
arXiv Detail & Related papers (2020-10-17T10:49:14Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
arXiv Detail & Related papers (2020-09-01T15:08:23Z)
- WAFFLe: Weight Anonymized Factorization for Federated Learning [88.44939168851721]
In domains where data are sensitive or private, there is great value in methods that can learn in a distributed manner without the data ever leaving the local devices.
We propose Weight Anonymized Factorization for Federated Learning (WAFFLe), an approach that combines the Indian Buffet Process with a shared dictionary of weight factors for neural networks.
arXiv Detail & Related papers (2020-08-13T04:26:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.