Quantifying Information Leakage from Gradients
- URL: http://arxiv.org/abs/2105.13929v1
- Date: Fri, 28 May 2021 15:47:44 GMT
- Title: Quantifying Information Leakage from Gradients
- Authors: Fan Mo, Anastasia Borovykh, Mohammad Malekzadeh, Hamed Haddadi,
Soteris Demetriou
- Abstract summary: Sharing deep neural networks' gradients instead of training data could facilitate data privacy in collaborative learning.
In practice however, gradients can disclose both private latent attributes and original data.
Mathematical metrics are needed to quantify both original and latent information leakages from gradients computed over the training data.
- Score: 8.175697239083474
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sharing deep neural networks' gradients instead of training data could
facilitate data privacy in collaborative learning. In practice however,
gradients can disclose both private latent attributes and original data.
Mathematical metrics are needed to quantify both original and latent
information leakages from gradients computed over the training data. In this
work, we first use an adaptation of the empirical $\mathcal{V}$-information to
present an information-theoretic justification for the attack success rates in
a layer-wise manner. We then move towards a deeper understanding of gradient
leakages and propose more general and efficient metrics, using sensitivity and
subspace distance to quantify the gradient changes w.r.t. original and latent
information, respectively. Our empirical results, on six datasets and four
models, reveal that gradients of the first layers contain the highest amount of
original information, while the classifier/fully-connected layers placed after
the feature extractor contain the highest latent information. Further, we show
how training hyperparameters such as gradient aggregation can decrease
information leakages. Our characterization provides a new understanding on
gradient-based information leakages using the gradients' sensitivity w.r.t.
changes in private information, and portends possible defenses such as
layer-based protection or strong aggregation.
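As an illustration of the layer-wise quantification described in the abstract, the following is a minimal sketch of an empirical $\mathcal{V}$-information-style probe: per-example gradients are collected for each layer, a linear probe is trained to predict a private attribute from them, and the drop in cross-entropy relative to the marginal entropy of the attribute is reported. This is a simplified illustration (linear predictive family, no train/test split, so the estimate is optimistic), not the authors' implementation; `model`, `x_batch`, `y_batch`, and the attribute vector `s_batch` are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def per_layer_gradients(model, loss_fn, x, y):
    """Return {param_name: (n_examples, n_params) matrix of flattened per-example gradients}."""
    names = [n for n, p in model.named_parameters() if p.requires_grad]
    grads = {n: [] for n in names}
    for xi, yi in zip(x, y):
        model.zero_grad()
        loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
        for n, p in model.named_parameters():
            if n in grads:
                grads[n].append(p.grad.detach().flatten().clone())
    return {n: torch.stack(g) for n, g in grads.items()}

def empirical_v_information(G, s, epochs=200, lr=1e-2):
    """H_V(s) - H_V(s|G) with V = linear probes; a crude plug-in estimate in nats."""
    n_classes = int(s.max()) + 1
    # H_V(s): best predictor that ignores the gradients = entropy of the empirical marginal.
    p = torch.bincount(s, minlength=n_classes).float() / len(s)
    h_marginal = -(p[p > 0] * p[p > 0].log()).sum()
    # H_V(s | G): cross-entropy of a linear probe trained on the per-example gradients.
    probe = torch.nn.Linear(G.shape[1], n_classes)
    opt = torch.optim.Adam(probe.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(probe(G), s).backward()
        opt.step()
    h_conditional = F.cross_entropy(probe(G), s).detach()
    return (h_marginal - h_conditional).item()

# Hypothetical usage: rank layers by how much information about the private
# attribute `s_batch` their per-example gradients expose.
# grads = per_layer_gradients(model, F.cross_entropy, x_batch, y_batch)
# for name, G in grads.items():
#     print(name, empirical_v_information(G, s_batch.long()))
```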
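The sensitivity and subspace-distance metrics can be sketched in a similar spirit. Below, sensitivity is proxied by the norm of the change in a layer's gradient under a small random perturbation of a continuous input, and subspace distance by the projection-metric distance between the rank-k principal subspaces of two per-layer gradient matrices (e.g. gradients computed under two values of a latent attribute). These are plausible instantiations for illustration only, not the paper's exact definitions.

```python
import torch

def batch_layer_gradients(model, loss_fn, x, y):
    """Flattened gradient of the batch loss for each trainable parameter tensor."""
    model.zero_grad()
    loss_fn(model(x), y).backward()
    return {n: p.grad.detach().flatten().clone()
            for n, p in model.named_parameters() if p.requires_grad}

def gradient_sensitivity(model, loss_fn, x, y, eps=1e-3):
    """||g(x + delta) - g(x)|| / ||delta|| per layer, for one random perturbation
    of a continuous (float) input batch x."""
    g0 = batch_layer_gradients(model, loss_fn, x, y)
    delta = eps * torch.randn_like(x)
    g1 = batch_layer_gradients(model, loss_fn, x + delta, y)
    return {n: ((g1[n] - g0[n]).norm() / delta.norm()).item() for n in g0}

def subspace_distance(G_a, G_b, k=8):
    """Projection-metric distance between the rank-k principal subspaces of two
    per-layer gradient matrices (rows = per-example flattened gradients);
    k must not exceed the number of rows of either matrix."""
    Ua = torch.linalg.svd(G_a.T, full_matrices=False).U[:, :k]
    Ub = torch.linalg.svd(G_b.T, full_matrices=False).U[:, :k]
    cos = torch.linalg.svdvals(Ua.T @ Ub).clamp(max=1.0)  # cosines of principal angles
    return (k - (cos ** 2).sum()).sqrt().item()

# Hypothetical usage: build G_a from examples with latent attribute s = 0 and
# G_b from examples with s = 1 (e.g. via the per-example gradients sketched
# above). A larger subspace distance suggests that layer's gradients shift more
# with the latent attribute, while gradient_sensitivity tracks changes w.r.t.
# the original input.
```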
Related papers
- R-CONV: An Analytical Approach for Efficient Data Reconstruction via Convolutional Gradients [40.209183669098735]
This paper introduces an advanced data leakage method to efficiently exploit convolutional layers' gradients.
To the best of our knowledge, this is the first analytical approach that successfully reconstructs convolutional layer inputs directly from the gradients.
arXiv Detail & Related papers (2024-06-06T16:28:04Z) - How to guess a gradient [68.98681202222664]
We show that gradients are more structured than previously thought.
Exploiting this structure can significantly improve gradient-free optimization schemes.
We highlight new challenges in overcoming the large gap between optimizing with exact gradients and guessing the gradients.
arXiv Detail & Related papers (2023-12-07T21:40:44Z) - Reconstructing Training Data from Model Gradient, Provably [68.21082086264555]
We reconstruct the training samples from a single gradient query at a randomly chosen parameter value.
Since this is a provable attack that reveals sensitive training data, the findings suggest potentially severe threats to privacy (see the gradient-matching sketch after this list).
arXiv Detail & Related papers (2022-12-07T15:32:22Z) - Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations.
For gradient flow, we leverage recent work on the implicit bias for homogeneous neural networks to show that asymptotically, gradient flow produces a neural network with rank at most two.
For gradient descent, provided the random initialization variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
arXiv Detail & Related papers (2022-10-13T15:09:54Z) - The Manifold Hypothesis for Gradient-Based Explanations [55.01671263121624]
Gradient-based explanation algorithms provide perceptually-aligned explanations.
We show that the more a feature attribution is aligned with the tangent space of the data, the more perceptually-aligned it tends to be.
We suggest that explanation algorithms should actively strive to align their explanations with the data manifold.
arXiv Detail & Related papers (2022-06-15T08:49:24Z) - Auditing Privacy Defenses in Federated Learning via Generative Gradient
Leakage [9.83989883339971]
The Federated Learning (FL) framework brings privacy benefits to distributed learning systems.
Recent studies have revealed that private information can still be leaked through shared information.
We propose a new type of leakage, i.e., Generative Gradient Leakage (GGL).
arXiv Detail & Related papers (2022-03-29T15:59:59Z) - Weakly Supervised Change Detection Using Guided Anisotropic Diffusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z) - Understanding Training-Data Leakage from Gradients in Neural Networks
for Image Classification [11.272188531829016]
In many applications, we need to protect the training data from being leaked due to IP or privacy concerns.
Recent works have demonstrated that it is possible to reconstruct the training data from gradients for an image-classification model when its architecture is known.
We formulate the problem of training data reconstruction as solving an optimisation problem iteratively for each layer.
We are able to attribute the potential leakage of the training data in a deep network to its architecture.
arXiv Detail & Related papers (2021-11-19T12:14:43Z) - Large Scale Private Learning via Low-rank Reparametrization [77.38947817228656]
We propose a reparametrization scheme to address the challenges of applying differentially private SGD on large neural networks.
We are the first to apply differential privacy to the BERT model, achieving an average accuracy of $83.9\%$ on four downstream tasks.
arXiv Detail & Related papers (2021-06-17T10:14:43Z) - A Quantitative Metric for Privacy Leakage in Federated Learning [22.968763654455298]
We propose a quantitative metric based on mutual information for clients to evaluate the potential risk of information leakage in their gradients.
It is proven that the risk of information leakage is related to the status of the task model, as well as the inherent data distribution.
arXiv Detail & Related papers (2021-02-24T02:48:35Z) - Layer-wise Characterization of Latent Information Leakage in Federated
Learning [9.397152006395174]
Training deep neural networks via federated learning allows clients to share, instead of the original data, only the model trained on their data.
Prior work has demonstrated that in practice a client's private information, unrelated to the main learning task, can be discovered from the model's gradients.
There is still no formal approach for quantifying the leakage of private information via the shared updated model or gradients.
arXiv Detail & Related papers (2020-10-17T10:49:14Z)
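Several of the papers above (e.g. R-CONV, "Reconstructing Training Data from Model Gradient, Provably", "Understanding Training-Data Leakage from Gradients in Neural Networks for Image Classification", and Generative Gradient Leakage) recover inputs from shared gradients. The sketch below shows a generic optimisation-based gradient-matching attack of the kind these works build on, assuming the attacker knows the model, the label, and the shared gradients; it is not a re-implementation of any specific paper, and all names are placeholders.

```python
import torch
import torch.nn.functional as F

def reconstruct_from_gradients(model, target_grads, x_shape, y_true, steps=300, lr=0.1):
    """Optimise a dummy input so that its gradients match the observed gradients."""
    x_hat = torch.randn(x_shape, requires_grad=True)  # dummy input to be optimised
    opt = torch.optim.Adam([x_hat], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(x_hat), y_true)  # label assumed known to the attacker
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # L2 distance between the dummy gradients and the observed (shared) gradients.
        match = sum(((g - t) ** 2).sum() for g, t in zip(grads, target_grads))
        match.backward()
        opt.step()
    return x_hat.detach()

# Hypothetical usage: the server observes target_grads shared by a client for a
# private batch and returns an estimate x_hat of that batch.
# target_grads = torch.autograd.grad(
#     F.cross_entropy(model(x_private), y_private),
#     [p for p in model.parameters() if p.requires_grad])
# x_hat = reconstruct_from_gradients(model, target_grads, x_private.shape, y_private)
```

The listed papers add analytical solutions, generative priors, or layer-wise formulations on top of this basic matching objective; stronger gradient aggregation across many samples, as noted in the main abstract, makes the matching problem harder.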