Related papers: Analysing Training-Data Leakage from Gradients through Linear Systems and Gradient Matching

URL: http://arxiv.org/abs/2210.13231v1
Date: Thu, 20 Oct 2022 08:53:20 GMT
Title: Analysing Training-Data Leakage from Gradients through Linear Systems and Gradient Matching
Authors: Cangxiong Chen, Neill D. F. Campbell
Abstract summary: We propose a novel framework to analyse training-data leakage from gradients. We draw insights from both analytic and optimisation-based gradient-leakage attacks. We also propose a metric to measure the level of security of a deep learning model against gradient-based attacks.
Score: 8.071506311915396
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent works have demonstrated that it is possible to reconstruct training images and their labels from gradients of an image-classification model when its architecture is known. Unfortunately, there is still an incomplete theoretical understanding of the efficacy and failure of these gradient-leakage attacks. In this paper, we propose a novel framework to analyse training-data leakage from gradients that draws insights from both analytic and optimisation-based gradient-leakage attacks. We formulate the reconstruction problem as solving a linear system from each layer iteratively, accompanied by corrections using gradient matching. Under this framework, we claim that the solubility of the reconstruction problem is primarily determined by that of the linear system at each layer. As a result, we are able to partially attribute the leakage of the training data in a deep network to its architecture. We also propose a metric to measure the level of security of a deep learning model against gradient-based attacks on the training data.
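The per-layer linear system mentioned in the abstract can be illustrated on the simplest well-known case: a single fully connected layer trained on one sample with cross-entropy loss. The sketch below is not the paper's framework. It assumes a batch size of one and a layer of the form z = Wx + b, and the helper name `recover_input_from_linear_layer` is hypothetical. Under those assumptions the weight gradient is the outer product of the bias gradient and the input, so the input follows from a small least-squares solve.

```python
import numpy as np


def recover_input_from_linear_layer(grad_W, grad_b, eps=1e-12):
    """Least-squares solution of grad_W = outer(grad_b, x) for x.

    Hypothetical helper for illustration; assumes a single training sample.
    """
    denom = grad_b @ grad_b
    if denom < eps:
        # A vanishing bias gradient leaves the linear system without a unique solution.
        raise ValueError("bias gradient is numerically zero; input cannot be recovered")
    # Each row satisfies grad_W[i, :] = grad_b[i] * x, hence x = grad_W^T grad_b / |grad_b|^2.
    return grad_W.T @ grad_b / denom


# Toy setup (all values are made up for the demonstration).
rng = np.random.default_rng(0)
x = rng.normal(size=5)          # private input the attacker wants to recover
W = rng.normal(size=(3, 5))     # layer weights, known to the attacker
b = rng.normal(size=3)          # layer bias, known to the attacker
label = 1                       # private label

# Forward pass and analytic gradients of the cross-entropy loss.
z = W @ x + b
p = np.exp(z - z.max())
p /= p.sum()
grad_z = p.copy()
grad_z[label] -= 1.0            # dL/dz = softmax(z) - one_hot(label)
grad_W = np.outer(grad_z, x)    # dL/dW
grad_b = grad_z                 # dL/db

# Attack: only the gradients are used from here on.
x_rec = recover_input_from_linear_layer(grad_W, grad_b)
recovered_label = int(np.argmin(grad_b))  # the only negative entry marks the true label

print(np.allclose(x_rec, x), recovered_label == label)
```

With larger batches or with convolutional and nonlinear layers the system is no longer this simple, which is where the solubility analysis and the gradient-matching corrections described in the abstract come in.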

Related papers

Gradient Inversion Transcript: Leveraging Robust Generative Priors to Reconstruct Training Data from Gradient Leakage [3.012404329139943]
Gradient Inversion Transcript (GIT) is a novel generative approach for reconstructing training data from leaked gradients. GIT consistently outperforms existing methods across multiple datasets.
arXiv Detail & Related papers (2025-05-26T14:17:00Z)
R-CONV: An Analytical Approach for Efficient Data Reconstruction via Convolutional Gradients [40.209183669098735]
This paper introduces an advanced data leakage method to efficiently exploit convolutional layers' gradients. To the best of our knowledge, this is the first analytical approach that successfully reconstructs convolutional layer inputs directly from the gradients.
arXiv Detail & Related papers (2024-06-06T16:28:04Z)
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation [110.61853418925219]
We build a stronger version of the dataset reconstruction attack and show how it can provably recover the entire training set in the infinite width regime. We show that both theoretically and empirically, reconstructed images tend to be "outliers" in the dataset. These reconstruction attacks can be used for dataset distillation, that is, we can retrain on reconstructed images and obtain high predictive accuracy.
arXiv Detail & Related papers (2023-02-02T21:41:59Z)
Reconstructing Training Data from Model Gradient, Provably [68.21082086264555]
We reconstruct the training samples from a single gradient query at a randomly chosen parameter value. As a provable attack that reveals sensitive training data, our findings suggest potential severe threats to privacy.
arXiv Detail & Related papers (2022-12-07T15:32:22Z)
Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation [151.70234052015948]
We propose a novel approach that encourages the optimization algorithm to seek a flat trajectory. We show that, with the regularization towards the flat trajectory, the weights trained on synthetic data are robust against perturbations from accumulated errors. Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7%.
arXiv Detail & Related papers (2022-11-20T15:49:11Z)
Regression as Classification: Influence of Task Formulation on Neural Network Features [16.239708754973865]
Neural networks can be trained to solve regression problems by using gradient-based methods to minimize the square loss. Practitioners often prefer to reformulate regression as a classification problem, observing that training on the cross-entropy loss results in better performance. By focusing on two-layer ReLU networks, we explore how the implicit bias induced by gradient-based optimization could partly explain the phenomenon.
arXiv Detail & Related papers (2022-11-10T15:13:23Z)
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations. For gradient flow, we leverage recent work on the implicit bias for homogeneous neural networks to show that asymptotically, gradient flow produces a neural network with rank at most two. For gradient descent, provided the random initialization variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
arXiv Detail & Related papers (2022-10-13T15:09:54Z)
RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation [110.4255414234771]
Existing solutions require massive training data or lack generalizability to unknown rendering configurations. We propose a novel approach that marries domain randomization and differentiable rendering gradients to address this problem. Our approach achieves significantly lower reconstruction errors and has better generalizability among unknown rendering configurations.
arXiv Detail & Related papers (2022-05-11T17:59:51Z)
Unsupervised Restoration of Weather-affected Images using Deep Gaussian Process-based CycleGAN [92.15895515035795]
We describe an approach for supervising deep networks that are based on CycleGAN. We introduce new losses for training CycleGAN that lead to more effective training, resulting in high-quality reconstructions. We demonstrate that the proposed method can be effectively applied to different restoration tasks like de-raining, de-hazing and de-snowing.
arXiv Detail & Related papers (2022-04-23T01:30:47Z)
Understanding Training-Data Leakage from Gradients in Neural Networks for Image Classification [11.272188531829016]
In many applications, we need to protect the training data from being leaked due to IP or privacy concerns. Recent works have demonstrated that it is possible to reconstruct the training data from gradients for an image-classification model when its architecture is known. We formulate the problem of training data reconstruction as solving an optimisation problem iteratively for each layer. We are able to attribute the potential leakage of the training data in a deep network to its architecture.
arXiv Detail & Related papers (2021-11-19T12:14:43Z)
Revealing and Protecting Labels in Distributed Training [3.18475216176047]
We propose a method to discover the set of labels of training samples from only the gradient of the last layer and the ID-to-label mapping. We demonstrate the effectiveness of our method for model training in two domains: image classification and automatic speech recognition.
arXiv Detail & Related papers (2021-10-31T17:57:49Z)
A block coordinate descent optimizer for classification problems exploiting convexity [0.0]
We introduce a coordinate descent method to deep linear networks for classification tasks that exploits convexity of the cross-entropy loss in the weights of the hidden layer. By alternating between a second-order method to find globally optimal parameters for the linear layer and gradient descent to the hidden layers, we ensure an optimal fit of the adaptive basis to data throughout training.
arXiv Detail & Related papers (2020-06-17T19:49:06Z)

This list is automatically generated from the titles and abstracts of the papers on this site.