Reconstructing Training Data with Informed Adversaries
- URL: http://arxiv.org/abs/2201.04845v1
- Date: Thu, 13 Jan 2022 09:19:25 GMT
- Title: Reconstructing Training Data with Informed Adversaries
- Authors: Borja Balle, Giovanni Cherubin, Jamie Hayes
- Abstract summary: Given access to a machine learning model, can an adversary reconstruct the model's training data?
This work studies this question through the lens of a powerful informed adversary who knows all the training data points except one.
We show it is feasible to reconstruct the remaining data point in this stringent threat model.
- Score: 30.138217209991826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given access to a machine learning model, can an adversary reconstruct the
model's training data? This work studies this question through the lens of a
powerful informed adversary who knows all the training data points except one.
By instantiating concrete attacks, we show it is feasible to reconstruct the
remaining data point in this stringent threat model. For convex models (e.g.
logistic regression), reconstruction attacks are simple and can be derived in
closed-form. For more general models (e.g. neural networks), we propose an
attack strategy based on training a reconstructor network that receives as
input the weights of the model under attack and produces as output the target
data point. We demonstrate the effectiveness of our attack on image classifiers
trained on MNIST and CIFAR-10, and systematically investigate which factors of
standard machine learning pipelines affect reconstruction success. Finally, we
theoretically investigate what amount of differential privacy suffices to
mitigate reconstruction attacks by informed adversaries. Our work provides an
effective reconstruction attack that model developers can use to assess
memorization of individual points in general settings beyond those considered
in previous works (e.g. generative language models or access to training
gradients); it shows that standard models have the capacity to store enough
information to enable high-fidelity reconstruction of training data points; and
it demonstrates that differential privacy can successfully mitigate such
attacks in a parameter regime where utility degradation is minimal.
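The general attack described above trains a reconstructor network on "shadow" models: many copies of the target architecture are trained on the known dataset plus one candidate point, and a second network learns to map the resulting weight vectors back to that point. Below is a minimal, hedged PyTorch sketch of that idea for MNIST-sized inputs; the architectures, hyperparameters, and names (train_shadow_model, Reconstructor, train_reconstructor) are illustrative placeholders, not the authors' implementation.

```python
# Hedged sketch: shadow models are trained on the fixed known set D_minus plus one
# candidate target point, then a reconstructor network learns to map shadow-model
# weights back to that point. Sizes and architectures below are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_shadow_model(known_x, known_y, target_x, target_y, epochs=20):
    """Train a small classifier on D_minus plus one extra point; return its flattened weights."""
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10))
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x = torch.cat([known_x, target_x.unsqueeze(0)])
    y = torch.cat([known_y, target_y.unsqueeze(0)])
    for _ in range(epochs):  # full-batch training, enough for a toy sketch
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return torch.cat([p.detach().flatten() for p in model.parameters()])

class Reconstructor(nn.Module):
    """Maps the weight vector of an attacked model to a candidate 28x28 target image."""
    def __init__(self, weight_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(weight_dim, 512), nn.ReLU(),
            nn.Linear(512, 28 * 28), nn.Sigmoid(),
        )

    def forward(self, w):
        return self.net(w).view(-1, 1, 28, 28)

def train_reconstructor(known_x, known_y, candidate_x, candidate_y, epochs=50):
    """candidate_x/candidate_y: a pool of possible targets drawn from the same distribution."""
    weights = torch.stack([
        train_shadow_model(known_x, known_y, cx, cy)
        for cx, cy in zip(candidate_x, candidate_y)
    ])
    recon = Reconstructor(weights.shape[1])
    opt = torch.optim.Adam(recon.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        F.mse_loss(recon(weights), candidate_x).backward()  # pixel-space reconstruction loss
        opt.step()
    return recon  # finally, apply it to the weights of the actual model under attack
```

At attack time, the trained reconstructor is applied once to the flattened weights of the released model to obtain a candidate for the missing training point. For convex models such as regularized logistic regression, the abstract notes that no such network is needed: the attack can be derived in closed form from the model's optimality conditions.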
Related papers
- Reconstruction Attacks on Machine Unlearning: Simple Models are Vulnerable [30.22146634953896]
We show how to mount a near-perfect attack on the deleted data point from linear regression models.
Our work highlights that privacy risk is significant even for extremely simple model classes when individuals can request deletion of their data from the model.
arXiv Detail & Related papers (2024-05-30T17:27:44Z)
- Bounding Reconstruction Attack Success of Adversaries Without Data Priors [53.41619942066895]
Reconstruction attacks on machine learning (ML) models pose a strong risk of leakage of sensitive data.
In this work, we provide formal upper bounds on reconstruction success under realistic adversarial settings.
arXiv Detail & Related papers (2024-02-20T09:52:30Z)
- Membership Inference Attacks on Diffusion Models via Quantile Regression [30.30033625685376]
We demonstrate a privacy vulnerability of diffusion models through a membership inference (MI) attack.
Our proposed MI attack learns quantile regression models that predict (a quantile of) the distribution of reconstruction loss on examples not used in training.
We show that our attack outperforms the prior state-of-the-art attack while being substantially less computationally expensive.
arXiv Detail & Related papers (2023-12-08T16:21:24Z)
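The quantile-regression attack summarized in the entry above can be illustrated as follows: a regressor is fit with the pinball loss on examples known not to be in the training set, so that it predicts a low quantile (e.g. alpha = 0.05) of the non-member reconstruction loss, and an example whose observed loss falls below that predicted quantile is flagged as a member. This is a hedged sketch under those assumptions; computing the reconstruction losses for a diffusion model (e.g. a denoising error) is outside the snippet, and all names are placeholders.

```python
# Hedged sketch: fit a quantile regressor on examples known NOT to be training members,
# so it predicts the alpha-quantile of their reconstruction loss; an example whose
# observed loss falls below that prediction is flagged as a member. Names are placeholders.
import torch
import torch.nn as nn

def pinball_loss(pred, target, alpha):
    """Quantile (pinball) loss: minimised when pred equals the alpha-quantile of target."""
    diff = target - pred
    return torch.mean(torch.maximum(alpha * diff, (alpha - 1) * diff))

class QuantileRegressor(nn.Module):
    """Predicts the alpha-quantile of the non-member loss from the raw example."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

def fit_quantile_model(nonmember_x, nonmember_losses, alpha=0.05, epochs=200):
    """nonmember_losses: precomputed reconstruction losses of the non-member examples."""
    q = QuantileRegressor(nonmember_x[0].numel())
    opt = torch.optim.Adam(q.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        pinball_loss(q(nonmember_x), nonmember_losses, alpha).backward()
        opt.step()
    return q

def is_member(q, x, observed_loss):
    """Flag x as a training member if its loss is below the predicted non-member quantile."""
    with torch.no_grad():
        return observed_loss < q(x.unsqueeze(0)).item()
```

Because the regressor is calibrated only on non-members, flagging examples below the alpha-quantile keeps the false positive rate roughly at alpha by construction.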
- Boosting Model Inversion Attacks with Adversarial Examples [26.904051413441316]
We propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
First, we regularize the training process of the attack model with an added semantic loss function.
Second, we inject adversarial examples into the training data to increase the diversity of the class-related parts.
arXiv Detail & Related papers (2023-06-24T13:40:58Z)
- Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models [2.9735729003555345]
We propose a new targeted data reconstruction attack called the Mix And Match attack.
This work highlights the importance of considering the privacy risks associated with data reconstruction attacks in classification models.
arXiv Detail & Related papers (2023-06-23T21:25:38Z)
- Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation [110.61853418925219]
We build a stronger version of the dataset reconstruction attack and show how it can provably recover the entire training set in the infinite-width regime.
We show, both theoretically and empirically, that reconstructed images tend to be "outliers" in the dataset.
These reconstruction attacks can be used for dataset distillation: we can retrain on reconstructed images and obtain high predictive accuracy.
arXiv Detail & Related papers (2023-02-02T21:41:59Z)
- Reconstructing Training Data from Model Gradient, Provably [68.21082086264555]
We reconstruct the training samples from a single gradient query at a randomly chosen parameter value.
Because the attack is provable and reveals sensitive training data, our findings suggest potentially severe threats to privacy.
arXiv Detail & Related papers (2022-12-07T15:32:22Z)
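The paper above establishes a provable reconstruction from a single gradient query; as a point of comparison, the snippet below sketches the simpler, widely used gradient-matching heuristic for the same threat model: optimise a dummy example (and soft label) so that the gradient it induces at the queried parameters matches the observed gradient. This is not the paper's method; sizes and names are illustrative, and soft-label cross-entropy requires a recent PyTorch version.

```python
# Hedged sketch of a generic gradient-matching reconstruction (not the provable method
# of the paper above): optimise a dummy input and soft label so that the gradient they
# induce at the queried parameters matches the single observed gradient.
import torch
import torch.nn.functional as F

def reconstruct_from_gradient(model, observed_grads, num_classes=10, steps=50):
    """observed_grads: gradients of the loss on the unknown sample, in model.parameters() order."""
    params = list(model.parameters())
    dummy_x = torch.randn(1, 1, 28, 28, requires_grad=True)    # candidate input
    dummy_y = torch.randn(1, num_classes, requires_grad=True)  # candidate soft label
    opt = torch.optim.LBFGS([dummy_x, dummy_y], lr=0.1)

    def closure():
        opt.zero_grad()
        # Soft-label cross-entropy (probability targets need PyTorch >= 1.10).
        loss = F.cross_entropy(model(dummy_x), dummy_y.softmax(dim=-1))
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # Squared distance between the dummy gradient and the observed gradient.
        match = sum(((g - og) ** 2).sum() for g, og in zip(grads, observed_grads))
        match.backward()
        return match

    for _ in range(steps):
        opt.step(closure)
    return dummy_x.detach(), dummy_y.softmax(dim=-1).detach()
```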
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose substitute training from a novel perspective, focusing on designing the distribution of data used in the knowledge-stealing process.
Combining these two modules further boosts the consistency between the substitute model and the target model, which greatly improves the effectiveness of the adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
- Exploring the Security Boundary of Data Reconstruction via Neuron Exclusivity Analysis [23.07323180340961]
We study the security boundary of data reconstruction from gradients via a microcosmic view of neural networks with rectified linear units (ReLUs).
We construct a novel deterministic attack algorithm which substantially outperforms previous attacks for reconstructing training batches lying in the insecure boundary of a neural network.
arXiv Detail & Related papers (2020-10-26T05:54:47Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.