Enhanced Membership Inference Attacks against Machine Learning Models
- URL: http://arxiv.org/abs/2111.09679v1
- Date: Thu, 18 Nov 2021 13:31:22 GMT
- Title: Enhanced Membership Inference Attacks against Machine Learning Models
- Authors: Jiayuan Ye, Aadyaa Maddi, Sasi Kumar Murakonda, Reza Shokri
- Abstract summary: Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
- Score: 9.26208227402571
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: How much does a given trained model leak about each individual data record in
its training set? Membership inference attacks are used as an auditing tool to
quantify the private information that a model leaks about the individual data
points in its training set. Membership inference attacks are influenced by
different uncertainties that an attacker has to resolve about training data,
the training algorithm, and the underlying data distribution. Thus the attack
success rates of many attacks in the literature do not precisely capture the
information leakage of models about their data, as they also reflect other
uncertainties that the attack algorithm has. In this paper, we explain the
implicit assumptions and also the simplifications made in prior work using the
framework of hypothesis testing. We also derive new attack algorithms from the
framework that can achieve a high AUC score while also highlighting the
different factors that affect their performance. Our algorithms capture a very
precise approximation of privacy loss in models, and can be used as a tool to
perform an accurate and informed estimation of privacy risk in machine learning
models. We provide a thorough empirical evaluation of our attack strategies on
various machine learning tasks and benchmark datasets.
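As a rough illustration of the hypothesis-testing framing described in the abstract, the sketch below scores each record by comparing its loss under the target model against a reference distribution of losses from models trained without that record, and reports the resulting attack AUC. The synthetic data, logistic-regression models, and Gaussian reference fit are illustrative assumptions, not the paper's specific algorithms.

```python
# Minimal sketch: membership inference as a per-record hypothesis test.
# All modelling choices here are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
D = 50  # many features, few records, so some overfitting (leakage) occurs

def make_data(n):
    X = rng.normal(size=(n, D))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=n) > 0).astype(int)
    return X, y

def per_record_loss(model, X, y):
    # Cross-entropy loss of each record under the given model.
    p = np.clip(model.predict_proba(X)[np.arange(len(y)), y], 1e-12, 1.0)
    return -np.log(p)

# Target model trained on the "members"; the "non-members" are held out.
X_mem, y_mem = make_data(100)
X_non, y_non = make_data(100)
X_query = np.concatenate([X_mem, X_non])
y_query = np.concatenate([y_mem, y_non])
target = LogisticRegression(max_iter=2000).fit(X_mem, y_mem)

# Reference models trained on fresh data from the same distribution approximate
# the null hypothesis: "the target model was trained WITHOUT this record".
ref_losses = []
for _ in range(16):
    X_r, y_r = make_data(100)
    ref = LogisticRegression(max_iter=2000).fit(X_r, y_r)
    ref_losses.append(per_record_loss(ref, X_query, y_query))
ref_losses = np.stack(ref_losses)                 # shape: (n_refs, n_queries)

# Per-record test statistic: how far below the reference loss distribution the
# target-model loss falls (larger score => more likely a member).
mu, sigma = ref_losses.mean(axis=0), ref_losses.std(axis=0) + 1e-6
score = (mu - per_record_loss(target, X_query, y_query)) / sigma

membership = np.concatenate([np.ones(100), np.zeros(100)])
print("attack AUC:", roc_auc_score(membership, score))
```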
Related papers
- Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models that protect against MI attacks and examine the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- Boosting Model Inversion Attacks with Adversarial Examples [26.904051413441316]
We propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting.
First, we regularize the training process of the attack model with an added semantic loss function.
Second, we inject adversarial examples into the training data to increase the diversity of the class-related parts.
arXiv Detail & Related papers (2023-06-24T13:40:58Z)
- Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers [71.70205894168039]
We consider instance-wise unlearning, whose goal is to delete information on a set of instances from a pre-trained model.
We propose two methods that reduce forgetting on the remaining data: 1) utilizing adversarial examples to overcome forgetting at the representation-level and 2) leveraging weight importance metrics to pinpoint network parameters guilty of propagating unwanted information.
arXiv Detail & Related papers (2023-01-27T07:53:50Z)
- Membership Inference Attacks by Exploiting Loss Trajectory [19.900473800648243]
We propose a new attack method that can exploit the membership information from the whole training process of the target model (a toy sketch of this loss-trajectory idea appears after the list below).
Our attack achieves an at least 6× higher true-positive rate than existing methods at a low false-positive rate of 0.1%.
arXiv Detail & Related papers (2022-08-31T16:02:26Z)
- Truth Serum: Poisoning Machine Learning Models to Reveal Their Secrets [53.866927712193416]
We show that an adversary who can poison a training dataset can cause models trained on this dataset to leak private details belonging to other parties.
Our attacks are effective across membership inference, attribute inference, and data extraction.
Our results cast doubt on the relevance of cryptographic privacy guarantees in multiparty protocols for machine learning.
arXiv Detail & Related papers (2022-03-31T18:06:28Z)
- Leveraging Adversarial Examples to Quantify Membership Information Leakage [30.55736840515317]
We develop a novel approach to address the problem of membership inference in pattern recognition models.
We argue that the size of the adversarial perturbation needed to change the model's prediction on a sample reflects the likelihood of that sample belonging to the training data.
Our method performs comparably to, or even outperforms, state-of-the-art strategies.
arXiv Detail & Related papers (2022-03-17T19:09:38Z)
- Formalizing and Estimating Distribution Inference Risks [11.650381752104298]
We propose a formal and general definition of property inference attacks.
Our results show that inexpensive attacks are as effective as expensive meta-classifier attacks.
We extend the state-of-the-art property inference attack to work on convolutional neural networks.
arXiv Detail & Related papers (2021-09-13T14:54:39Z)
- Property Inference Attacks on Convolutional Neural Networks: Influence and Implications of Target Model's Complexity [1.2891210250935143]
Property inference attacks aim to infer, from a given model, properties of the training dataset that are seemingly unrelated to the model's primary goal.
This paper investigates the influence of the target model's complexity on the accuracy of this type of attack.
Our findings reveal that the risk of a privacy breach is present independently of the target model's complexity.
arXiv Detail & Related papers (2021-04-27T09:19:36Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on modular, reusable software, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the scores of the victim model.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of its performance.
As a defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model, as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
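A toy sketch of a label-only "sampling"-style membership attack in the spirit of the last entry above: the adversary sees only hard labels, queries the victim on many noisy copies of a record, and uses label stability as the membership score. The victim model, data, noise scale, and query budget are assumptions made for illustration, not the paper's exact setup.

```python
# Label-only membership inference via repeated noisy queries (illustrative).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)

def make_data(n, d=20):
    X = rng.normal(size=(n, d))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

# Small, deliberately overfittable victim model (illustrative only).
X_mem, y_mem = make_data(200)
X_non, y_non = make_data(200)
victim = MLPClassifier(hidden_layer_sizes=(128,), max_iter=3000).fit(X_mem, y_mem)

def sampling_score(x, y, n_queries=64, noise=0.4):
    # Fraction of perturbed queries whose predicted label matches y; note that
    # only victim.predict (hard labels, never scores) is used.
    perturbed = x + rng.normal(scale=noise, size=(n_queries, x.shape[0]))
    return np.mean(victim.predict(perturbed) == y)

X_query = np.concatenate([X_mem, X_non])
y_query = np.concatenate([y_mem, y_non])
scores = np.array([sampling_score(x, y) for x, y in zip(X_query, y_query)])
membership = np.concatenate([np.ones(len(X_mem)), np.zeros(len(X_non))])
print("label-only attack AUC:", roc_auc_score(membership, scores))
```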
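For the "Membership Inference Attacks by Exploiting Loss Trajectory" entry above, here is a rough sketch of the underlying idea under simplifying assumptions: per-record losses recorded across training epochs serve as features for an attack classifier fitted in a shadow setting, then applied to the target. The sketch assumes the attacker can observe intermediate training losses directly; the paper's actual threat model and way of obtaining intermediate models may differ.

```python
# Loss-trajectory membership attack sketch (simplifying assumptions apply).
import numpy as np
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)

def make_data(n, d=30):
    X = rng.normal(size=(n, d))
    y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)
    return X, y

def loss_trajectory(X_train, y_train, X_query, y_query, epochs=10):
    """Train incrementally and record every query record's log-loss per epoch."""
    model = SGDClassifier(loss="log_loss", random_state=0)
    traj = []
    for _ in range(epochs):
        model.partial_fit(X_train, y_train, classes=np.array([0, 1]))
        p = np.clip(model.predict_proba(X_query)[np.arange(len(y_query)), y_query],
                    1e-12, 1.0)
        traj.append(-np.log(p))
    return np.stack(traj, axis=1)             # shape: (n_queries, epochs)

def world(n=200):
    # One "world": members used for training, non-members held out.
    X_mem, y_mem = make_data(n)
    X_non, y_non = make_data(n)
    X_q = np.concatenate([X_mem, X_non])
    y_q = np.concatenate([y_mem, y_non])
    labels = np.concatenate([np.ones(n), np.zeros(n)])
    return loss_trajectory(X_mem, y_mem, X_q, y_q), labels

# Shadow world (membership known) trains the attack; target world evaluates it.
shadow_traj, shadow_labels = world()
attack = LogisticRegression(max_iter=2000).fit(shadow_traj, shadow_labels)
target_traj, target_labels = world()
scores = attack.predict_proba(target_traj)[:, 1]
print("loss-trajectory attack AUC:", roc_auc_score(target_labels, scores))
```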
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.