Investigating Membership Inference Attacks under Data Dependencies
- URL: http://arxiv.org/abs/2010.12112v4
- Date: Wed, 14 Jun 2023 14:01:11 GMT
- Title: Investigating Membership Inference Attacks under Data Dependencies
- Authors: Thomas Humphries, Simon Oya, Lindsey Tulloch, Matthew Rafuse, Ian
Goldberg, Urs Hengartner, Florian Kerschbaum
- Abstract summary: Training machine learning models on privacy-sensitive data has opened the door to new attacks that can have serious privacy implications.
One such attack, the Membership Inference Attack (MIA), exposes whether or not a particular data point was used to train a model.
Prior works evaluate Differentially Private (DP) training as a defence against MIAs under the restrictive assumption that all members of the training set, as well as non-members, are independent and identically distributed; we evaluate membership inference when this assumption does not hold.
- Score: 26.70764798408236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training machine learning models on privacy-sensitive data has become a
popular practice, driving innovation in ever-expanding fields. This has opened
the door to new attacks that can have serious privacy implications. One such
attack, the Membership Inference Attack (MIA), exposes whether or not a
particular data point was used to train a model. A growing body of literature
uses Differentially Private (DP) training algorithms as a defence against such
attacks. However, these works evaluate the defence under the restrictive
assumption that all members of the training set, as well as non-members, are
independent and identically distributed. This assumption does not hold for many
real-world use cases in the literature. Motivated by this, we evaluate
membership inference with statistical dependencies among samples and explain
why DP does not provide meaningful protection (the privacy parameter $\epsilon$
scales with the training set size $n$) in this more general case. We conduct a
series of empirical evaluations with off-the-shelf MIAs using training sets
built from real-world data showing different types of dependencies among
samples. Our results reveal that training set dependencies can severely
increase the performance of MIAs, and therefore assuming that data samples are
statistically independent can significantly underestimate the performance of
MIAs.
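To make the scaling claim concrete, the minimal sketch below combines the standard group-privacy property of pure $\epsilon$-DP with a generic bound on membership advantage (advantage at most $e^{\epsilon} - 1$): if statistical dependencies effectively tie all $n$ training records together, the per-record guarantee degrades to $n\epsilon$ and the bound becomes vacuous. The constants, function names, and the particular advantage bound are illustrative assumptions for this sketch, not the analysis or experiments from the paper.

```python
import math

# Illustrative values only; not taken from the paper.
EPSILON = 0.1   # per-record guarantee reported by a DP training algorithm
N = 50_000      # training-set size

def group_epsilon(epsilon: float, group_size: int) -> float:
    """Group-privacy property of pure epsilon-DP: protecting a set of
    `group_size` mutually dependent records as a unit costs group_size * epsilon."""
    return group_size * epsilon

def membership_advantage_bound(epsilon: float) -> float:
    """Generic upper bound on a membership adversary's advantage under
    epsilon-DP: advantage <= e^epsilon - 1, capped at 1 because no
    adversary can do better than perfect membership inference."""
    if epsilon >= math.log(2.0):        # beyond this point the bound exceeds 1
        return 1.0
    return math.exp(epsilon) - 1.0

# i.i.d. view: the per-record epsilon gives a small advantage bound (~0.105).
print(membership_advantage_bound(EPSILON))

# Dependent view: if the dependency spans all N records, the effective
# guarantee is N * epsilon and the bound is vacuous (1.0).
print(membership_advantage_bound(group_epsilon(EPSILON, N)))
```

This mirrors the abstract's argument only at a high level; the paper's empirical results concern off-the-shelf MIAs on real data with dependencies, not this worst-case bound.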
Related papers
- Range Membership Inference Attacks [17.28638946021444]
We introduce the class of range membership inference attacks (RaMIAs), which test whether the model was trained on any data in a specified range.
We show that RaMIAs can capture privacy loss more accurately and comprehensively than MIAs on various types of data.
arXiv Detail & Related papers (2024-08-09T15:39:06Z)
- Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
arXiv Detail & Related papers (2024-02-12T17:52:05Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Lessons Learned: Defending Against Property Inference Attacks [0.0]
This work investigates and evaluates multiple defense strategies against property inference attacks (PIAs).
PIAs aim to extract statistical properties of a model's underlying training data, e.g., to reveal the ratio of men to women in a medical training data set.
Experiments show that property unlearning does not generalize, i.e., it cannot protect against a whole class of PIAs.
arXiv Detail & Related papers (2022-05-18T09:38:37Z)
- Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
arXiv Detail & Related papers (2021-11-18T13:31:22Z)
- Do Not Trust Prediction Scores for Membership Inference Attacks [15.567057178736402]
Membership inference attacks (MIAs) aim to determine whether a specific sample was used to train a predictive model.
We argue that relying on prediction scores is a fallacy for many modern deep network architectures.
We are able to produce a potentially infinite number of samples falsely classified as part of the training data.
arXiv Detail & Related papers (2021-11-17T12:39:04Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and the adversary can recover up to 100% of the attack's original performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)