Investigating Membership Inference Attacks under Data Dependencies
- URL: http://arxiv.org/abs/2010.12112v4
- Date: Wed, 14 Jun 2023 14:01:11 GMT
- Title: Investigating Membership Inference Attacks under Data Dependencies
- Authors: Thomas Humphries, Simon Oya, Lindsey Tulloch, Matthew Rafuse, Ian
Goldberg, Urs Hengartner, Florian Kerschbaum
- Abstract summary: Training machine learning models on privacy-sensitive data has opened the door to new attacks that can have serious privacy implications.
One such attack, the Membership Inference Attack (MIA), exposes whether or not a particular data point was used to train a model.
Prior works evaluate Differentially Private (DP) training as a defence against MIAs under the restrictive assumption that all members of the training set, as well as non-members, are independent and identically distributed; we evaluate membership inference when this assumption does not hold.
- Score: 26.70764798408236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training machine learning models on privacy-sensitive data has become a
popular practice, driving innovation in ever-expanding fields. This has opened
the door to new attacks that can have serious privacy implications. One such
attack, the Membership Inference Attack (MIA), exposes whether or not a
particular data point was used to train a model. A growing body of literature
uses Differentially Private (DP) training algorithms as a defence against such
attacks. However, these works evaluate the defence under the restrictive
assumption that all members of the training set, as well as non-members, are
independent and identically distributed. This assumption does not hold for many
real-world use cases in the literature. Motivated by this, we evaluate
membership inference with statistical dependencies among samples and explain
why DP does not provide meaningful protection (the privacy parameter $\epsilon$
scales with the training set size $n$) in this more general case. We conduct a
series of empirical evaluations with off-the-shelf MIAs using training sets
built from real-world data showing different types of dependencies among
samples. Our results reveal that training set dependencies can severely
increase the performance of MIAs, and therefore assuming that data samples are
statistically independent can significantly underestimate the performance of
MIAs.
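To make the scaling claim concrete, the minimal sketch below combines the standard group-privacy property of pure $\epsilon$-DP with a generic bound on membership advantage (advantage at most $e^{\epsilon} - 1$): if statistical dependencies effectively tie all $n$ training records together, the per-record guarantee degrades to $n\epsilon$ and the bound becomes vacuous. The constants, function names, and the particular advantage bound are illustrative assumptions for this sketch, not the analysis or experiments from the paper.

```python
import math

# Illustrative values only; not taken from the paper.
EPSILON = 0.1   # per-record guarantee reported by a DP training algorithm
N = 50_000      # training-set size

def group_epsilon(epsilon: float, group_size: int) -> float:
    """Group-privacy property of pure epsilon-DP: protecting a set of
    `group_size` mutually dependent records as a unit costs group_size * epsilon."""
    return group_size * epsilon

def membership_advantage_bound(epsilon: float) -> float:
    """Generic upper bound on a membership adversary's advantage under
    epsilon-DP: advantage <= e^epsilon - 1, capped at 1 because no
    adversary can do better than perfect membership inference."""
    if epsilon >= math.log(2.0):        # beyond this point the bound exceeds 1
        return 1.0
    return math.exp(epsilon) - 1.0

# i.i.d. view: the per-record epsilon gives a small advantage bound (~0.105).
print(membership_advantage_bound(EPSILON))

# Dependent view: if the dependency spans all N records, the effective
# guarantee is N * epsilon and the bound is vacuous (1.0).
print(membership_advantage_bound(group_epsilon(EPSILON, N)))
```

This mirrors the abstract's argument only at a high level; the paper's empirical results concern off-the-shelf MIAs on real data with dependencies, not this worst-case bound.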
Related papers
- Range Membership Inference Attacks [17.28638946021444]
We introduce the class of range membership inference attacks (RaMIAs), which test whether the model was trained on any data in a specified range.
We show that RaMIAs can capture privacy loss more accurately and comprehensively than MIAs on various types of data.
arXiv Detail & Related papers (2024-08-09T15:39:06Z)
- Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
arXiv Detail & Related papers (2024-02-12T17:52:05Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and discuss the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- Membership Inference Attacks against Synthetic Data through Overfitting Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
arXiv Detail & Related papers (2023-02-24T11:27:39Z)
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z)
- Lessons Learned: Defending Against Property Inference Attacks [0.0]
This work investigates and evaluates multiple defense strategies against property inference attacks (PIAs).
PIAs aim to extract statistical properties of a model's underlying training data, e.g., to reveal the ratio of men to women in a medical training data set.
Experiments show that property unlearning does not generalize, i.e., it cannot protect against a whole class of PIAs.
arXiv Detail & Related papers (2022-05-18T09:38:37Z)
- Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
arXiv Detail & Related papers (2021-11-18T13:31:22Z)
- Do Not Trust Prediction Scores for Membership Inference Attacks [15.567057178736402]
Membership inference attacks (MIAs) aim to determine whether a specific sample was used to train a predictive model.
We argue that relying on prediction scores is a fallacy for many modern deep network architectures.
We are able to produce a potentially infinite number of samples falsely classified as part of the training data.
arXiv Detail & Related papers (2021-11-17T12:39:04Z)
- Knowledge-Enriched Distributional Model Inversion Attacks [49.43828150561947]
Model inversion (MI) attacks are aimed at reconstructing training data from model parameters.
We present a novel inversion-specific GAN that can better distill knowledge useful for performing attacks on private models from public data.
Our experiments show that the combination of these techniques can significantly boost the success rate of the state-of-the-art MI attacks by 150%.
arXiv Detail & Related papers (2020-10-08T16:20:48Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and the adversary can recover up to 100% of the attack's original performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)