Formalizing and Estimating Distribution Inference Risks
- URL: http://arxiv.org/abs/2109.06024v2
- Date: Tue, 14 Sep 2021 15:11:20 GMT
- Title: Formalizing and Estimating Distribution Inference Risks
- Authors: Anshuman Suri and David Evans
- Abstract summary: We propose a formal and general definition of property inference attacks.
Our results show that inexpensive attacks are as effective as expensive meta-classifier attacks.
We extend the state-of-the-art property inference attack to work on convolutional neural networks.
- Score: 11.650381752104298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Property inference attacks reveal statistical properties about a training set
but are difficult to distinguish from the intrinsic purpose of statistical
machine learning, namely to produce models that capture statistical properties
about a distribution. Motivated by Yeom et al.'s membership inference
framework, we propose a formal and general definition of property inference
attacks. The proposed notion describes attacks that can distinguish between
possible training distributions, extending beyond previous property inference
attacks that infer the ratio of a particular type of data in the training data
set such as the proportion of females. We show how our definition captures
previous property inference attacks as well as a new attack that can reveal the
average node degree or clustering coefficient of a training graph. Our
definition also enables a theorem that connects the maximum possible accuracy
of inference attacks distinguishing between distributions to the effective size
of dataset leaked by the model. To quantify and understand property inference
risks, we conduct a series of experiments across a range of different
distributions using both black-box and white-box attacks. Our results show that
inexpensive attacks are often as effective as expensive meta-classifier
attacks, and that there are surprising asymmetries in the effectiveness of
attacks. We also extend the state-of-the-art property inference attack to work
on convolutional neural networks, and propose techniques to help identify
parameters in a model that leak the most information, thus significantly
lowering resource requirements for meta-classifier attacks.
Related papers
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning
Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
arXiv Detail & Related papers (2023-12-07T16:56:24Z) - Membership Inference Attacks against Language Models via Neighbourhood
Comparison [45.086816556309266]
Membership Inference attacks (MIAs) aim to predict whether a data sample was present in the training data of a machine learning model or not.
Recent work has demonstrated that reference-based attacks which compare model scores to those obtained from a reference model trained on similar data can substantially improve the performance of MIAs.
We investigate their performance in more realistic scenarios and find that they are highly fragile in relation to the data distribution used to train reference models.
arXiv Detail & Related papers (2023-05-29T07:06:03Z) - Can Adversarial Examples Be Parsed to Reveal Victim Model Information? [62.814751479749695]
In this work, we ask whether it is possible to infer data-agnostic victim model (VM) information from data-specific adversarial instances.
We collect a dataset of adversarial attacks across 7 attack types generated from 135 victim models.
We show that a simple, supervised model parsing network (MPN) is able to infer VM attributes from unseen adversarial attacks.
arXiv Detail & Related papers (2023-03-13T21:21:49Z) - Purifier: Defending Data Inference Attacks via Transforming Confidence
Scores [27.330482508047428]
We propose a method, namely PURIFIER, to defend against membership inference attacks.
Experiments show that PURIFIER helps defend membership inference attacks with high effectiveness and efficiency.
PURIFIER is also effective in defending adversarial model inversion attacks and attribute inference attacks.
arXiv Detail & Related papers (2022-12-01T16:09:50Z) - Enhanced Membership Inference Attacks against Machine Learning Models [9.26208227402571]
Membership inference attacks are used to quantify the private information that a model leaks about the individual data points in its training set.
We derive new attack algorithms that can achieve a high AUC score while also highlighting the different factors that affect their performance.
Our algorithms capture a very precise approximation of privacy loss in models, and can be used as a tool to perform an accurate and informed estimation of privacy risk in machine learning models.
arXiv Detail & Related papers (2021-11-18T13:31:22Z) - Formalizing Distribution Inference Risks [11.650381752104298]
Property inference attacks are difficult to distinguish from the primary purposes of statistical machine learning.
We propose a formal and generic definition of property inference attacks.
arXiv Detail & Related papers (2021-06-07T15:10:06Z) - Property Inference Attacks on Convolutional Neural Networks: Influence
and Implications of Target Model's Complexity [1.2891210250935143]
Property Inference Attacks aim to infer from a given model properties about the training dataset seemingly unrelated to the model's primary goal.
This paper investigates the influence of the target model's complexity on the accuracy of this type of attack.
Our findings reveal that the risk of a privacy breach is present independently of the target model's complexity.
arXiv Detail & Related papers (2021-04-27T09:19:36Z) - Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective substitute training that focuses on designing the distribution of data used in the knowledge stealing process.
The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z) - Learning to Separate Clusters of Adversarial Representations for Robust
Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by a recently introduced non-robust feature.
In this paper, we consider the non-robust features as a common property of adversarial examples, and we deduce it is possible to find a cluster in representation space corresponding to the property.
This idea leads us to probability estimate distribution of adversarial representations in a separate cluster, and leverage the distribution for a likelihood based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z) - Trust but Verify: Assigning Prediction Credibility by Counterfactual
Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z) - Sampling Attacks: Amplification of Membership Inference Attacks by
Repeated Queries [74.59376038272661]
We introduce sampling attack, a novel membership inference technique that unlike other standard membership adversaries is able to work under severe restriction of no access to scores of the victim model.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.