Better Membership Inference Privacy Measurement through Discrepancy
- URL: http://arxiv.org/abs/2405.15140v1
- Date: Fri, 24 May 2024 01:33:22 GMT
- Title: Better Membership Inference Privacy Measurement through Discrepancy
- Authors: Ruihan Wu, Pengrun Huang, Kamalika Chaudhuri,
- Abstract summary: We propose a new empirical privacy metric that is an upper bound on the advantage of a family of membership inference attacks.
We show that this metric does not involve training multiple models, can be applied to large Imagenet classification models in-the-wild, and has higher advantage than existing metrics on models trained with more recent and sophisticated training recipes.
- Score: 25.48677069802298
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Membership Inference Attacks have emerged as a dominant method for empirically measuring privacy leakage from machine learning models. Here, privacy is measured by the {\em{advantage}} or gap between a score or a function computed on the training and the test data. A major barrier to the practical deployment of these attacks is that they do not scale to large well-generalized models -- either the advantage is relatively low, or the attack involves training multiple models which is highly compute-intensive. In this work, inspired by discrepancy theory, we propose a new empirical privacy metric that is an upper bound on the advantage of a family of membership inference attacks. We show that this metric does not involve training multiple models, can be applied to large Imagenet classification models in-the-wild, and has higher advantage than existing metrics on models trained with more recent and sophisticated training recipes. Motivated by our empirical results, we also propose new membership inference attacks tailored to these training losses.
Related papers
- Order of Magnitude Speedups for LLM Membership Inference [5.124111136127848]
Large Language Models (LLMs) have the promise to revolutionize computing broadly, but their complexity and extensive training data also expose privacy vulnerabilities.
One of the simplest privacy risks associated with LLMs is their susceptibility to membership inference attacks (MIAs)
We propose a low-cost MIA that leverages an ensemble of small quantile regression models to determine if a document belongs to the model's training set or not.
arXiv Detail & Related papers (2024-09-22T16:18:14Z) - Causal Estimation of Memorisation Profiles [58.20086589761273]
Understanding memorisation in language models has practical and societal implications.
Memorisation is the causal effect of training with an instance on the model's ability to predict that instance.
This paper proposes a new, principled, and efficient method to estimate memorisation based on the difference-in-differences design from econometrics.
arXiv Detail & Related papers (2024-06-06T17:59:09Z) - Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - Confidence Is All You Need for MI Attacks [7.743155804758186]
We propose a new method to gauge a data point's membership in a model's training set.
During training, the model is essentially being 'fit' to the training data and might face particular difficulties in generalization to unseen data.
arXiv Detail & Related papers (2023-11-26T18:09:24Z) - Membership Inference Attacks by Exploiting Loss Trajectory [19.900473800648243]
We propose a new attack method, called system, which can exploit the membership information from the whole training process of the target model.
Our attack achieves at least 6$times$ higher true-positive rate at a low false-positive rate of 0.1% than existing methods.
arXiv Detail & Related papers (2022-08-31T16:02:26Z) - RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z) - Self-Ensemble Adversarial Training for Improved Robustness [14.244311026737666]
Adversarial training is the strongest strategy against various adversarial attacks among all sorts of defense methods.
Recent works mainly focus on developing new loss functions or regularizers, attempting to find the unique optimal point in the weight space.
We devise a simple but powerful emphSelf-Ensemble Adversarial Training (SEAT) method for yielding a robust classifier by averaging weights of history models.
arXiv Detail & Related papers (2022-03-18T01:12:18Z) - Parameters or Privacy: A Provable Tradeoff Between Overparameterization
and Membership Inference [29.743945643424553]
Over parameterized models generalize well (small error on the test data) even when trained to memorize the training data (zero error on the training data)
This has led to an arms race towards increasingly over parameterized models (c.f., deep learning)
arXiv Detail & Related papers (2022-02-02T19:00:21Z) - MixKD: Towards Efficient Distillation of Large-scale Language Models [129.73786264834894]
We propose MixKD, a data-agnostic distillation framework, to endow the resulting model with stronger generalization ability.
We prove from a theoretical perspective that under reasonable conditions MixKD gives rise to a smaller gap between the error and the empirical error.
Experiments under a limited-data setting and ablation studies further demonstrate the advantages of the proposed approach.
arXiv Detail & Related papers (2020-11-01T18:47:51Z) - Sampling Attacks: Amplification of Membership Inference Attacks by
Repeated Queries [74.59376038272661]
We introduce sampling attack, a novel membership inference technique that unlike other standard membership adversaries is able to work under severe restriction of no access to scores of the victim model.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z) - Learning Diverse Representations for Fast Adaptation to Distribution
Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.