Membership Leakage in Label-Only Exposures
- URL: http://arxiv.org/abs/2007.15528v3
- Date: Fri, 17 Sep 2021 17:40:37 GMT
- Title: Membership Leakage in Label-Only Exposures
- Authors: Zheng Li and Yang Zhang
- Abstract summary: We propose decision-based membership inference attacks against machine learning models.
In particular, we develop two types of decision-based attacks, namely the transfer attack and the boundary attack.
We also present new insights on the success of membership inference based on quantitative and qualitative analysis.
- Score: 10.875144776014533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) has been widely adopted in various privacy-critical
applications, e.g., face recognition and medical image analysis. However,
recent research has shown that ML models are vulnerable to attacks against
their training data. Membership inference is one major attack in this domain:
Given a data sample and model, an adversary aims to determine whether the
sample is part of the model's training set. Existing membership inference
attacks leverage the confidence scores returned by the model as their inputs
(score-based attacks). However, these attacks can be easily mitigated if the
model only exposes the predicted label, i.e., the final model decision.
In this paper, we propose decision-based membership inference attacks and
demonstrate that label-only exposures are also vulnerable to membership
leakage. In particular, we develop two types of decision-based attacks, namely
the transfer attack and the boundary attack. Empirical evaluation shows that our
decision-based attacks can achieve remarkable performance, and even outperform
the previous score-based attacks in some cases. We further present new insights
on the success of membership inference based on quantitative and qualitative
analysis, i.e., member samples of a model are more distant from the model's
decision boundary than non-member samples. Finally, we evaluate multiple
defense mechanisms against our decision-based attacks and show that our two
types of attacks can bypass most of these defenses.
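The boundary attack and the boundary-distance insight above can be sketched in a few lines. This is an illustrative, numpy-only sketch under assumed names (`predict`, a toy linear classifier), not the authors' implementation: it estimates a sample's distance to the decision boundary using only hard labels, then flags samples whose distance exceeds a threshold as members.

```python
import numpy as np

rng = np.random.default_rng(0)

def boundary_distance(predict, x, n_dirs=20, max_t=10.0, iters=30):
    """Estimate the distance from x to the decision boundary with label-only
    access: along random unit directions, binary-search the smallest
    perturbation that flips the predicted label, and keep the minimum."""
    base = predict(x)
    dists = []
    for _ in range(n_dirs):
        d = rng.normal(size=x.shape)
        d /= np.linalg.norm(d)
        if predict(x + max_t * d) == base:
            continue  # label never flips within range along this direction
        lo, hi = 0.0, max_t
        for _ in range(iters):
            mid = 0.5 * (lo + hi)
            if predict(x + mid * d) == base:
                lo = mid
            else:
                hi = mid
        dists.append(hi)
    return min(dists) if dists else max_t

def infer_member(predict, x, threshold):
    """Decision rule from the paper's insight: members sit farther from the
    boundary, so predict 'member' when the estimated distance exceeds a
    threshold calibrated in advance (e.g., on a shadow model)."""
    return boundary_distance(predict, x) > threshold
```

In practice the threshold would be calibrated on a shadow model trained on auxiliary data, matching the paper's setting where the adversary never sees confidence scores.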
Related papers
- Model X-ray: Detecting Backdoored Models via Decision Boundary [62.675297418960355]
Backdoor attacks pose a significant security vulnerability for deep neural networks (DNNs)
We propose Model X-ray, a novel backdoor detection approach based on the analysis of illustrated two-dimensional (2D) decision boundaries.
Our approach includes two strategies focused on the decision areas dominated by clean samples and the concentration of label distribution.
arXiv Detail & Related papers (2024-02-27T12:42:07Z)
- Membership Inference Attacks by Exploiting Loss Trajectory [19.900473800648243]
We propose a new attack method that exploits the membership information from the whole training process of the target model.
Our attack achieves a true-positive rate at least 6× higher than existing methods at a low false-positive rate of 0.1%.
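The loss-trajectory signal described above can be illustrated with a crude, self-contained sketch. The assumptions here are mine, not the paper's: a tiny logistic-regression model stands in for a shadow model, and the mean loss along the trajectory replaces the paper's learned attack classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Full-batch gradient descent that also records each training sample's
    loss after every epoch -- its loss trajectory."""
    w = np.zeros(X.shape[1])
    trajectory = []
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)
        p = sigmoid(X @ w)
        trajectory.append(-(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12)))
    return w, np.array(trajectory)  # shape: (epochs, n_samples)

def loss(w, X, y):
    """Cross-entropy loss of samples X with labels y under weights w."""
    p = sigmoid(X @ w)
    return -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
```

Members accumulate low losses as training progresses while held-out samples do not, so even this simple trajectory statistic separates the two groups on an overfit model.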
arXiv Detail & Related papers (2022-08-31T16:02:26Z)
- Membership-Doctor: Comprehensive Assessment of Membership Inference Against Machine Learning Models [11.842337448801066]
We present a large-scale measurement of different membership inference attacks and defenses.
We find that some threat-model assumptions, such as requiring the same architecture and the same data distribution for shadow and target models, are unnecessary.
We are also the first to execute attacks on real-world data collected from the Internet, instead of laboratory datasets.
arXiv Detail & Related papers (2022-08-22T17:00:53Z)
- A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks [72.7373468905418]
We develop an open-source toolkit OpenBackdoor to foster the implementations and evaluations of textual backdoor learning.
We also propose CUBE, a simple yet strong clustering-based defense baseline.
arXiv Detail & Related papers (2022-06-17T02:29:23Z)
- Identifying a Training-Set Attack's Target Using Renormalized Influence Estimation [11.663072799764542]
This work proposes the task of target identification, which determines whether a specific test instance is the target of a training-set attack.
Rather than focusing on a single attack method or data modality, we build on influence estimation, which quantifies each training instance's contribution to a model's prediction.
arXiv Detail & Related papers (2022-01-25T02:36:34Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot Classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- Membership Inference Attacks on Machine Learning: A Survey [6.468846906231666]
Membership inference attack aims to identify whether a data sample was used to train a machine learning model or not.
It can raise severe privacy risks as the membership can reveal an individual's sensitive information.
We present the first comprehensive survey of membership inference attacks.
arXiv Detail & Related papers (2021-03-14T06:10:47Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software framework, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of no access to the victim model's scores.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- Label-Only Membership Inference Attacks [67.46072950620247]
We introduce label-only membership inference attacks.
Our attacks evaluate the robustness of a model's predicted labels under perturbations.
We find that differentially private training and (strong) L2 regularization are the only known defense strategies.
arXiv Detail & Related papers (2020-07-28T15:44:31Z)
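Several of the entries above mention output perturbation at prediction time as a defense for label-only APIs. A minimal randomized-response-style sketch of that idea (an illustrative assumption on my part, not the exact mechanism evaluated in these papers):

```python
import random

def noisy_label(model_label, num_classes, p, rng):
    """Output-perturbation defense sketch: return the model's predicted
    label, but with probability p substitute a uniformly random class.
    This blurs the boundary-distance signal a label-only attacker relies on,
    at the cost of some accuracy for honest users."""
    if rng.random() < p:
        return rng.randrange(num_classes)
    return model_label
```

The flip probability p trades utility for privacy: p = 0 recovers the undefended API, while larger p makes repeated-query attacks noisier but degrades the labels served to everyone.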
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.