Privacy Analysis of Deep Learning in the Wild: Membership Inference
Attacks against Transfer Learning
- URL: http://arxiv.org/abs/2009.04872v1
- Date: Thu, 10 Sep 2020 14:14:22 GMT
- Title: Privacy Analysis of Deep Learning in the Wild: Membership Inference
Attacks against Transfer Learning
- Authors: Yang Zou, Zhikun Zhang, Michael Backes, Yang Zhang
- Abstract summary: We present the first systematic evaluation of membership inference attacks against transfer learning models.
Experiments on four real-world image datasets show that membership inference achieves strong attack performance, reaching 95% attack AUC in one transfer setting.
Our results shed light on the severity of membership risks stemming from machine learning models in practice.
- Score: 27.494206948563885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While being deployed in many critical applications as core components,
machine learning (ML) models are vulnerable to various security and privacy
attacks. One major privacy attack in this domain is membership inference, where
an adversary aims to determine whether a target data sample is part of the
training set of a target ML model. So far, most membership inference attacks
have been evaluated against ML models trained from scratch.
However, real-world ML models are typically trained following the transfer
learning paradigm, where a model owner takes a pretrained model learned from a
different dataset, namely the teacher model, and trains her own student model by
fine-tuning the teacher model with her own data.
In this paper, we perform the first systematic evaluation of membership
inference attacks against transfer learning models. We adopt the strategy of
shadow model training to derive the data for training our membership inference
classifier. Extensive experiments on four real-world image datasets show that
membership inference can achieve effective performance. For instance, on the
CIFAR100 classifier transferred from ResNet20 (pretrained with Caltech101), our
membership inference achieves $95\%$ attack AUC. Moreover, we show that
membership inference is still effective when the architecture of the target model
is unknown. Our results shed light on the severity of membership risks stemming
from machine learning models in practice.
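To make the shadow-model strategy above concrete, here is a minimal sketch in Python. It uses synthetic tabular data and generic scikit-learn classifiers as stand-ins for the fine-tuned student models evaluated in the paper; the sorted-posterior attack features, model choices, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of shadow-model membership inference, assuming generic
# scikit-learn classifiers stand in for the fine-tuned student models.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.RandomState(0)

# Synthetic stand-in for the image data distribution used in the paper.
X, y = make_classification(n_samples=8000, n_features=32, n_informative=16,
                           n_classes=4, random_state=0)

def split_member_nonmember(X_pool, y_pool, n_member):
    """Split a pool into a 'member' training set and a held-out non-member set."""
    idx = rng.permutation(len(X_pool))
    m, nm = idx[:n_member], idx[n_member:2 * n_member]
    return (X_pool[m], y_pool[m]), (X_pool[nm], y_pool[nm])

def attack_features(model, X_query):
    """Per-sample posterior vector, sorted descending (a common attack feature)."""
    return np.sort(model.predict_proba(X_query), axis=1)[:, ::-1]

# 1) Shadow phase: the adversary trains a shadow model on data it controls,
#    so it knows exactly which samples are members.
shadow_pool, target_pool = (X[:4000], y[:4000]), (X[4000:], y[4000:])
(Xs_in, ys_in), (Xs_out, _) = split_member_nonmember(*shadow_pool, 1000)
shadow_model = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xs_in, ys_in)

attack_X = np.vstack([attack_features(shadow_model, Xs_in),
                      attack_features(shadow_model, Xs_out)])
attack_y = np.concatenate([np.ones(len(Xs_in)), np.zeros(len(Xs_out))])

# 2) Attack classifier: learns to separate member from non-member posteriors.
attack_clf = LogisticRegression(max_iter=1000).fit(attack_X, attack_y)

# 3) Target phase: query the (black-box) target model and score membership.
(Xt_in, yt_in), (Xt_out, _) = split_member_nonmember(*target_pool, 1000)
target_model = RandomForestClassifier(n_estimators=100, random_state=1).fit(Xt_in, yt_in)

test_X = np.vstack([attack_features(target_model, Xt_in),
                    attack_features(target_model, Xt_out)])
test_y = np.concatenate([np.ones(len(Xt_in)), np.zeros(len(Xt_out))])
print("attack AUC:", roc_auc_score(test_y, attack_clf.predict_proba(test_X)[:, 1]))
```

In the paper's setting, the shadow and target models would instead be student networks fine-tuned from the same pretrained teacher (e.g., ResNet20 pretrained on Caltech101, transferred to CIFAR100), and the attack features would be the student model's posteriors on candidate image samples.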
Related papers
- Order of Magnitude Speedups for LLM Membership Inference [5.124111136127848]
Large Language Models (LLMs) promise to revolutionize computing broadly, but their complexity and extensive training data also expose privacy vulnerabilities.
One of the simplest privacy risks associated with LLMs is their susceptibility to membership inference attacks (MIAs).
We propose a low-cost MIA that leverages an ensemble of small quantile regression models to determine if a document belongs to the model's training set or not.
arXiv Detail & Related papers (2024-09-22T16:18:14Z)
- Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
arXiv Detail & Related papers (2024-02-12T17:52:05Z)
- SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models [74.58014281829946]
We analyze the effectiveness of several representative attacks/defenses, including model stealing attacks, membership inference attacks, and backdoor detection on public models.
Our evaluation empirically shows that the performance of these attacks/defenses can vary significantly on public models compared to self-trained models.
arXiv Detail & Related papers (2023-10-19T11:49:22Z)
- CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning [77.27443885999404]
Federated Learning (FL) is a setting for training machine learning models in distributed environments.
We propose a novel method, CANIFE, that uses samples carefully crafted by a strong adversary to evaluate the empirical privacy of a training round.
arXiv Detail & Related papers (2022-10-06T13:30:16Z)
- Membership Inference Attacks by Exploiting Loss Trajectory [19.900473800648243]
We propose a new attack method that exploits membership information from the whole training process of the target model.
Our attack achieves at least $6\times$ higher true-positive rate at a low false-positive rate of 0.1% than existing methods.
arXiv Detail & Related papers (2022-08-31T16:02:26Z)
- l-Leaks: Membership Inference Attacks with Logits [5.663757165885866]
We present attacks based on black-box access to the target model and name our attack l-Leaks.
We build the shadow model by learning the logits of the target model, making the shadow model more similar to the target model; the shadow model then has sufficient confidence in the member samples of the target model.
arXiv Detail & Related papers (2022-05-13T06:59:09Z)
- Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture [44.2351146468898]
Membership inference attacks are a key measure to evaluate privacy leakage in machine learning (ML) models.
We propose SELENA, a new framework to train privacy-preserving models that induce similar behavior on member and non-member inputs.
We show that SELENA presents a superior trade-off between membership privacy and utility compared to the state of the art.
arXiv Detail & Related papers (2021-10-15T19:22:52Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software tool, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's scores (a minimal illustrative sketch of this label-only setting appears after this list).
We show that a victim model that only publishes labels is still susceptible to sampling attacks, and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model, as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable the adversarial examples produced by a source model are to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z)
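The "Sampling Attacks" entry above describes recovering a membership signal when the victim exposes only hard labels. As a minimal illustration of that label-only setting, the sketch below scores a sample by how often its label survives random perturbation; the Gaussian noise and agreement-ratio score are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative label-only membership score via repeated perturbed queries.
import numpy as np

def label_only_membership_score(predict_label, x, y, n_queries=50, sigma=0.1, seed=0):
    """Return the fraction of noisy copies of x that the victim still labels as y.

    predict_label: callable mapping a single feature vector to a hard label
    (the victim exposes no confidence scores). Members of an overfitted model
    tend to sit further from its decision boundary, so their labels survive
    perturbation more often; higher scores suggest membership.
    """
    rng = np.random.default_rng(seed)
    noisy = x[None, :] + sigma * rng.standard_normal((n_queries, x.shape[0]))
    preds = np.array([predict_label(xi) for xi in noisy])
    return float(np.mean(preds == y))

# Usage (hypothetical victim exposing only labels):
#   score = label_only_membership_score(victim_predict, sample, label)
#   predict "member" when the score exceeds a threshold calibrated on shadow data.
```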