Privacy Analysis of Deep Learning in the Wild: Membership Inference
Attacks against Transfer Learning
- URL: http://arxiv.org/abs/2009.04872v1
- Date: Thu, 10 Sep 2020 14:14:22 GMT
- Title: Privacy Analysis of Deep Learning in the Wild: Membership Inference
Attacks against Transfer Learning
- Authors: Yang Zou, Zhikun Zhang, Michael Backes, Yang Zhang
- Abstract summary: We present the first systematic evaluation of membership inference attacks against transfer learning models.
Experiments on four real-world image datasets show that membership inference achieves strong attack performance, reaching 95% attack AUC in one transfer setting.
Our results shed light on the severity of membership risks stemming from machine learning models in practice.
- Score: 27.494206948563885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While being deployed in many critical applications as core components,
machine learning (ML) models are vulnerable to various security and privacy
attacks. One major privacy attack in this domain is membership inference, where
an adversary aims to determine whether a target data sample is part of the
training set of a target ML model. So far, most membership inference attacks
have been evaluated against ML models trained from scratch.
However, real-world ML models are typically trained following the transfer
learning paradigm, where a model owner takes a pretrained model learned from a
different dataset, namely the teacher model, and trains her own student model by
fine-tuning the teacher model with her own data.
In this paper, we perform the first systematic evaluation of membership
inference attacks against transfer learning models. We adopt the strategy of
shadow model training to derive the data for training our membership inference
classifier. Extensive experiments on four real-world image datasets show that
membership inference can achieve effective performance. For instance, on the
CIFAR100 classifier transferred from ResNet20 (pretrained with Caltech101), our
membership inference achieves $95\%$ attack AUC. Moreover, we show that
membership inference is still effective when the architecture of the target model
is unknown. Our results shed light on the severity of membership risks stemming
from machine learning models in practice.
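To make the shadow-model strategy above concrete, here is a minimal sketch in Python. It uses synthetic tabular data and generic scikit-learn classifiers as stand-ins for the fine-tuned student models evaluated in the paper; the sorted-posterior attack features, model choices, and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of shadow-model membership inference, assuming generic
# scikit-learn classifiers stand in for the fine-tuned student models.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.RandomState(0)

# Synthetic stand-in for the image data distribution used in the paper.
X, y = make_classification(n_samples=8000, n_features=32, n_informative=16,
                           n_classes=4, random_state=0)

def split_member_nonmember(X_pool, y_pool, n_member):
    """Split a pool into a 'member' training set and a held-out non-member set."""
    idx = rng.permutation(len(X_pool))
    m, nm = idx[:n_member], idx[n_member:2 * n_member]
    return (X_pool[m], y_pool[m]), (X_pool[nm], y_pool[nm])

def attack_features(model, X_query):
    """Per-sample posterior vector, sorted descending (a common attack feature)."""
    return np.sort(model.predict_proba(X_query), axis=1)[:, ::-1]

# 1) Shadow phase: the adversary trains a shadow model on data it controls,
#    so it knows exactly which samples are members.
shadow_pool, target_pool = (X[:4000], y[:4000]), (X[4000:], y[4000:])
(Xs_in, ys_in), (Xs_out, _) = split_member_nonmember(*shadow_pool, 1000)
shadow_model = RandomForestClassifier(n_estimators=100, random_state=0).fit(Xs_in, ys_in)

attack_X = np.vstack([attack_features(shadow_model, Xs_in),
                      attack_features(shadow_model, Xs_out)])
attack_y = np.concatenate([np.ones(len(Xs_in)), np.zeros(len(Xs_out))])

# 2) Attack classifier: learns to separate member from non-member posteriors.
attack_clf = LogisticRegression(max_iter=1000).fit(attack_X, attack_y)

# 3) Target phase: query the (black-box) target model and score membership.
(Xt_in, yt_in), (Xt_out, _) = split_member_nonmember(*target_pool, 1000)
target_model = RandomForestClassifier(n_estimators=100, random_state=1).fit(Xt_in, yt_in)

test_X = np.vstack([attack_features(target_model, Xt_in),
                    attack_features(target_model, Xt_out)])
test_y = np.concatenate([np.ones(len(Xt_in)), np.zeros(len(Xt_out))])
print("attack AUC:", roc_auc_score(test_y, attack_clf.predict_proba(test_X)[:, 1]))
```

In the paper's setting, the shadow and target models would instead be student networks fine-tuned from the same pretrained teacher (e.g., ResNet20 pretrained on Caltech101, transferred to CIFAR100), and the attack features would be the student model's posteriors on candidate image samples.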
Related papers
- Order of Magnitude Speedups for LLM Membership Inference [5.124111136127848]
Large Language Models (LLMs) promise to revolutionize computing broadly, but their complexity and extensive training data also expose privacy vulnerabilities.
One of the simplest privacy risks associated with LLMs is their susceptibility to membership inference attacks (MIAs).
We propose a low-cost MIA that leverages an ensemble of small quantile regression models to determine if a document belongs to the model's training set or not.
arXiv Detail & Related papers (2024-09-22T16:18:14Z)
- Do Membership Inference Attacks Work on Large Language Models? [141.2019867466968]
Membership inference attacks (MIAs) attempt to predict whether a particular datapoint is a member of a target model's training data.
We perform a large-scale evaluation of MIAs over a suite of language models trained on the Pile, ranging from 160M to 12B parameters.
We find that MIAs barely outperform random guessing for most settings across varying LLM sizes and domains.
arXiv Detail & Related papers (2024-02-12T17:52:05Z)
- SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models [74.58014281829946]
We analyze the effectiveness of several representative attacks/defenses, including model stealing attacks, membership inference attacks, and backdoor detection on public models.
Our evaluation empirically shows that the performance of these attacks/defenses can vary significantly on public models compared to self-trained models.
arXiv Detail & Related papers (2023-10-19T11:49:22Z)
- CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning [77.27443885999404]
Federated Learning (FL) is a setting for training machine learning models in distributed environments.
We propose a novel method, CANIFE, that uses samples carefully crafted by a strong adversary to evaluate the empirical privacy of a training round.
arXiv Detail & Related papers (2022-10-06T13:30:16Z)
- Membership Inference Attacks by Exploiting Loss Trajectory [19.900473800648243]
We propose a new attack method that exploits membership information from the whole training process of the target model.
Our attack achieves at least $6\times$ higher true-positive rate at a low false-positive rate of 0.1% than existing methods.
arXiv Detail & Related papers (2022-08-31T16:02:26Z)
- l-Leaks: Membership Inference Attacks with Logits [5.663757165885866]
We present attacks based on black-box access to the target model and name our attack l-Leaks.
We build the shadow model by learning the logits of the target model, making the shadow model more similar to the target model; the shadow model then has sufficient confidence in the member samples of the target model.
arXiv Detail & Related papers (2022-05-13T06:59:09Z)
- Mitigating Membership Inference Attacks by Self-Distillation Through a Novel Ensemble Architecture [44.2351146468898]
Membership inference attacks are a key measure to evaluate privacy leakage in machine learning (ML) models.
We propose SELENA, a new framework to train privacy-preserving models that induce similar behavior on member and non-member inputs.
We show that SELENA presents a superior trade-off between membership privacy and utility compared to the state of the art.
arXiv Detail & Related papers (2021-10-15T19:22:52Z)
- ML-Doctor: Holistic Risk Assessment of Inference Attacks Against Machine Learning Models [64.03398193325572]
Inference attacks against Machine Learning (ML) models allow adversaries to learn about training data, model parameters, etc.
We concentrate on four attacks - namely, membership inference, model inversion, attribute inference, and model stealing.
Our analysis relies on a modular, reusable software tool, ML-Doctor, which enables ML model owners to assess the risks of deploying their models.
arXiv Detail & Related papers (2021-02-04T11:35:13Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's scores (a minimal illustrative sketch of this label-only setting appears after this list).
We show that a victim model that only publishes labels is still susceptible to sampling attacks, and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model, as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable the adversarial examples produced by a source model are to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z)
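The "Sampling Attacks" entry above describes recovering a membership signal when the victim exposes only hard labels. As a minimal illustration of that label-only setting, the sketch below scores a sample by how often its label survives random perturbation; the Gaussian noise and agreement-ratio score are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative label-only membership score via repeated perturbed queries.
import numpy as np

def label_only_membership_score(predict_label, x, y, n_queries=50, sigma=0.1, seed=0):
    """Return the fraction of noisy copies of x that the victim still labels as y.

    predict_label: callable mapping a single feature vector to a hard label
    (the victim exposes no confidence scores). Members of an overfitted model
    tend to sit further from its decision boundary, so their labels survive
    perturbation more often; higher scores suggest membership.
    """
    rng = np.random.default_rng(seed)
    noisy = x[None, :] + sigma * rng.standard_normal((n_queries, x.shape[0]))
    preds = np.array([predict_label(xi) for xi in noisy])
    return float(np.mean(preds == y))

# Usage (hypothetical victim exposing only labels):
#   score = label_only_membership_score(victim_predict, sample, label)
#   predict "member" when the score exceeds a threshold calibrated on shadow data.
```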