TransMIA: Membership Inference Attacks Using Transfer Shadow Training
- URL: http://arxiv.org/abs/2011.14661v3
- Date: Fri, 23 Apr 2021 14:50:44 GMT
- Title: TransMIA: Membership Inference Attacks Using Transfer Shadow Training
- Authors: Seira Hidano, Takao Murakami, Yusuke Kawamoto
- Abstract summary: We propose TransMIA (Transfer learning-based Membership Inference Attacks), which use transfer learning to perform membership inference attacks on the source model.
In particular, we propose a transfer shadow training technique, where an adversary employs the parameters of the transferred model to construct shadow models.
We evaluate our attacks using two real datasets, and show that our attacks outperform the state-of-the-art that does not use our transfer shadow training technique.
- Score: 5.22523722171238
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Transfer learning has been widely studied and has gained increasing
popularity as a way to improve the accuracy of machine learning models by
transferring knowledge acquired from a different training task. However, no
prior work has pointed out that
transfer learning can strengthen privacy attacks on machine learning models. In
this paper, we propose TransMIA (Transfer learning-based Membership Inference
Attacks), which use transfer learning to perform membership inference attacks
on the source model when the adversary is able to access the parameters of the
transferred model. In particular, we propose a transfer shadow training
technique, where an adversary employs the parameters of the transferred model
to construct shadow models, to significantly improve the performance of
membership inference when a limited amount of shadow training data is available
to the adversary. We evaluate our attacks using two real datasets, and show
that our attacks outperform the state-of-the-art that does not use our transfer
shadow training technique. We also compare four combinations of the
learning-based/entropy-based approach and the fine-tuning/freezing approach,
all of which employ our transfer shadow training technique. Then we examine the
performance of these four approaches based on the distributions of confidence
values, and discuss possible countermeasures against our attacks.
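The abstract describes the attack pipeline only at a high level, so the following is a minimal, illustrative PyTorch sketch of the transfer shadow training idea: shadow models are initialised from the transferred model's parameters (covering both the freezing and fine-tuning variants), trained on the adversary's limited shadow data, and their confidence outputs then feed either a learning-based attack model or an entropy-based membership score. All names here (SmallCNN, build_shadow_model, transferred_state, and so on) are assumptions made for illustration, not the authors' implementation.
```python
# A minimal sketch of transfer shadow training, assuming PyTorch and assuming
# the shadow architecture matches the transferred model so its parameters can
# be copied in directly. Names are illustrative, not the authors' code.
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    """Stand-in architecture (an assumption, not the paper's model)."""
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
        )
        self.classifier = nn.Linear(32 * 4 * 4, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x))

def build_shadow_model(transferred_state, num_classes, freeze=True):
    """Initialise a shadow model from the transferred model's parameters."""
    model = SmallCNN(num_classes)
    # Transfer shadow training: reuse the transferred parameters instead of
    # training shadow models from scratch on scarce shadow data.
    model.load_state_dict(transferred_state, strict=False)
    if freeze:
        for p in model.features.parameters():
            p.requires_grad = False      # "freezing" variant
    return model                         # otherwise the "fine-tuning" variant

def train_shadow(model, shadow_loader, epochs=5, lr=1e-3):
    """Train the shadow model on the adversary's (limited) shadow data."""
    opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for x, y in shadow_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model

@torch.no_grad()
def confidences(model, loader):
    """Softmax confidence vectors, the input to both attack variants."""
    model.eval()
    return torch.cat([torch.softmax(model(x), dim=1) for x, _ in loader])

def entropy_score(probs, eps=1e-12):
    # Entropy-based variant: lower prediction entropy is treated as evidence
    # of membership (a standard heuristic; the paper's exact scoring rule may
    # differ).
    return -(probs * (probs + eps).log()).sum(dim=1)
```
In the learning-based variant, confidence vectors collected from the shadow models' member and non-member samples (labelled 1 and 0) would instead be used to train a binary attack classifier; the entropy-based variant simply thresholds entropy_score on the source model's outputs.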
Related papers
- Prompt-Driven Contrastive Learning for Transferable Adversarial Attacks [42.18755809782401]
We propose a novel transfer attack method called PDCL-Attack.
We formulate an effective prompt-driven feature guidance by harnessing the semantic representation power of text.
arXiv Detail & Related papers (2024-07-30T08:52:16Z)
- SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z)
- Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on a widely used dataset demonstrate the effectiveness of our attack method, with a 12.85% higher transfer attack success rate than state-of-the-art methods. (A minimal sketch of the generic surrogate-to-victim setup appears after this related-papers list.)
arXiv Detail & Related papers (2021-12-10T07:24:21Z)
- Adversarial Training Helps Transfer Learning via Better Representations [17.497590668804055]
Transfer learning aims to leverage models pre-trained on source data to adapt efficiently to a target setting.
Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.
We show that adversarial training in the source data generates provably better representations, so fine-tuning on top of this representation leads to a more accurate predictor of the target data.
arXiv Detail & Related papers (2021-06-18T15:41:07Z)
- Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose substitute training from a novel perspective that focuses on designing the distribution of the data used in the knowledge-stealing process.
The combination of these two modules further boosts the consistency between the substitute model and the target model, which greatly improves the effectiveness of the adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
- Empirical Evaluation of Supervision Signals for Style Transfer Models [44.39622949370144]
In this work we empirically compare the dominant optimization paradigms which provide supervision signals during training.
We find that backtranslation has model-specific limitations, which inhibit the training of style transfer models.
We also experiment with Minimum Risk Training, a popular technique in the machine translation community, which, to our knowledge, has not been empirically evaluated in the task of style transfer.
arXiv Detail & Related papers (2021-01-15T15:33:30Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, works under the severe restriction of having no access to the victim model's confidence scores.
We show that a victim model that publishes only labels is still susceptible to sampling attacks, and that the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric that evaluates how transferable the adversarial examples produced by a source model are to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z)
- Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
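Several of the related papers above (e.g., Learning to Learn Transferable Attack and the black-box attack for transfer learning) rely on the same surrogate-to-victim mechanism: adversarial perturbations are crafted with gradients from a surrogate model and then applied unchanged to the victim. The sketch below is a minimal, hedged illustration of that mechanism using plain one-step FGSM; it is not the method of any of the listed papers, which use considerably stronger crafting strategies.
```python
# Minimal surrogate-to-victim transfer attack sketch (PyTorch, FGSM).
# Assumes white-box gradient access to a surrogate and query access to the
# victim; all names are illustrative.
import torch
import torch.nn.functional as F

def fgsm_transfer_attack(surrogate, victim, x, y, eps=8 / 255):
    """Craft a one-step FGSM perturbation on the surrogate, test it on the victim."""
    surrogate.eval()
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(surrogate(x_adv), y)
    loss.backward()
    # Move each pixel in the direction that increases the surrogate's loss,
    # then clip back to a valid image range.
    x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()
    with torch.no_grad():
        fooled = victim(x_adv).argmax(dim=1) != y
    return x_adv, fooled.float().mean()  # transfer success rate on this batch
```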
This list is automatically generated from the titles and abstracts of the papers on this site.