Model Inversion Attack against Transfer Learning: Inverting a Model
without Accessing It
- URL: http://arxiv.org/abs/2203.06570v1
- Date: Sun, 13 Mar 2022 05:07:02 GMT
- Title: Model Inversion Attack against Transfer Learning: Inverting a Model
without Accessing It
- Authors: Dayong Ye and Huiqiang Chen and Shuai Zhou and Tianqing Zhu and Wanlei
Zhou and Shouling Ji
- Abstract summary: Transfer learning is an important approach that produces pre-trained teacher models.
Recent research on transfer learning has found that it is vulnerable to various attacks.
It is still not clear whether transfer learning is vulnerable to model inversion attacks.
- Score: 41.39995986856193
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transfer learning is an important approach that produces pre-trained teacher
models which can be used to quickly build specialized student models. However,
recent research on transfer learning has found that it is vulnerable to various
attacks, e.g., misclassification and backdoor attacks. Yet it is still not
clear whether transfer learning is vulnerable to model inversion attacks.
Launching a model inversion attack against a transfer learning scheme is
challenging. Not only does the student model hide its structural parameters,
but it is also inaccessible to the adversary. Hence, when targeting a student
model, both the white-box and black-box versions of existing model inversion
attacks fail. White-box attacks fail as they need the target model's
parameters. Black-box attacks fail as they depend on making repeated queries of
the target model. However, this does not mean that transfer learning models are
impervious to model inversion attacks. Hence, with this paper, we initiate
research into model inversion attacks against transfer learning schemes with
two novel attack methods. Both are black-box attacks suited to different
situations, and neither relies on queries to the target student model. In the
first method, the adversary has the data samples that share the same
distribution as the training set of the teacher model. In the second method,
the adversary does not have any such samples. Experiments show that highly
recognizable data records can be recovered with both of these methods. This
means that even if a model is an inaccessible black box, it can still be
inverted.
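The abstract does not spell out the two attack procedures, but the underlying notion of model inversion can be illustrated with a plain gradient-based inversion run against a publicly available teacher model, never querying the student. The sketch below is only an illustration under stated assumptions: the torchvision ResNet-18 teacher, the target class index, and all hyperparameters are made up for the example and are not the paper's attack.

```python
# Hypothetical sketch of gradient-based model inversion run against a publicly
# available teacher model only -- the student model is never queried.
# The teacher (torchvision ResNet-18), the target class index, and all
# hyperparameters are illustrative assumptions, not the paper's actual attack.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, ResNet18_Weights

teacher = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1).eval()
for p in teacher.parameters():
    p.requires_grad_(False)            # teacher stays fixed; only the input is optimized

target_class = 207                     # hypothetical class label to reconstruct
x = torch.randn(1, 3, 224, 224, requires_grad=True)   # start from random noise
opt = torch.optim.Adam([x], lr=0.05)

for step in range(500):
    opt.zero_grad()
    logits = teacher(x)
    # Total-variation regularizer keeps the reconstruction smooth and recognizable.
    tv = (x[..., 1:, :] - x[..., :-1, :]).abs().mean() + \
         (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
    loss = F.cross_entropy(logits, torch.tensor([target_class])) + 1e-2 * tv
    loss.backward()
    opt.step()

reconstruction = x.detach()            # rough rendering of the target class's features
```

The usual intuition in the transfer learning setting is that a student fine-tuned from a teacher tends to reuse much of the teacher's feature extractor, which is why inverting the accessible teacher can reveal information relevant to the inaccessible student.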
Related papers
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action
Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Get a Model! Model Hijacking Attack Against Machine Learning Models [30.346469782056406]
We propose a new training-time attack against computer vision-based machine learning models, namely the model hijacking attack.
The adversary aims to hijack a target model to execute a different task without the model owner noticing.
Our evaluation shows that both of our model hijacking attacks achieve a high attack success rate, with a negligible drop in model utility.
arXiv Detail & Related papers (2021-11-08T11:30:50Z)
- Adversarial Transfer Attacks With Unknown Data and Class Overlap [19.901933940805684]
Current transfer attack research has an unrealistic advantage for the attacker.
We present the first study of transferring adversarial attacks that focuses on the data available to the attacker and the victim under imperfect settings.
This threat model is relevant to applications in medicine, malware, and others.
arXiv Detail & Related papers (2021-09-23T03:41:34Z)
- Manipulating SGD with Data Ordering Attacks [23.639512087220137]
We present a class of training-time attacks that require no changes to the underlying dataset or model architecture.
In particular, an attacker can disrupt the integrity and availability of a model by simply reordering training batches.
Attacks have a long-term impact in that they decrease model performance hundreds of epochs after the attack took place.
arXiv Detail & Related papers (2021-04-19T22:17:27Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable the adversarial examples produced by a source model are to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z)
- Adversarial Imitation Attack [63.76805962712481]
A practical adversarial attack should require as little knowledge of the attacked model as possible.
Current substitute attacks need pre-trained models to generate adversarial examples.
In this study, we propose a novel adversarial imitation attack.
arXiv Detail & Related papers (2020-03-28T10:02:49Z)
- DaST: Data-free Substitute Training for Adversarial Attacks [55.76371274622313]
We propose a data-free substitute training method (DaST) to obtain substitute models for adversarial black-box attacks.
To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models.
Experiments demonstrate that the substitute models can achieve competitive performance compared with the baseline models (a minimal sketch of this substitute-training loop appears after this list).
arXiv Detail & Related papers (2020-03-28T04:28:13Z)
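As referenced in the DaST entry above, the data-free substitute-training loop can be sketched roughly as follows. Everything here is a hypothetical stand-in: `query_victim`, the toy fully-connected architectures, the 28x28 input size, and the training schedule are assumptions for illustration, not the paper's GAN design.

```python
# Hypothetical sketch of data-free substitute training in the spirit of DaST:
# a generator produces synthetic inputs, the black-box victim labels them, and a
# substitute model is trained to imitate those labels. `query_victim`, the toy
# architectures, and the 28x28 input shape are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

generator = nn.Sequential(nn.Linear(100, 256), nn.ReLU(),
                          nn.Linear(256, 28 * 28), nn.Tanh())
substitute = nn.Sequential(nn.Linear(28 * 28, 128), nn.ReLU(),
                           nn.Linear(128, 10))
opt_s = torch.optim.Adam(substitute.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(generator.parameters(), lr=1e-3)

# Stand-in for the victim's prediction API: returns hard labels only.
victim = nn.Sequential(nn.Linear(28 * 28, 64), nn.ReLU(), nn.Linear(64, 10)).eval()

def query_victim(x):
    with torch.no_grad():
        return victim(x).argmax(dim=1)

for step in range(200):
    z = torch.randn(64, 100)

    # 1) Substitute step: match the victim's labels on synthetic samples.
    synthetic = generator(z).detach()
    labels = query_victim(synthetic)
    loss_s = F.cross_entropy(substitute(synthetic), labels)
    opt_s.zero_grad(); loss_s.backward(); opt_s.step()

    # 2) Generator step: seek samples on which the substitute still disagrees
    #    with the victim, so later substitute updates stay informative.
    synthetic = generator(z)
    labels = query_victim(synthetic.detach())
    loss_g = -F.cross_entropy(substitute(synthetic), labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()

# The trained substitute can then be attacked with white-box methods and the
# resulting adversarial examples transferred to the black-box victim.
```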
This list is automatically generated from the titles and abstracts of the papers on this site.