Get a Model! Model Hijacking Attack Against Machine Learning Models
- URL: http://arxiv.org/abs/2111.04394v1
- Date: Mon, 8 Nov 2021 11:30:50 GMT
- Title: Get a Model! Model Hijacking Attack Against Machine Learning Models
- Authors: Ahmed Salem, Michael Backes, Yang Zhang
- Abstract summary: We propose a new training time attack against computer vision-based machine learning models, namely the model hijacking attack.
The adversary aims to hijack a target model to execute a different task without the model owner noticing.
Our evaluation shows that both of our model hijacking attacks achieve a high attack success rate, with a negligible drop in model utility.
- Score: 30.346469782056406
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning (ML) has established itself as a cornerstone for various
critical applications ranging from autonomous driving to authentication
systems. However, with the increasing adoption of machine learning models,
multiple attacks have emerged. One class of such attacks is the training time
attack, whereby an adversary executes their attack before or during the
training of the machine learning model. In this work, we propose a new training
time attack against computer vision-based machine learning models, namely the
model hijacking attack. The adversary aims to hijack a target model to execute
a different task than its original one without the model owner noticing. Model
hijacking poses accountability and security risks, since the owner of a
hijacked model can be framed for offering illegal or unethical services. Model
hijacking attacks are launched in the same way as existing data
poisoning attacks. However, one requirement of the model hijacking attack is to
be stealthy, i.e., the data samples used to hijack the target model should look
similar to the model's original training dataset. To this end, we propose two
different model hijacking attacks, namely Chameleon and Adverse Chameleon,
based on a novel encoder-decoder style ML model, namely the Camouflager. Our
evaluation shows that both of our model hijacking attacks achieve a high attack
success rate, with a negligible drop in model utility.
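To make the idea concrete, the following is a minimal, illustrative PyTorch sketch of a Camouflager-style encoder-decoder that disguises a hijacking-task sample as a benign sample from the original training distribution. The architecture, layer sizes, loss weighting, and the fixed feature extractor `phi` are assumptions made for illustration; they are not taken from the paper.
```python
# Illustrative sketch (not the paper's exact Camouflager): an encoder-decoder
# that blends a hijacking-task sample with a benign original-task sample so the
# output visually resembles the benign one while retaining hijacking features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Camouflager(nn.Module):
    def __init__(self, channels=3, hidden=32):
        super().__init__()
        self.enc_hijack = nn.Sequential(  # encodes the hijacking-task sample
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU())
        self.enc_benign = nn.Sequential(  # encodes a benign original-task sample
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(     # fuses both encodings into one image
            nn.Conv2d(2 * hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1), nn.Sigmoid())

    def forward(self, x_hijack, x_benign):
        z = torch.cat([self.enc_hijack(x_hijack), self.enc_benign(x_benign)], dim=1)
        return self.decoder(z)

def camouflager_loss(x_cam, x_benign, x_hijack, phi, lam=1.0):
    """Visual loss pulls the camouflaged sample toward the benign one; a
    feature-space loss (under a fixed, hypothetical extractor `phi`) keeps it
    close to the hijacking sample so the hijacking task remains learnable."""
    visual = F.mse_loss(x_cam, x_benign)
    semantic = F.mse_loss(phi(x_cam), phi(x_hijack))
    return visual + lam * semantic
```
Under this sketch, the adversary would train the Camouflager offline, poison the target model's training set with camouflaged samples labeled according to a fixed mapping between hijacking and original labels, and later query the hijacked model with camouflaged inputs to execute the hijacking task.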
Related papers
- Vera Verto: Multimodal Hijacking Attack [22.69532868255637]
A recent attack in this domain is the model hijacking attack, whereby an adversary hijacks a victim model to implement their own hijacking tasks.
We transform the model hijacking attack into a more general multimodal setting, where the hijacking and original tasks are performed on data of different modalities.
Our attack achieves 94%, 94%, and 95% attack success rates when using the Sogou news dataset to hijack STL10, CIFAR-10, and MNIST.
arXiv Detail & Related papers (2024-07-31T19:37:06Z)
- Model for Peanuts: Hijacking ML Models without Training Access is Possible [5.005171792255858]
Model hijacking is an attack where an adversary aims to hijack a victim model to execute a different task than its original one.
We propose a simple approach for model hijacking at inference time named SnatchML to classify unknown input samples.
We also propose a novel approach we call meta-unlearning, designed to help the model unlearn a potentially malicious task while training on the original dataset.
arXiv Detail & Related papers (2024-06-03T18:04:37Z)
- SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models [74.58014281829946]
We analyze the effectiveness of several representative attacks/defenses, including model stealing attacks, membership inference attacks, and backdoor detection on public models.
Our evaluation empirically shows the performance of these attacks/defenses can vary significantly on public models compared to self-trained models.
arXiv Detail & Related papers (2023-10-19T11:49:22Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions that harm the benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Are You Stealing My Model? Sample Correlation for Fingerprinting Deep Neural Networks [86.55317144826179]
Previous methods typically leverage transferable adversarial examples as the model fingerprint.
We propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC).
SAC successfully defends against various model stealing attacks, even including adversarial training or transfer learning.
arXiv Detail & Related papers (2022-10-21T02:07:50Z)
- Model Inversion Attack against Transfer Learning: Inverting a Model without Accessing It [41.39995986856193]
Transfer learning is an important approach that produces pre-trained teacher models.
Recent research on transfer learning has found that it is vulnerable to various attacks.
It is still not clear whether transfer learning is vulnerable to model inversion attacks.
arXiv Detail & Related papers (2022-03-13T05:07:02Z)
- Manipulating SGD with Data Ordering Attacks [23.639512087220137]
We present a class of training-time attacks that require no changes to the underlying model, dataset, or architecture.
In particular, an attacker can disrupt the integrity and availability of a model by simply reordering training batches; a minimal sketch of this idea appears after this list of related papers.
Attacks have a long-term impact in that they decrease model performance hundreds of epochs after the attack took place.
arXiv Detail & Related papers (2021-04-19T22:17:27Z)
- Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
- Adversarial Imitation Attack [63.76805962712481]
A practical adversarial attack should require as little knowledge of the attacked models as possible.
Current substitute attacks need pre-trained models to generate adversarial examples.
In this study, we propose a novel adversarial imitation attack.
arXiv Detail & Related papers (2020-03-28T10:02:49Z)
- DaST: Data-free Substitute Training for Adversarial Attacks [55.76371274622313]
We propose a data-free substitute training method (DaST) to obtain substitute models for adversarial black-box attacks.
To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models.
Experiments demonstrate the substitute models can achieve competitive performance compared with the baseline models.
arXiv Detail & Related papers (2020-03-28T04:28:13Z)
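As referenced in the data ordering attack entry above, here is a minimal, hedged sketch of a batch-reordering attack: instead of randomly shuffling, the attacker feeds training batches in an order chosen by a surrogate criterion. Ordering by per-sample loss under a surrogate model is an illustrative policy, not necessarily the exact one used in that paper.
```python
# Illustrative batch-reordering attack on SGD: no data point is modified;
# only the order in which batches are presented to the trainer changes.
import torch

def reorder_by_surrogate_loss(dataset, surrogate_model, loss_fn,
                              batch_size=64, descending=True):
    """Return index batches sorted by per-sample loss under a surrogate model,
    replacing the usual random shuffle. `dataset[i]` is assumed to yield an
    (input tensor, integer label) pair."""
    surrogate_model.eval()
    losses = []
    with torch.no_grad():
        for i in range(len(dataset)):
            x, y = dataset[i]
            logits = surrogate_model(x.unsqueeze(0))
            losses.append(loss_fn(logits, torch.tensor([y])).item())
    order = sorted(range(len(dataset)), key=lambda i: losses[i], reverse=descending)
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]
```
The returned index batches can be passed to a torch.utils.data.DataLoader through its batch_sampler argument, so every epoch replays the same skewed ordering instead of a fresh shuffle.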
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.