DaST: Data-free Substitute Training for Adversarial Attacks
- URL: http://arxiv.org/abs/2003.12703v2
- Date: Tue, 31 Mar 2020 15:25:06 GMT
- Title: DaST: Data-free Substitute Training for Adversarial Attacks
- Authors: Mingyi Zhou, Jing Wu, Yipeng Liu, Shuaicheng Liu, Ce Zhu
- Abstract summary: We propose a data-free substitute training method (DaST) to obtain substitute models for adversarial black-box attacks.
To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models.
Experiments demonstrate the substitute models can achieve competitive performance compared with the baseline models.
- Score: 55.76371274622313
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are vulnerable to adversarial examples. For the
black-box setting, current substitute attacks need pre-trained models to
generate adversarial examples. However, pre-trained models are hard to obtain
in real-world tasks. In this paper, we propose a data-free substitute training
method (DaST) to obtain substitute models for adversarial black-box attacks
without the requirement of any real data. To achieve this, DaST utilizes
specially designed generative adversarial networks (GANs) to train the
substitute models. In particular, we design a multi-branch architecture and
label-control loss for the generative model to deal with the uneven
distribution of synthetic samples. The substitute model is then trained by the
synthetic samples generated by the generative model, which are labeled by the
attacked model subsequently. The experiments demonstrate the substitute models
produced by DaST can achieve competitive performance compared with the baseline
models which are trained by the same train set with attacked models.
Additionally, to evaluate the practicability of the proposed method on the
real-world task, we attack an online machine learning model on the Microsoft
Azure platform. The remote model misclassifies 98.35% of the adversarial
examples crafted by our method. To the best of our knowledge, we are the first
to train a substitute model for adversarial attacks without any real data.
Related papers
- Downstream Transfer Attack: Adversarial Attacks on Downstream Models with Pre-trained Vision Transformers [95.22517830759193]
This paper studies the transferability of such an adversarial vulnerability from a pre-trained ViT model to downstream tasks.
We show that DTA achieves an average attack success rate (ASR) exceeding 90%, surpassing existing methods by a huge margin.
arXiv Detail & Related papers (2024-08-03T08:07:03Z) - OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable
Evasion Attacks [17.584752814352502]
Evasion Attacks (EA) are used to test the robustness of trained neural networks by distorting input data.
We introduce a self-supervised, computationally economical method for generating adversarial examples.
Our experiments consistently demonstrate the method is effective across various models, unseen data categories, and even defended models.
arXiv Detail & Related papers (2023-10-05T17:34:47Z) - Scalable Membership Inference Attacks via Quantile Regression [35.33158339354343]
Membership inference attacks are designed to determine, using black box access to trained models, whether a particular example was used in training or not.
We introduce a new class of attacks based on performing quantile regression on the distribution of confidence scores induced by the model under attack on points that are not used in training.
arXiv Detail & Related papers (2023-07-07T16:07:00Z) - DST: Dynamic Substitute Training for Data-free Black-box Attack [79.61601742693713]
We propose a novel dynamic substitute training attack method to encourage substitute model to learn better and faster from the target model.
We introduce a task-driven graph-based structure information learning constrain to improve the quality of generated training data.
arXiv Detail & Related papers (2022-04-03T02:29:11Z) - MEGA: Model Stealing via Collaborative Generator-Substitute Networks [4.065949099860426]
Recent data-free model stealingmethods are shown effective to extract the knowledge of thetarget model without using real query examples.
We propose a data-free model stealing frame-work,MEGA, which is based on collaborative generator-substitute networks.
Our results show that theaccuracy of our trained substitute model and the adversarialattack success rate over it can be up to 33% and 40% higherthan state-of-the-art data-free black-box attacks.
arXiv Detail & Related papers (2022-01-31T09:34:28Z) - Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective substitute training that focuses on designing the distribution of data used in the knowledge stealing process.
The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z) - Learning to Attack: Towards Textual Adversarial Attacking in Real-world
Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z) - Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer
Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable are the adversarial examples produced by a source model to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z) - Data-Free Adversarial Perturbations for Practical Black-Box Attack [25.44755251319056]
We present a data-free method for crafting adversarial perturbations that can fool a target model without any knowledge about the training data distribution.
Our method empirically shows that current deep learning models are still at risk even when the attackers do not have access to training data.
arXiv Detail & Related papers (2020-03-03T02:22:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.