Black-box Adversarial Sample Generation Based on Differential Evolution
- URL: http://arxiv.org/abs/2007.15310v1
- Date: Thu, 30 Jul 2020 08:43:45 GMT
- Title: Black-box Adversarial Sample Generation Based on Differential Evolution
- Authors: Junyu Lin, Lei Xu, Yingqi Liu, Xiangyu Zhang
- Abstract summary: We propose a black-box technique to test the robustness of Deep Neural Networks (DNNs)
The technique does not require any knowledge of the structure or weights of the target DNN.
Experimental results show that our technique can achieve 100% success in generating adversarial samples to trigger misclassification.
- Score: 18.82850158275813
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) are being used in various daily tasks such as
object detection, speech processing, and machine translation. However, it is
known that DNNs suffer from robustness problems -- perturbed inputs, called
adversarial samples, can lead to misbehaviors of DNNs. In this paper, we propose
a black-box technique called Black-box Momentum Iterative Fast Gradient Sign
Method (BMI-FGSM) to test the robustness of DNN models. The technique does not
require any knowledge of the structure or weights of the target DNN. Compared
to existing white-box testing techniques that require accessing model internal
information such as gradients, our technique approximates gradients through
Differential Evolution and uses approximated gradients to construct adversarial
samples. Experimental results show that our technique can achieve 100% success
in generating adversarial samples to trigger misclassification, and over 95%
success in generating samples to trigger misclassification to a specific target
output label. It also achieves smaller perturbation distances and better
transferability. Compared to the state-of-the-art black-box technique, our
technique is more efficient. Furthermore, we conduct testing on the commercial
Aliyun API and successfully trigger its misbehavior within a limited number of
queries, demonstrating the feasibility of real-world black-box attacks.
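
At its core, BMI-FGSM replaces the white-box gradient in momentum-iterative FGSM with a sign vector found by Differential Evolution using only output queries. Below is a minimal NumPy sketch of that idea; the toy scoring oracle, DE hyperparameters, and all names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the black-box target: the attacker can only query
# output scores, never read the weights (hidden behind this closure).
_w = rng.normal(size=64)
def true_label_score(x):
    """Confidence the 'model' assigns to the true label (lower = attack progress)."""
    return 1.0 / (1.0 + np.exp(-_w @ x))

def de_sign_estimate(x, pop_size=10, generations=15, probe=0.05, F=0.5, CR=0.7):
    """Approximate sign(gradient) with Differential Evolution.

    Each individual is a real vector whose sign is a candidate perturbation
    direction; fitness is the true-label score after probing in that direction,
    so surviving individuals increasingly agree with the loss gradient's sign.
    """
    d = x.size
    population = rng.normal(size=(pop_size, d))
    fitness = np.array([true_label_score(x + probe * np.sign(p)) for p in population])
    for _ in range(generations):
        for i in range(pop_size):
            idx = rng.choice([j for j in range(pop_size) if j != i], 3, replace=False)
            a, b, c = population[idx]
            mutant = a + F * (b - c)                    # DE/rand/1 mutation
            mask = rng.random(d) < CR                   # binomial crossover
            trial = np.where(mask, mutant, population[i])
            f = true_label_score(x + probe * np.sign(trial))
            if f < fitness[i]:                          # lower score = fitter
                population[i], fitness[i] = trial, f
    return np.sign(population[np.argmin(fitness)])

def bmi_fgsm(x, iters=10, eps=0.3, mu=1.0):
    """Momentum-iterative FGSM driven by DE-estimated gradient signs."""
    alpha, g, adv = eps / iters, np.zeros_like(x), x.copy()
    for _ in range(iters):
        s = de_sign_estimate(adv)
        g = mu * g + s / (np.abs(s).sum() + 1e-12)      # MI-FGSM momentum term
        adv = np.clip(adv + alpha * np.sign(g), x - eps, x + eps)
    return adv

x0 = rng.normal(size=64)
x_adv = bmi_fgsm(x0)
print(f"true-label score: {true_label_score(x0):.3f} -> {true_label_score(x_adv):.3f}")
```

The momentum accumulator and the clip to an eps-ball follow the standard MI-FGSM recipe; only the sign estimation is query-based.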
Related papers
- Microbial Genetic Algorithm-based Black-box Attack against Interpretable
Deep Learning Systems [16.13790238416691]
In white-box environments, interpretable deep learning systems (IDLSes) have been shown to be vulnerable to malicious manipulations.
We propose a Query-efficient Score-based black-box attack against IDLSes, QuScore, which requires no knowledge of the target model or its coupled interpretation model.
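As a rough picture of the microbial-GA mechanic such score-based attacks build on (a sketch under assumed names, not QuScore's actual algorithm): each round, two candidate perturbations are compared by querying the model, and the loser copies genes from the winner and mutates.

```python
import numpy as np

rng = np.random.default_rng(1)

def microbial_ga_attack(score_fn, x, eps=0.1, pop_size=20, rounds=500,
                        cross_rate=0.5, mut_rate=0.05):
    """Microbial GA over perturbations: in each round two individuals meet,
    the loser copies genes from the winner and mutates; only score queries
    (no gradients) are ever used."""
    d = x.size
    pop = rng.uniform(-eps, eps, size=(pop_size, d))      # perturbation genomes
    for _ in range(rounds):
        i, j = rng.choice(pop_size, 2, replace=False)
        # Lower true-label score = better attack = winner of the tournament.
        win, lose = (i, j) if score_fn(x + pop[i]) < score_fn(x + pop[j]) else (j, i)
        cross = rng.random(d) < cross_rate                # infection from winner
        pop[lose][cross] = pop[win][cross]
        mut = rng.random(d) < mut_rate                    # pointwise mutation
        pop[lose][mut] = rng.uniform(-eps, eps, size=mut.sum())
    best = min(pop, key=lambda p: score_fn(x + p))
    return x + best

# Toy oracle for demonstration (assumption; stands in for the target model).
w = rng.normal(size=32)
score_fn = lambda z: float(1 / (1 + np.exp(-w @ z)))
x = rng.normal(size=32)
x_adv = microbial_ga_attack(score_fn, x)
print(score_fn(x), "->", score_fn(x_adv))
```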
arXiv Detail & Related papers (2023-07-13T00:08:52Z)
- General Adversarial Defense Against Black-box Attacks via Pixel Level and Feature Level Distribution Alignments [75.58342268895564]
We use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap.
The trained DGNs align the distribution of adversarial samples with clean ones for the target DNNs by translating pixel values.
Our strategy demonstrates its unique effectiveness and generality against black-box attacks.
arXiv Detail & Related papers (2022-12-11T01:51:31Z)
- Instance Attack: An Explanation-based Vulnerability Analysis Framework Against DNNs for Malware Detection [0.0]
We propose the notion of the instance-based attack.
Our scheme is interpretable, operates in black-box settings, and its results can be validated with domain knowledge.
arXiv Detail & Related papers (2022-09-06T12:41:20Z)
- Boosting Black-Box Adversarial Attacks with Meta Learning [0.0]
We propose a hybrid attack method which trains meta adversarial perturbations (MAPs) on surrogate models and performs black-box attacks by estimating gradients of the models.
Our method not only improves the attack success rates but also reduces the number of queries compared to other methods.
arXiv Detail & Related papers (2022-03-28T09:32:48Z)
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective [74.47093382436823]
We address the problem of black-box defense: How to robustify a black-box model using just input queries and output feedback?
We propose a general notion of defensive operation that can be applied to black-box models, and design it through the lens of denoised smoothing (DS).
We empirically show that ZO-AE-DS can achieve improved accuracy, certified robustness, and query complexity over existing baselines.
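Denoised smoothing itself is easy to picture: prepend a denoiser to the untouched black-box classifier and majority-vote its label over Gaussian-noised copies of the input. A schematic sketch, with the denoiser and classifier as assumed placeholders:

```python
import numpy as np

rng = np.random.default_rng(2)

def smoothed_predict(classify, denoise, x, sigma=0.25, n_samples=100):
    """Denoised smoothing: vote the black-box classifier's label over
    denoised Gaussian-noised copies of x. `classify` needs only to return
    a label for an input; its internals stay opaque."""
    votes = {}
    for _ in range(n_samples):
        noisy = x + sigma * rng.normal(size=x.shape)
        label = classify(denoise(noisy))
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Placeholders (assumptions) standing in for a trained denoiser and a
# remote black-box classifier:
denoise = lambda z: np.clip(z, 0.0, 1.0)          # trivial "denoiser"
classify = lambda z: int(z.mean() > 0.5)          # trivial "classifier"
x = rng.uniform(size=(8, 8))
print(smoothed_predict(classify, denoise, x))
```

In ZO-AE-DS the denoiser is trained with zeroth-order optimization, since gradients of the black-box classifier are unavailable.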
arXiv Detail & Related papers (2022-03-27T03:23:32Z)
- Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based Prior [50.393092185611536]
We consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model.
Previous methods attempted to approximate the true gradient either by using the transfer gradient of a surrogate white-box model or from the feedback of model queries.
We propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging.
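A simplified sketch of the biased-sampling idea (assumed names; the actual PRGF algorithms also tune the prior weight from query feedback): probe directions mix a transfer-gradient prior with fresh random noise, and finite differences along them are averaged into a gradient estimate.

```python
import numpy as np

rng = np.random.default_rng(3)

def prgf_gradient(loss_fn, x, prior, n_queries=20, sigma=1e-3, lam=0.5):
    """Prior-guided random gradient-free estimate (simplified).

    Each probe direction mixes the transfer-gradient prior with fresh
    Gaussian noise (biased sampling); forward finite differences along
    those directions are averaged into a gradient estimate. The papers'
    algorithms additionally adapt lam from query feedback; here it is fixed.
    """
    p = prior / (np.linalg.norm(prior) + 1e-12)
    base = loss_fn(x)
    g_hat = np.zeros_like(x)
    for _ in range(n_queries):
        xi = rng.normal(size=x.shape)
        xi /= np.linalg.norm(xi)
        u = np.sqrt(lam) * p + np.sqrt(1.0 - lam) * xi     # biased direction
        u /= np.linalg.norm(u)
        g_hat += (loss_fn(x + sigma * u) - base) / sigma * u
    return g_hat / n_queries

# Toy check (assumption): quadratic loss whose true gradient is known.
A = rng.normal(size=(16, 16))
loss_fn = lambda z: float(0.5 * z @ (A.T @ A) @ z)
x = rng.normal(size=16)
true_grad = (A.T @ A) @ x
est = prgf_gradient(loss_fn, x, prior=true_grad + rng.normal(size=16))
print("cosine:", est @ true_grad / (np.linalg.norm(est) * np.linalg.norm(true_grad)))
```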
arXiv Detail & Related papers (2022-03-13T04:06:27Z)
- Improving Query Efficiency of Black-box Adversarial Attack [75.71530208862319]
We propose a Neural Process based black-box adversarial attack (NP-Attack).
NP-Attack could greatly decrease the query counts under the black-box setting.
arXiv Detail & Related papers (2020-09-24T06:22:56Z)
- Perturbing Across the Feature Hierarchy to Improve Standard and Strict Blackbox Attack Transferability [100.91186458516941]
We consider the blackbox transfer-based targeted adversarial attack threat model in the realm of deep neural network (DNN) image classifiers.
We design a flexible attack framework that allows for multi-layer perturbations and demonstrates state-of-the-art targeted transfer performance.
We analyze why the proposed methods outperform existing attack strategies and show an extension of the method in the case when limited queries to the blackbox model are allowed.
arXiv Detail & Related papers (2020-04-29T16:00:13Z)
- Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent [92.4348499398224]
Black-box adversarial attack methods have received special attention owing to their practicality and simplicity.
We propose a zeroth-order natural gradient descent (ZO-NGD) method to design the adversarial attacks.
ZO-NGD can obtain significantly lower model query complexities compared with state-of-the-art attack methods.
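Schematically, such a method estimates the gradient with zeroth-order finite differences and preconditions the step with a Fisher-like matrix built from the same estimates. The sketch below shows only this general shape under assumed names; ZO-NGD's actual Fisher estimation differs.

```python
import numpy as np

rng = np.random.default_rng(4)

def zo_ngd_step(loss_fn, x, lr=0.01, n_dirs=30, mu=1e-3, damping=0.1):
    """One zeroth-order natural-gradient step (schematic, not ZO-NGD proper).

    The gradient is estimated from finite differences along random
    directions; a Fisher-like matrix is approximated from the same
    per-direction estimates, and its damped inverse preconditions the update.
    """
    d = x.size
    base = loss_fn(x)
    samples = []
    for _ in range(n_dirs):
        u = rng.normal(size=d)
        g_u = (loss_fn(x + mu * u) - base) / mu * u   # per-direction estimate
        samples.append(g_u)
    G = np.stack(samples)
    g_hat = G.mean(axis=0)
    F_hat = G.T @ G / n_dirs + damping * np.eye(d)    # Fisher-like matrix
    return x - lr * np.linalg.solve(F_hat, g_hat)     # preconditioned step

# Toy check (assumption): minimize an ill-conditioned quadratic.
scales = np.array([1.0, 4.0, 9.0, 16.0])
loss_fn = lambda z: float(0.5 * np.sum(scales * z ** 2))
x = rng.normal(size=4)
print("initial loss:", loss_fn(x))
for _ in range(100):
    x = zo_ngd_step(loss_fn, x)
print("final loss:", loss_fn(x))
```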
arXiv Detail & Related papers (2020-02-18T21:48:54Z)
- REST: Performance Improvement of a Black Box Model via RL-based Spatial Transformation [15.691668909002892]
We study robustness to geometric transformations in a specific condition where the black-box image classifier is given.
We propose an additional learner, REinforcement Spatial Transform (REST), that transforms the warped input data into samples regarded as in-distribution by the black-box models.
arXiv Detail & Related papers (2020-02-16T16:15:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.