Microbial Genetic Algorithm-based Black-box Attack against Interpretable
Deep Learning Systems
- URL: http://arxiv.org/abs/2307.06496v1
- Date: Thu, 13 Jul 2023 00:08:52 GMT
- Title: Microbial Genetic Algorithm-based Black-box Attack against Interpretable
Deep Learning Systems
- Authors: Eldor Abdukhamidov, Mohammed Abuhamad, Simon S. Woo, Eric Chan-Tin,
Tamer Abuhmed
- Abstract summary: In white-box environments, interpretable deep learning systems (IDLSes) have been shown to be vulnerable to malicious manipulations.
We propose QuScore, a query-efficient score-based black-box attack against IDLSes that requires no knowledge of the target model or its coupled interpretation model.
- Score: 16.13790238416691
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models are susceptible to adversarial samples in both
white-box and black-box environments. Although previous studies have shown high attack
success rates, coupling DNN models with interpretation models could offer a
sense of security when a human expert is involved, who can identify whether a
given sample is benign or malicious. However, in white-box environments,
interpretable deep learning systems (IDLSes) have been shown to be vulnerable
to malicious manipulations. In black-box settings, as access to the components
of IDLSes is limited, it becomes more challenging for the adversary to fool the
system. In this work, we propose QuScore, a query-efficient score-based black-box
attack against IDLSes that requires no knowledge of the target model or its coupled
interpretation model. QuScore combines transfer-based and score-based methods with
an effective microbial genetic algorithm (see the sketch below). Our
method is designed to reduce the number of queries necessary to carry out
successful attacks, resulting in a more efficient process. By continuously
refining the adversarial samples created based on feedback scores from the
IDLS, our approach effectively navigates the search space to identify
perturbations that can fool the system. We evaluate the attack's effectiveness
on four CNN models (Inception, ResNet, VGG, DenseNet) and two interpretation
models (CAM, Grad), using both the ImageNet and CIFAR datasets. Our results show
that the proposed approach is query-efficient, with attack success rates between
95% and 100% and transferability with an average success rate of 69% on the
ImageNet and CIFAR datasets. Our attack method generates
adversarial examples with attribution maps that resemble benign samples. We
have also demonstrated that our attack is resilient against various
preprocessing defense techniques and can easily be transferred to different DNN
models.
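The paper's code is not reproduced here, but the core loop of a microbial genetic algorithm (MGA) attack can be sketched as follows: two randomly chosen genomes compete, and the loser is partially overwritten by the winner's genes (crossover) and then mutated. In this sketch a genome is an L-infinity-bounded perturbation, and the `query_score` oracle, the model interface, and all parameters are illustrative assumptions, not QuScore's actual implementation.

```python
import numpy as np

def query_score(model, image):
    """Hypothetical score oracle (an assumption, not the paper's API):
    the target model's confidence that the perturbed image keeps its
    original, benign label. Lower is better for the attacker."""
    return model(image).max()

def microbial_ga_attack(model, image, eps=0.05, pop_size=10,
                        query_budget=1000, mutation_rate=0.1, seed=0):
    """Minimal sketch of a microbial-GA black-box attack (not QuScore
    itself). A genome is a perturbation confined to an L-inf ball of
    radius eps; the image is assumed to be a float array in [0, 1]."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-eps, eps, size=(pop_size, *image.shape))
    fitness = np.array([query_score(model, np.clip(image + p, 0, 1))
                        for p in pop])
    queries = pop_size
    while queries < query_budget:
        i, j = rng.choice(pop_size, size=2, replace=False)
        # Tournament: lower benign-label confidence = fitter attacker.
        win, lose = (i, j) if fitness[i] < fitness[j] else (j, i)
        # Microbial crossover: the loser inherits ~half the winner's genes.
        cross = rng.random(image.shape) < 0.5
        pop[lose][cross] = pop[win][cross]
        # Mutation: re-randomize a small fraction of the loser's genes.
        mut = rng.random(image.shape) < mutation_rate
        pop[lose][mut] = rng.uniform(-eps, eps, size=image.shape)[mut]
        fitness[lose] = query_score(model, np.clip(image + pop[lose], 0, 1))
        queries += 1
    return np.clip(image + pop[fitness.argmin()], 0, 1)
```

Because each tournament replaces only the loser, the best genome found so far is never destroyed, and each iteration costs exactly one model query; both properties matter under a tight query budget.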
Related papers
- BruSLeAttack: A Query-Efficient Score-Based Black-Box Sparse Adversarial Attack [22.408968332454062]
We study the unique, less well-understood problem of generating sparse adversarial samples simply by observing the score-based replies to model queries.
We develop BruSLeAttack, a new, faster (more query-efficient) algorithm for the problem.
Our work facilitates faster evaluation of model vulnerabilities and raises vigilance about the safety, security, and reliability of deployed systems.
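BruSLeAttack's own search procedure is more sophisticated than what this summary describes; the sketch below only illustrates the problem setting, perturbing as few pixels as possible while observing returned scores, via a naive greedy baseline. The `score_fn` oracle, image layout, and parameters are assumptions.

```python
import numpy as np

def sparse_score_attack(score_fn, image, query_budget=2000, seed=0):
    """Greedy baseline for score-based sparse attacks, NOT the
    BruSLeAttack algorithm. `score_fn(x)` is assumed to return the
    model's confidence in the benign label; the image is an (H, W, C)
    float array in [0, 1]. Each step pushes one pixel to an extreme
    value and keeps the change only if the score drops."""
    rng = np.random.default_rng(seed)
    adv = image.copy()
    best = score_fn(adv)
    h, w = image.shape[:2]
    for _ in range(query_budget):
        y, x = rng.integers(h), rng.integers(w)
        old = adv[y, x].copy()
        adv[y, x] = float(rng.integers(0, 2))  # set pixel to 0.0 or 1.0
        score = score_fn(adv)
        if score < best:
            best = score          # keep the flip: the score dropped
        else:
            adv[y, x] = old       # revert: no improvement
    return adv
```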
arXiv Detail & Related papers (2024-04-08T08:59:26Z)
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Instance Attack: An Explanation-based Vulnerability Analysis Framework Against DNNs for Malware Detection [0.0]
We propose the notion of an instance-based attack.
Our scheme is interpretable, operates in black-box settings, and produces results that can be validated with domain knowledge.
arXiv Detail & Related papers (2022-09-06T12:41:20Z)
- Attackar: Attack of the Evolutionary Adversary [0.0]
This paper introduces Attackar, an evolutionary, score-based, black-box attack.
Attackar is based on a novel objective function that can be used in gradient-free optimization problems.
Our results demonstrate the superior performance of Attackar, both in terms of accuracy score and query efficiency.
arXiv Detail & Related papers (2022-08-17T13:57:23Z)
- Practical No-box Adversarial Attacks with Training-free Hybrid Image Transformation [123.33816363589506]
We show the existence of a training-free adversarial perturbation under the no-box threat model.
Motivated by our observation that the high-frequency component (HFC) dominates in low-level features, we attack an image mainly by manipulating its frequency components.
Our method is even competitive with mainstream transfer-based black-box attacks.
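The paper's hybrid image transformation itself is not reproduced here; the sketch below only illustrates the underlying idea of perturbing an image through its frequency spectrum. The grayscale-in-[0, 1] assumption, the radial cutoff, and the gain are all illustrative choices.

```python
import numpy as np

def boost_high_frequencies(image, cutoff=0.25, gain=1.5):
    """Toy frequency-domain perturbation, NOT the paper's HIT method:
    amplify spectral components beyond a radial cutoff. Assumes a 2-D
    grayscale image with values in [0, 1]."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    h, w = image.shape
    yy, xx = np.mgrid[0:h, 0:w]
    # Normalized distance from the spectrum center (0 = DC component).
    radius = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    scale = np.where(radius > cutoff, gain, 1.0)  # boost high frequencies
    out = np.fft.ifft2(np.fft.ifftshift(spectrum * scale)).real
    return np.clip(out, 0.0, 1.0)
```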
arXiv Detail & Related papers (2022-03-09T09:51:00Z)
- Mitigating the Impact of Adversarial Attacks in Very Deep Networks [10.555822166916705]
Deep Neural Network (DNN) models have vulnerabilities related to security concerns.
Data-poisoning-enabled perturbation attacks are complex adversarial attacks that inject false data into models.
We propose an attack-agnostic defense method for mitigating their influence.
arXiv Detail & Related papers (2020-12-08T21:25:44Z)
- Improving Query Efficiency of Black-box Adversarial Attack [75.71530208862319]
We propose a Neural Process-based black-box adversarial attack (NP-Attack).
NP-Attack could greatly decrease the query counts under the black-box setting.
arXiv Detail & Related papers (2020-09-24T06:22:56Z)
- Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks.
We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model.
To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable the adversarial examples produced by a source model are to a target model.
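The paper's exact metric is not given in this summary; the sketch below shows only the plain transfer success rate, a common baseline for this kind of measurement, with an assumed classifier interface.

```python
import numpy as np

def transfer_success_rate(adv_inputs, true_labels, target_model):
    """Plain transfer success rate: the fraction of adversarial examples
    crafted on a source model that also fool the target model. The
    `target_model` interface (returning per-class scores of shape
    (n, num_classes)) is an assumption; this baseline is not the
    paper's proposed metric."""
    preds = np.asarray(target_model(adv_inputs)).argmax(axis=1)
    return float((preds != np.asarray(true_labels)).mean())
```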
arXiv Detail & Related papers (2020-08-25T15:04:32Z)
- Black-box Adversarial Sample Generation Based on Differential Evolution [18.82850158275813]
We propose a black-box technique to test the robustness of Deep Neural Networks (DNNs).
The technique does not require any knowledge of the structure or weights of the target DNN.
Experimental results show that our technique can achieve 100% success in generating adversarial samples to trigger misclassification.
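The entry names differential evolution (DE) as the search engine; a minimal DE/rand/1 loop for black-box perturbation search might look like the following, where the `score_fn` oracle and all parameter values are assumptions, not the paper's configuration.

```python
import numpy as np

def de_attack(score_fn, image, eps=0.05, pop_size=20, steps=100,
              F=0.5, CR=0.9, seed=0):
    """Sketch of a differential-evolution black-box attack. Each candidate
    is an L-inf-bounded perturbation; `score_fn(x)` returns the model's
    confidence in the benign label (lower = better for the attacker)."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-eps, eps, size=(pop_size, *image.shape))
    fit = np.array([score_fn(np.clip(image + p, 0, 1)) for p in pop])
    for _ in range(steps):
        for i in range(pop_size):
            # Classic DE/rand/1 mutation from three distinct peers.
            a, b, c = rng.choice([k for k in range(pop_size) if k != i],
                                 size=3, replace=False)
            mutant = np.clip(pop[a] + F * (pop[b] - pop[c]), -eps, eps)
            # Binomial crossover between the current vector and the mutant.
            trial = np.where(rng.random(image.shape) < CR, mutant, pop[i])
            s = score_fn(np.clip(image + trial, 0, 1))
            if s < fit[i]:             # greedy selection
                pop[i], fit[i] = trial, s
    return np.clip(image + pop[fit.argmin()], 0, 1)
```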
arXiv Detail & Related papers (2020-07-30T08:43:45Z)
- BERT-ATTACK: Adversarial Attack Against BERT Using BERT [77.82947768158132]
Adversarial attacks on discrete data (such as text) are more challenging than on continuous data (such as images).
We propose BERT-Attack, a high-quality and effective method to generate adversarial samples.
Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage.
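BERT-Attack's actual pipeline ranks vulnerable words and works at the sub-word level; as a heavily simplified illustration of using a masked language model to propose substitutions, one might write the following, where `victim_label` is a hypothetical black-box classifier returning an integer label.

```python
from transformers import pipeline

# Real Hugging Face fill-mask pipeline; the victim classifier is assumed.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def mlm_substitution_attack(sentence, victim_label, top_k=5):
    """Simplified word-substitution sketch, NOT the BERT-Attack pipeline:
    mask each word in turn, let BERT propose replacements, and return
    the first substitution that flips the victim's prediction."""
    original = victim_label(sentence)
    words = sentence.split()
    for i in range(len(words)):
        masked = " ".join(words[:i] + [fill_mask.tokenizer.mask_token]
                          + words[i + 1:])
        for cand in fill_mask(masked, top_k=top_k):
            adv = " ".join(words[:i] + [cand["token_str"]] + words[i + 1:])
            if victim_label(adv) != original:
                return adv   # adversarial sample found
    return None              # no single substitution fooled the victim
```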
arXiv Detail & Related papers (2020-04-21T13:30:02Z)