Universal Distributional Decision-based Black-box Adversarial Attack
with Reinforcement Learning
- URL: http://arxiv.org/abs/2211.08384v1
- Date: Tue, 15 Nov 2022 18:30:18 GMT
- Title: Universal Distributional Decision-based Black-box Adversarial Attack
with Reinforcement Learning
- Authors: Yiran Huang, Yexu Zhou, Michael Hefenbrock, Till Riedel, Likun Fang,
Michael Beigl
- Abstract summary: We propose a pixel-wise decision-based attack algorithm that finds a distribution of adversarial perturbation through a reinforcement learning algorithm.
Experiments show that the proposed approach outperforms state-of-the-art decision-based attacks with a higher attack success rate and greater transferability.
- Score: 5.240772699480865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The vulnerability of the high-performance machine learning models implies a
security risk in applications with real-world consequences. Research on
adversarial attacks is beneficial in guiding the development of machine
learning models on the one hand and finding targeted defenses on the other.
However, most of the adversarial attacks today leverage the gradient or logit
information from the models to generate adversarial perturbation. Works in the
more realistic domain: decision-based attacks, which generate adversarial
perturbation solely based on observing the output label of the targeted model,
are still relatively rare and mostly use gradient-estimation strategies. In
this work, we propose a pixel-wise decision-based attack algorithm that finds a
distribution of adversarial perturbation through a reinforcement learning
algorithm. We call this method Decision-based Black-box Attack with
Reinforcement learning (DBAR). Experiments show that the proposed approach
outperforms state-of-the-art decision-based attacks with a higher attack
success rate and greater transferability.
Related papers
- Indiscriminate Disruption of Conditional Inference on Multivariate Gaussians [60.22542847840578]
Despite advances in adversarial machine learning, inference for Gaussian models in the presence of an adversary is notably understudied.
We consider a self-interested attacker who wishes to disrupt a decisionmaker's conditional inference and subsequent actions by corrupting a set of evidentiary variables.
To avoid detection, the attacker also desires the attack to appear plausible wherein plausibility is determined by the density of the corrupted evidence.
arXiv Detail & Related papers (2024-11-21T17:46:55Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - Defense Against Model Extraction Attacks on Recommender Systems [53.127820987326295]
We introduce Gradient-based Ranking Optimization (GRO) to defend against model extraction attacks on recommender systems.
GRO aims to minimize the loss of the protected target model while maximizing the loss of the attacker's surrogate model.
Results show GRO's superior effectiveness in defending against model extraction attacks.
arXiv Detail & Related papers (2023-10-25T03:30:42Z) - Enhancing Adversarial Attacks: The Similar Target Method [6.293148047652131]
adversarial examples pose a threat to deep neural networks' applications.
Deep neural networks are vulnerable to adversarial examples, posing a threat to the models' applications and raising security concerns.
We propose a similar targeted attack method named Similar Target(ST)
arXiv Detail & Related papers (2023-08-21T14:16:36Z) - Decision-BADGE: Decision-based Adversarial Batch Attack with Directional
Gradient Estimation [0.0]
Decision-BADGE is a novel method to craft universal adversarial perturbations for executing decision-based black-box attacks.
Our proposed method shows a superior success rate with less training time.
The research also shows that Decision-BADGE can successfully deceive unseen victim models and accurately target specific classes.
arXiv Detail & Related papers (2023-03-09T01:42:43Z) - Resisting Deep Learning Models Against Adversarial Attack
Transferability via Feature Randomization [17.756085566366167]
We propose a feature randomization-based approach that resists eight adversarial attacks targeting deep learning models.
Our methodology can secure the target network and resists adversarial attack transferability by over 60%.
arXiv Detail & Related papers (2022-09-11T20:14:12Z) - Resisting Adversarial Attacks in Deep Neural Networks using Diverse
Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experimentations using standard image classification datasets, namely MNIST, CIFAR-10 and CIFAR-100 against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z) - Learning to Learn Transferable Attack [77.67399621530052]
Transfer adversarial attack is a non-trivial black-box adversarial attack that aims to craft adversarial perturbations on the surrogate model and then apply such perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on the widely-used dataset demonstrate the effectiveness of our attack method with a 12.85% higher success rate of transfer attack compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the in adversarial attacks parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the ability of the learned when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective substitute training that focuses on designing the distribution of data used in the knowledge stealing process.
The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z) - Poisoning Attacks on Algorithmic Fairness [14.213638219685656]
We introduce an optimization framework for poisoning attacks against algorithmic fairness.
We develop a gradient-based poisoning attack aimed at introducing classification disparities among different groups in the data.
We believe that our findings pave the way towards the definition of an entirely novel set of adversarial attacks targeting algorithmic fairness in different scenarios.
arXiv Detail & Related papers (2020-04-15T08:07:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.