Meta-Learning the Search Distribution of Black-Box Random Search Based
Adversarial Attacks
- URL: http://arxiv.org/abs/2111.01714v1
- Date: Tue, 2 Nov 2021 16:28:08 GMT
- Title: Meta-Learning the Search Distribution of Black-Box Random Search Based
Adversarial Attacks
- Authors: Maksym Yatsura, Jan Hendrik Metzen, Matthias Hein
- Abstract summary: Adversarial attacks based on randomized search schemes have obtained state-of-the-art results in black-box robustness evaluation.
We study how this issue can be addressed by adapting the proposal distribution online based on the information obtained during the attack.
- Score: 62.769451246845065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks based on randomized search schemes have obtained
state-of-the-art results in black-box robustness evaluation recently. However,
as we demonstrate in this work, their efficiency in different query budget
regimes depends on manual design and heuristic tuning of the underlying
proposal distributions. We study how this issue can be addressed by adapting
the proposal distribution online based on the information obtained during the
attack. We consider Square Attack, which is a state-of-the-art score-based
black-box attack, and demonstrate how its performance can be improved by a
learned controller that adjusts the parameters of the proposal distribution
online during the attack. We train the controller using gradient-based
end-to-end training on a CIFAR10 model with white-box access. We demonstrate
that plugging the learned controller into the attack consistently improves its
black-box robustness estimate in different query regimes by up to 20% for a
wide range of different models with black-box access. We further show that the
learned adaptation principle transfers well to other data distributions
such as CIFAR100 or ImageNet and to the targeted attack setting.
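As a concrete illustration of the mechanism, here is a minimal, hypothetical sketch (not the authors' released code): a small learned controller observes attack-state features, such as the normalized query count and the recent proposal acceptance rate, and outputs the square-size parameter of a Square-Attack-style proposal distribution at each step. The controller architecture, the feature set, and the simplified random-search loop are all illustrative assumptions.

```python
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()  # higher loss = closer to misclassification

class SizeController(nn.Module):
    """Maps attack-state features to a square-size fraction p in (0, 1)."""
    def __init__(self, n_features: int = 2, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, state):
        return torch.sigmoid(self.net(state))

def propose(delta, eps, p, c, h, w):
    """Square-Attack-style proposal: redraw one square region of the
    current perturbation with a random +/-eps sign per channel."""
    s = max(1, int(p * h))
    r = torch.randint(0, h - s + 1, ()).item()
    q = torch.randint(0, w - s + 1, ()).item()
    cand = delta.clone()
    cand[:, r:r + s, q:q + s] = eps * (
        2 * torch.randint(0, 2, (c, 1, 1)).float() - 1)
    return cand

@torch.no_grad()
def attack(model, x, y, eps, controller, n_queries=1000):
    """Random search over perturbations; the controller replaces Square
    Attack's hand-tuned square-size schedule."""
    c, h, w = x.shape[1:]
    # Vertical-stripe initialization, as in Square Attack.
    delta = (eps * torch.sign(torch.randn(c, 1, w))).expand(c, h, w).clone()
    best = loss_fn(model((x + delta).clamp(0, 1)), y)
    accepted = 0
    for t in range(n_queries):
        state = torch.tensor([t / n_queries, accepted / (t + 1)])
        p = controller(state).item()
        cand = propose(delta, eps, p, c, h, w)
        loss = loss_fn(model((x + cand).clamp(0, 1)), y)
        if loss > best:  # plain random search: keep only improving proposals
            delta, best, accepted = cand, loss, accepted + 1
    return (x + delta).clamp(0, 1)
```

In the paper the controller is trained end-to-end with gradients through a white-box CIFAR10 model; the sketch only shows how a trained controller would plug into the black-box query loop.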
Related papers
- AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples [26.37278338032268]
Adversarial examples are typically optimized with gradient-based attacks.
Each new attack is shown to outperform its predecessors using different experimental setups.
This yields overly optimistic and even biased evaluations.
arXiv Detail & Related papers (2024-04-30T11:19:05Z)
- Hard-label based Small Query Black-box Adversarial Attack [2.041108289731398]
We propose a new practical setting of hard-label attack with an optimisation process guided by a pretrained surrogate model.
We find the proposed method achieves an approximately 5 times higher attack success rate than the benchmarks.
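One plausible reading of "guided by a pretrained surrogate model", sketched below with assumed step sizes and interfaces (this is not the paper's algorithm): take signed-gradient steps on the white-box surrogate and spend a single hard-label query on the black-box target per step.

```python
import torch

def surrogate_guided_attack(target_top1, surrogate, x, y, eps=8/255,
                            step=1/255, n_queries=500):
    """target_top1: callable returning the black-box model's top-1 label (int).
    surrogate: white-box model supplying gradient directions. Hypothetical."""
    loss_fn = torch.nn.CrossEntropyLoss()
    x_adv = x.clone().requires_grad_(True)
    for _ in range(n_queries):
        loss = loss_fn(surrogate(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            cand = (x_adv + step * grad.sign()).clamp(x - eps, x + eps)
            cand = cand.clamp(0, 1)
        if target_top1(cand) != y.item():  # one hard-label query per step
            return cand.detach()           # success: label flipped
        x_adv = cand.requires_grad_(True)
    return x_adv.detach()
```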
arXiv Detail & Related papers (2024-03-09T21:26:22Z)
- Breaking the Black-Box: Confidence-Guided Model Inversion Attack for Distribution Shift [0.46040036610482665]
Model inversion attacks (MIAs) seek to infer the private training data of a target classifier by generating synthetic images that reflect the characteristics of the target class.
Previous studies have relied on full access to the target model, which is not practical in real-world scenarios.
This paper proposes a Confidence-Guided Model Inversion attack method called CG-MI.
arXiv Detail & Related papers (2024-02-28T03:47:17Z)
- Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks [23.010308600769545]
Deep neural networks are vulnerable to adversarial examples: samples close to the original image that nonetheless make the model misclassify.
We propose a simple and lightweight defense against black-box attacks by adding random noise to hidden features at intermediate layers of the model at inference time.
Our method effectively enhances the model's resilience against both score-based and decision-based black-box attacks.
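This kind of defense is simple enough to sketch; the layer choice, noise scale, and torchvision backbone below are assumptions, not the paper's configuration.

```python
import torch
import torchvision.models as models

def add_feature_noise(sigma=0.05):
    """Forward hook that perturbs an intermediate feature map with
    Gaussian noise at inference time."""
    def hook(module, inputs, output):
        return output + sigma * torch.randn_like(output)
    return hook

model = models.resnet18(weights=None).eval()
# Attach the randomized-feature hook to one intermediate block (assumed).
handle = model.layer3.register_forward_hook(add_feature_noise(sigma=0.05))

x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    logits = model(x)  # every query now sees a slightly different network
handle.remove()
```

Because each query is answered by a slightly different randomized network, score-based and decision-based attackers get noisy feedback, which is the effect the paper exploits.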
arXiv Detail & Related papers (2023-10-01T03:53:23Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses the shortcomings of prior attacks by generating transferable perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Meta Gradient Adversarial Attack [64.5070788261061]
This paper proposes a novel architecture called Meta Gradient Adversarial Attack (MGAA), which is plug-and-play and can be integrated with any existing gradient-based attack method.
Specifically, we randomly sample multiple models from a model zoo to compose different tasks and iteratively simulate a white-box attack and a black-box attack in each task.
By narrowing the gap between the gradient directions in white-box and black-box attacks, the transferability of adversarial examples on the black-box setting can be improved.
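Read literally, the task loop described above can be sketched as follows. This is a simplified, assumption-laden rendering (model-zoo handling, step sizes, and the exact update rule are guesses), not MGAA's actual algorithm.

```python
import random
import torch

loss_fn = torch.nn.CrossEntropyLoss()

def meta_attack(model_zoo, x, y, eps=8/255, alpha=2/255, beta=2/255,
                n_tasks=10, inner_steps=5):
    """Each task: refine the perturbation on sampled white-box models,
    then correct it toward a held-out (simulated black-box) model."""
    delta = torch.zeros_like(x)
    for _ in range(n_tasks):
        white_boxes = random.sample(model_zoo, k=3)
        black_box = random.choice(
            [m for m in model_zoo if m not in white_boxes])
        d = delta.clone()
        for _ in range(inner_steps):          # simulated white-box attack
            d.requires_grad_(True)
            loss = sum(loss_fn(m(x + d), y) for m in white_boxes)
            g, = torch.autograd.grad(loss, d)
            d = (d + alpha * g.sign()).clamp(-eps, eps).detach()
        d.requires_grad_(True)                # simulated black-box step
        g, = torch.autograd.grad(loss_fn(black_box(x + d), y), d)
        delta = (d + beta * g.sign()).clamp(-eps, eps).detach()
    return (x + delta).clamp(0, 1)
```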
arXiv Detail & Related papers (2021-08-09T17:44:19Z)
- Local Black-box Adversarial Attacks: A Query Efficient Approach [64.98246858117476]
Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios.
We propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks.
We conduct extensive experiments to show that our framework can significantly improve query efficiency in the black-box setting while maintaining a high attack success rate.
arXiv Detail & Related papers (2021-01-04T15:32:16Z)
- Simple and Efficient Hard Label Black-box Adversarial Attacks in Low Query Budget Regimes [80.9350052404617]
We propose a simple and efficient Bayesian Optimization (BO) based approach for developing black-box adversarial attacks.
Issues with BO's performance in high dimensions are avoided by searching for adversarial examples in a structured low-dimensional subspace.
Our proposed approach consistently achieves 2x to 10x higher attack success rate while requiring 10x to 20x fewer queries.
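The dimensionality trick lends itself to a short sketch. Everything below is assumed detail, including the use of scikit-optimize's gp_minimize as a generic off-the-shelf BO loop and the latent-grid upsampling; the paper's own subspace construction may differ.

```python
import torch
import torch.nn.functional as F
from skopt import gp_minimize  # generic Bayesian optimization loop

D = 4  # latent grid is 3 x D x D, far smaller than a 3 x 32 x 32 image

def make_objective(model, x, y, eps=8/255):
    """Black-box objective for BO: margin of the true class over the
    runner-up; a negative value means the perturbation fools the model."""
    def objective(z_flat):
        z = torch.tensor(z_flat, dtype=torch.float32).view(1, 3, D, D)
        # Upsample the low-dimensional latent to a full-resolution
        # perturbation, keeping it inside the eps ball via tanh.
        delta = eps * torch.tanh(F.interpolate(z, size=32, mode="bilinear"))
        with torch.no_grad():
            logits = model((x + delta).clamp(0, 1))[0]
        true = logits[y]
        other = logits[torch.arange(logits.shape[0]) != y].max()
        return (true - other).item()
    return objective

# Usage (model, x, y assumed defined; stop once the margin goes negative):
# result = gp_minimize(make_objective(model, x, y),
#                      dimensions=[(-3.0, 3.0)] * (3 * D * D), n_calls=200)
```

The GP surrogate only ever sees 3*D*D coordinates, which sidesteps BO's poor scaling in high dimensions while each query still probes the full-resolution model.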
arXiv Detail & Related papers (2020-07-13T04:34:57Z)
- Diversity can be Transferred: Output Diversification for White- and Black-box Attacks [89.92353493977173]
Adversarial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or generate update directions in black-box attacks.
We propose Output Diversified Sampling (ODS), a novel sampling strategy that attempts to maximize diversity in the target model's outputs among the generated samples.
ODS significantly improves the performance of existing white-box and black-box attacks.
In particular, ODS reduces the number of queries needed for state-of-the-art black-box attacks on ImageNet by a factor of two.
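The sampling strategy as described is compact enough to sketch directly; the function name and interface below are ours, and in the black-box case the gradient would come from a surrogate model rather than the target.

```python
import torch

def ods_direction(model, x):
    """Return a unit input-space direction that maximally changes a
    randomly weighted combination of the model's logits (ODS-style)."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    # Random weight vector over the output classes, drawn from U(-1, 1).
    w = 2 * torch.rand(logits.shape[-1], device=x.device) - 1
    (logits * w).sum().backward()
    g = x.grad
    return g / g.flatten().norm()  # normalized perturbation direction
```

Stepping along such directions diversifies the model's outputs rather than its inputs, which is why it helps both white-box restarts and black-box update proposals.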
arXiv Detail & Related papers (2020-03-15T17:49:25Z)