Meta-Learning the Search Distribution of Black-Box Random Search Based
Adversarial Attacks
- URL: http://arxiv.org/abs/2111.01714v1
- Date: Tue, 2 Nov 2021 16:28:08 GMT
- Title: Meta-Learning the Search Distribution of Black-Box Random Search Based
Adversarial Attacks
- Authors: Maksym Yatsura, Jan Hendrik Metzen, Matthias Hein
- Abstract summary: Adversarial attacks based on randomized search schemes have obtained state-of-the-art results in black-box robustness evaluation.
We study how this issue can be addressed by adapting the proposal distribution online based on the information obtained during the attack.
- Score: 62.769451246845065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks based on randomized search schemes have obtained
state-of-the-art results in black-box robustness evaluation recently. However,
as we demonstrate in this work, their efficiency in different query budget
regimes depends on manual design and heuristic tuning of the underlying
proposal distributions. We study how this issue can be addressed by adapting
the proposal distribution online based on the information obtained during the
attack. We consider Square Attack, which is a state-of-the-art score-based
black-box attack, and demonstrate how its performance can be improved by a
learned controller that adjusts the parameters of the proposal distribution
online during the attack. We train the controller using gradient-based
end-to-end training on a CIFAR10 model with white-box access. We demonstrate
that plugging the learned controller into the attack consistently improves its
black-box robustness estimate in different query regimes by up to 20% for a
wide range of different models with black-box access. We further show that the
learned adaptation principle transfers well to other data distributions
such as CIFAR100 or ImageNet and to the targeted attack setting.
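As a concrete illustration of the mechanism, here is a minimal, hypothetical sketch (not the authors' released code): a small learned controller observes attack-state features, such as the normalized query count and the recent proposal acceptance rate, and outputs the square-size parameter of a Square-Attack-style proposal distribution at each step. The controller architecture, the feature set, and the simplified random-search loop are all illustrative assumptions.

```python
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()  # higher loss = closer to misclassification

class SizeController(nn.Module):
    """Maps attack-state features to a square-size fraction p in (0, 1)."""
    def __init__(self, n_features: int = 2, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, state):
        return torch.sigmoid(self.net(state))

def propose(delta, eps, p, c, h, w):
    """Square-Attack-style proposal: redraw one square region of the
    current perturbation with a random +/-eps sign per channel."""
    s = max(1, int(p * h))
    r = torch.randint(0, h - s + 1, ()).item()
    q = torch.randint(0, w - s + 1, ()).item()
    cand = delta.clone()
    cand[:, r:r + s, q:q + s] = eps * (
        2 * torch.randint(0, 2, (c, 1, 1)).float() - 1)
    return cand

@torch.no_grad()
def attack(model, x, y, eps, controller, n_queries=1000):
    """Random search over perturbations; the controller replaces Square
    Attack's hand-tuned square-size schedule."""
    c, h, w = x.shape[1:]
    # Vertical-stripe initialization, as in Square Attack.
    delta = (eps * torch.sign(torch.randn(c, 1, w))).expand(c, h, w).clone()
    best = loss_fn(model((x + delta).clamp(0, 1)), y)
    accepted = 0
    for t in range(n_queries):
        state = torch.tensor([t / n_queries, accepted / (t + 1)])
        p = controller(state).item()
        cand = propose(delta, eps, p, c, h, w)
        loss = loss_fn(model((x + cand).clamp(0, 1)), y)
        if loss > best:  # plain random search: keep only improving proposals
            delta, best, accepted = cand, loss, accepted + 1
    return (x + delta).clamp(0, 1)
```

In the paper the controller is trained end-to-end with gradients through a white-box CIFAR10 model; the sketch only shows how a trained controller would plug into the black-box query loop.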
Related papers
- AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples [26.37278338032268]
Adversarial examples are typically optimized with gradient-based attacks.
Each new attack is shown to outperform its predecessors using different experimental setups.
This yields overly optimistic and even biased evaluations.
arXiv Detail & Related papers (2024-04-30T11:19:05Z)
- Hard-label based Small Query Black-box Adversarial Attack [2.041108289731398]
We propose a new practical setting of hard-label attack with an optimisation process guided by a pretrained surrogate model.
We find the proposed method achieves an approximately 5 times higher attack success rate than the benchmarks.
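One plausible reading of "guided by a pretrained surrogate model", sketched below with assumed step sizes and interfaces (this is not the paper's algorithm): take signed-gradient steps on the white-box surrogate and spend a single hard-label query on the black-box target per step.

```python
import torch

def surrogate_guided_attack(target_top1, surrogate, x, y, eps=8/255,
                            step=1/255, n_queries=500):
    """target_top1: callable returning the black-box model's top-1 label (int).
    surrogate: white-box model supplying gradient directions. Hypothetical."""
    loss_fn = torch.nn.CrossEntropyLoss()
    x_adv = x.clone().requires_grad_(True)
    for _ in range(n_queries):
        loss = loss_fn(surrogate(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            cand = (x_adv + step * grad.sign()).clamp(x - eps, x + eps)
            cand = cand.clamp(0, 1)
        if target_top1(cand) != y.item():  # one hard-label query per step
            return cand.detach()           # success: label flipped
        x_adv = cand.requires_grad_(True)
    return x_adv.detach()
```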
arXiv Detail & Related papers (2024-03-09T21:26:22Z)
- Breaking the Black-Box: Confidence-Guided Model Inversion Attack for Distribution Shift [0.46040036610482665]
Model inversion attacks (MIAs) seek to infer the private training data of a target classifier by generating synthetic images that reflect the characteristics of the target class.
Previous studies have relied on full access to the target model, which is not practical in real-world scenarios.
This paper proposes a Confidence-Guided Model Inversion attack method called CG-MI.
arXiv Detail & Related papers (2024-02-28T03:47:17Z)
- Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks [23.010308600769545]
Deep neural networks are vulnerable to adversarial examples: samples close to the original image that nonetheless make the model misclassify.
We propose a simple and lightweight defense against black-box attacks by adding random noise to hidden features at intermediate layers of the model at inference time.
Our method effectively enhances the model's resilience against both score-based and decision-based black-box attacks.
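This kind of defense is simple enough to sketch; the layer choice, noise scale, and torchvision backbone below are assumptions, not the paper's configuration.

```python
import torch
import torchvision.models as models

def add_feature_noise(sigma=0.05):
    """Forward hook that perturbs an intermediate feature map with
    Gaussian noise at inference time."""
    def hook(module, inputs, output):
        return output + sigma * torch.randn_like(output)
    return hook

model = models.resnet18(weights=None).eval()
# Attach the randomized-feature hook to one intermediate block (assumed).
handle = model.layer3.register_forward_hook(add_feature_noise(sigma=0.05))

x = torch.rand(1, 3, 224, 224)
with torch.no_grad():
    logits = model(x)  # every query now sees a slightly different network
handle.remove()
```

Because each query is answered by a slightly different randomized network, score-based and decision-based attackers get noisy feedback, which is the effect the paper exploits.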
arXiv Detail & Related papers (2023-10-01T03:53:23Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses the shortcomings of prior attacks by generating transferable perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- Meta Gradient Adversarial Attack [64.5070788261061]
This paper proposes a novel architecture called Meta Gradient Adversarial Attack (MGAA), which is plug-and-play and can be integrated with any existing gradient-based attack method.
Specifically, we randomly sample multiple models from a model zoo to compose different tasks and iteratively simulate a white-box attack and a black-box attack in each task.
By narrowing the gap between the gradient directions in white-box and black-box attacks, the transferability of adversarial examples on the black-box setting can be improved.
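Read literally, the task loop described above can be sketched as follows. This is a simplified, assumption-laden rendering (model-zoo handling, step sizes, and the exact update rule are guesses), not MGAA's actual algorithm.

```python
import random
import torch

loss_fn = torch.nn.CrossEntropyLoss()

def meta_attack(model_zoo, x, y, eps=8/255, alpha=2/255, beta=2/255,
                n_tasks=10, inner_steps=5):
    """Each task: refine the perturbation on sampled white-box models,
    then correct it toward a held-out (simulated black-box) model."""
    delta = torch.zeros_like(x)
    for _ in range(n_tasks):
        white_boxes = random.sample(model_zoo, k=3)
        black_box = random.choice(
            [m for m in model_zoo if m not in white_boxes])
        d = delta.clone()
        for _ in range(inner_steps):          # simulated white-box attack
            d.requires_grad_(True)
            loss = sum(loss_fn(m(x + d), y) for m in white_boxes)
            g, = torch.autograd.grad(loss, d)
            d = (d + alpha * g.sign()).clamp(-eps, eps).detach()
        d.requires_grad_(True)                # simulated black-box step
        g, = torch.autograd.grad(loss_fn(black_box(x + d), y), d)
        delta = (d + beta * g.sign()).clamp(-eps, eps).detach()
    return (x + delta).clamp(0, 1)
```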
arXiv Detail & Related papers (2021-08-09T17:44:19Z)
- Local Black-box Adversarial Attacks: A Query Efficient Approach [64.98246858117476]
Adversarial attacks have threatened the application of deep neural networks in security-sensitive scenarios.
We propose a novel framework to perturb the discriminative areas of clean examples only within limited queries in black-box attacks.
We conduct extensive experiments to show that our framework can significantly improve query efficiency in the black-box setting while maintaining a high attack success rate.
arXiv Detail & Related papers (2021-01-04T15:32:16Z)
- Simple and Efficient Hard Label Black-box Adversarial Attacks in Low Query Budget Regimes [80.9350052404617]
We propose a simple and efficient Bayesian Optimization (BO) based approach for developing black-box adversarial attacks.
Issues with BO's performance in high dimensions are avoided by searching for adversarial examples in a structured low-dimensional subspace.
Our proposed approach consistently achieves 2x to 10x higher attack success rate while requiring 10x to 20x fewer queries.
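The dimensionality trick lends itself to a short sketch. Everything below is assumed detail, including the use of scikit-optimize's gp_minimize as a generic off-the-shelf BO loop and the latent-grid upsampling; the paper's own subspace construction may differ.

```python
import torch
import torch.nn.functional as F
from skopt import gp_minimize  # generic Bayesian optimization loop

D = 4  # latent grid is 3 x D x D, far smaller than a 3 x 32 x 32 image

def make_objective(model, x, y, eps=8/255):
    """Black-box objective for BO: margin of the true class over the
    runner-up; a negative value means the perturbation fools the model."""
    def objective(z_flat):
        z = torch.tensor(z_flat, dtype=torch.float32).view(1, 3, D, D)
        # Upsample the low-dimensional latent to a full-resolution
        # perturbation, keeping it inside the eps ball via tanh.
        delta = eps * torch.tanh(F.interpolate(z, size=32, mode="bilinear"))
        with torch.no_grad():
            logits = model((x + delta).clamp(0, 1))[0]
        true = logits[y]
        other = logits[torch.arange(logits.shape[0]) != y].max()
        return (true - other).item()
    return objective

# Usage (model, x, y assumed defined; stop once the margin goes negative):
# result = gp_minimize(make_objective(model, x, y),
#                      dimensions=[(-3.0, 3.0)] * (3 * D * D), n_calls=200)
```

The GP surrogate only ever sees 3*D*D coordinates, which sidesteps BO's poor scaling in high dimensions while each query still probes the full-resolution model.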
arXiv Detail & Related papers (2020-07-13T04:34:57Z)
- Diversity can be Transferred: Output Diversification for White- and Black-box Attacks [89.92353493977173]
Adversarial attacks often involve random perturbations of the inputs drawn from uniform or Gaussian distributions, e.g., to initialize optimization-based white-box attacks or generate update directions in black-box attacks.
We propose Output Diversified Sampling (ODS), a novel sampling strategy that attempts to maximize diversity in the target model's outputs among the generated samples.
ODS significantly improves the performance of existing white-box and black-box attacks.
In particular, ODS reduces the number of queries needed for state-of-the-art black-box attacks on ImageNet by a factor of two.
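The sampling strategy as described is compact enough to sketch directly; the function name and interface below are ours, and in the black-box case the gradient would come from a surrogate model rather than the target.

```python
import torch

def ods_direction(model, x):
    """Return a unit input-space direction that maximally changes a
    randomly weighted combination of the model's logits (ODS-style)."""
    x = x.clone().requires_grad_(True)
    logits = model(x)
    # Random weight vector over the output classes, drawn from U(-1, 1).
    w = 2 * torch.rand(logits.shape[-1], device=x.device) - 1
    (logits * w).sum().backward()
    g = x.grad
    return g / g.flatten().norm()  # normalized perturbation direction
```

Stepping along such directions diversifies the model's outputs rather than its inputs, which is why it helps both white-box restarts and black-box update proposals.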
arXiv Detail & Related papers (2020-03-15T17:49:25Z)