Blackbox Attacks via Surrogate Ensemble Search
- URL: http://arxiv.org/abs/2208.03610v1
- Date: Sun, 7 Aug 2022 01:24:11 GMT
- Title: Blackbox Attacks via Surrogate Ensemble Search
- Authors: Zikui Cai, Chengyu Song, Srikanth Krishnamurthy, Amit Roy-Chowdhury,
M. Salman Asif
- Abstract summary: We propose a novel method for blackbox attacks via surrogate ensemble search (BASES).
We show that our proposed method achieves a better success rate with at least 30x fewer queries compared to state-of-the-art methods.
Our method is also effective on the Google Cloud Vision API, achieving a 91% non-targeted attack success rate with 2.9 queries per image.
- Score: 18.413568112132197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Blackbox adversarial attacks can be categorized into transfer- and
query-based attacks. Transfer methods do not require any feedback from the
victim model, but provide lower success rates compared to query-based methods.
Query attacks often require a large number of queries for success. To achieve
the best of both approaches, recent efforts have tried to combine them, but
still require hundreds of queries to achieve high success rates (especially for
targeted attacks). In this paper, we propose a novel method for blackbox
attacks via surrogate ensemble search (BASES) that can generate highly
successful blackbox attacks using an extremely small number of queries. We
first define a perturbation machine that generates a perturbed image by
minimizing a weighted loss function over a fixed set of surrogate models. To
generate an attack for a given victim model, we search over the weights in the
loss function using queries generated by the perturbation machine. Since the
dimension of the search space is small (same as the number of surrogate
models), the search requires a small number of queries. We demonstrate that our
proposed method achieves a better success rate with at least 30x fewer queries
compared to state-of-the-art methods on different image classifiers trained
with ImageNet (including VGG-19, DenseNet-121, and ResNext-50). In particular,
our method requires as few as 3 queries per image (on average) to achieve more
than a 90% success rate for targeted attacks and 1-2 queries per image for over
a 99% success rate for non-targeted attacks. Our method is also effective on
the Google Cloud Vision API, achieving a 91% non-targeted attack success rate
with 2.9 queries per image. We also show that the perturbations generated by
our proposed method are highly transferable and can be adopted for hard-label
blackbox attacks.
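The two-stage procedure described in the abstract (a perturbation machine over a fixed surrogate ensemble, plus a low-dimensional search over the loss weights) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear "classifiers", dimensions, step sizes, and the coordinate-wise weight-search schedule are all hypothetical stand-ins for the deep ImageNet models and search strategy used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_classes, n_surrogates = 128, 10, 4

# Hypothetical stand-ins for the surrogate and victim classifiers:
# linear scorers x -> W @ x. In the paper these are deep ImageNet models.
Ws = [rng.normal(size=(n_classes, dim)) for _ in range(n_surrogates)]
W_victim = sum(Ws) / n_surrogates + 0.1 * rng.normal(size=(n_classes, dim))

def perturbation_machine(x, target, weights, steps=40, lr=0.05, eps=1.0):
    """Minimize the weighted surrogate loss sum_i w_i * (-logit_i[target])
    with signed gradient steps and an L_inf projection (PGD-style sketch).
    For linear surrogates the gradient is simply -sum_i w_i * W_i[target]."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        grad = sum(-w * W[target] for w, W in zip(weights, Ws))
        delta -= lr * np.sign(grad)        # step toward the target class
        delta = np.clip(delta, -eps, eps)  # L_inf projection
    return x + delta

def victim_hits_target(x_adv, target):
    """One black-box query: does the victim output the target label?"""
    return int(np.argmax(W_victim @ x_adv)) == target

def bases_search(x, target, max_queries=20, step=0.5):
    """Search over the surrogate weights; each candidate weight vector
    costs exactly one victim query, and the search space has dimension
    n_surrogates rather than dim."""
    w = np.ones(n_surrogates) / n_surrogates
    for q in range(1, max_queries + 1):
        x_adv = perturbation_machine(x, target, w)
        if victim_hits_target(x_adv, target):
            return x_adv, q
        w[(q - 1) % n_surrogates] += step  # nudge one weight, then
        w /= w.sum()                       # renormalize to the simplex
    return None, max_queries
```

Because the victim here is close to the surrogate average, the search typically succeeds in very few queries, which mirrors the intuition behind the paper's low query counts.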
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - Hard-label based Small Query Black-box Adversarial Attack [2.041108289731398]
We propose a new practical setting of hard-label attack with an optimisation process guided by a pretrained surrogate model.
We find the proposed method achieves an approximately 5x higher attack success rate compared to the benchmarks.
arXiv Detail & Related papers (2024-03-09T21:26:22Z) - Generalizable Black-Box Adversarial Attack with Meta Learning [54.196613395045595]
In a black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful perturbation based on query feedback under a query budget.
We propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability.
The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack methods to boost their performance.
arXiv Detail & Related papers (2023-01-01T07:24:12Z) - Query Efficient Cross-Dataset Transferable Black-Box Attack on Action
Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z) - Distributed Black-box Attack: Do Not Overestimate Black-box Attacks [4.764637544913963]
Black-box adversarial attacks can fool image classifiers into misclassifying images without requiring access to model structure and weights.
Recent studies have reported attack success rates of over 95% with less than 1,000 queries.
This paper applies black-box attacks directly to cloud APIs rather than to local models.
arXiv Detail & Related papers (2022-10-28T19:14:03Z) - A Strong Baseline for Query Efficient Attacks in a Black Box Setting [3.52359746858894]
We propose a query efficient attack strategy to generate plausible adversarial examples on text classification and entailment tasks.
Our attack jointly leverages attention mechanism and locality sensitive hashing (LSH) to reduce the query count.
arXiv Detail & Related papers (2021-09-10T10:46:32Z) - QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval [56.51916317628536]
We study the query-based attack against image retrieval to evaluate its robustness against adversarial examples under the black-box setting.
A new relevance-based loss is designed to quantify the attack effects by measuring the set similarity on the top-k retrieval results before and after attacks.
Experiments show that the proposed attack achieves a high attack success rate with few queries against the image retrieval systems under the black-box setting.
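The relevance-based loss above scores an attack by how much it disturbs the top-k retrieval set. A minimal sketch of that set-similarity idea (the function name and exact formulation are illustrative; the paper's loss may differ):

```python
def topk_overlap(scores_before, scores_after, k):
    """Fraction of the original top-k retrieved items that remain in the
    top-k after the attack (1.0 = retrieval unchanged, 0.0 = fully
    disrupted). An attacker would drive this quantity toward zero."""
    rank = lambda s: set(sorted(range(len(s)), key=lambda i: -s[i])[:k])
    return len(rank(scores_before) & rank(scores_after)) / k
```

For example, if an attack demotes one of the two top-ranked items out of the top-2, the overlap drops from 1.0 to 0.5.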
arXiv Detail & Related papers (2021-03-04T10:18:43Z) - Simple and Efficient Hard Label Black-box Adversarial Attacks in Low
Query Budget Regimes [80.9350052404617]
We propose a simple and efficient Bayesian Optimization (BO) based approach for developing black-box adversarial attacks.
Issues with BO's performance in high dimensions are avoided by searching for adversarial examples in a structured low-dimensional subspace.
Our proposed approach consistently achieves a 2x to 10x higher attack success rate while requiring 10x to 20x fewer queries.
arXiv Detail & Related papers (2020-07-13T04:34:57Z) - RayS: A Ray Searching Method for Hard-label Adversarial Attack [99.72117609513589]
We present the Ray Searching attack (RayS), which greatly improves both the effectiveness and the efficiency of hard-label attacks.
RayS attack can also be used as a sanity check for possible "falsely robust" models.
arXiv Detail & Related papers (2020-06-23T07:01:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.