Blackbox Attacks via Surrogate Ensemble Search
- URL: http://arxiv.org/abs/2208.03610v1
- Date: Sun, 7 Aug 2022 01:24:11 GMT
- Title: Blackbox Attacks via Surrogate Ensemble Search
- Authors: Zikui Cai, Chengyu Song, Srikanth Krishnamurthy, Amit Roy-Chowdhury,
M. Salman Asif
- Abstract summary: We propose a novel method for blackbox attacks via surrogate ensemble search (BASES).
We show that our proposed method achieves a better success rate with at least 30x fewer queries compared to state-of-the-art methods.
Our method is also effective on the Google Cloud Vision API, achieving a 91% non-targeted attack success rate with 2.9 queries per image.
- Score: 18.413568112132197
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Blackbox adversarial attacks can be categorized into transfer- and
query-based attacks. Transfer methods do not require any feedback from the
victim model, but provide lower success rates compared to query-based methods.
Query attacks often require a large number of queries for success. To achieve
the best of both approaches, recent efforts have tried to combine them, but
still require hundreds of queries to achieve high success rates (especially for
targeted attacks). In this paper, we propose a novel method for blackbox
attacks via surrogate ensemble search (BASES) that can generate highly
successful blackbox attacks using an extremely small number of queries. We
first define a perturbation machine that generates a perturbed image by
minimizing a weighted loss function over a fixed set of surrogate models. To
generate an attack for a given victim model, we search over the weights in the
loss function using queries generated by the perturbation machine. Since the
dimension of the search space is small (same as the number of surrogate
models), the search requires a small number of queries. We demonstrate that our
proposed method achieves a better success rate with at least 30x fewer queries
compared to state-of-the-art methods on different image classifiers trained
with ImageNet (including VGG-19, DenseNet-121, and ResNext-50). In particular,
our method requires as few as 3 queries per image (on average) to achieve more
than a 90% success rate for targeted attacks and 1-2 queries per image for over
a 99% success rate for non-targeted attacks. Our method is also effective on
the Google Cloud Vision API, achieving a 91% non-targeted attack success rate
with 2.9 queries per image. We also show that the perturbations generated by
our proposed method are highly transferable and can be adopted for hard-label
blackbox attacks.
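The two-stage procedure described in the abstract (a perturbation machine over a fixed surrogate ensemble, plus a low-dimensional search over the loss weights) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the linear "classifiers", dimensions, step sizes, and the coordinate-wise weight-search schedule are all hypothetical stand-ins for the deep ImageNet models and search strategy used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_classes, n_surrogates = 128, 10, 4

# Hypothetical stand-ins for the surrogate and victim classifiers:
# linear scorers x -> W @ x. In the paper these are deep ImageNet models.
Ws = [rng.normal(size=(n_classes, dim)) for _ in range(n_surrogates)]
W_victim = sum(Ws) / n_surrogates + 0.1 * rng.normal(size=(n_classes, dim))

def perturbation_machine(x, target, weights, steps=40, lr=0.05, eps=1.0):
    """Minimize the weighted surrogate loss sum_i w_i * (-logit_i[target])
    with signed gradient steps and an L_inf projection (PGD-style sketch).
    For linear surrogates the gradient is simply -sum_i w_i * W_i[target]."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        grad = sum(-w * W[target] for w, W in zip(weights, Ws))
        delta -= lr * np.sign(grad)        # step toward the target class
        delta = np.clip(delta, -eps, eps)  # L_inf projection
    return x + delta

def victim_hits_target(x_adv, target):
    """One black-box query: does the victim output the target label?"""
    return int(np.argmax(W_victim @ x_adv)) == target

def bases_search(x, target, max_queries=20, step=0.5):
    """Search over the surrogate weights; each candidate weight vector
    costs exactly one victim query, and the search space has dimension
    n_surrogates rather than dim."""
    w = np.ones(n_surrogates) / n_surrogates
    for q in range(1, max_queries + 1):
        x_adv = perturbation_machine(x, target, w)
        if victim_hits_target(x_adv, target):
            return x_adv, q
        w[(q - 1) % n_surrogates] += step  # nudge one weight, then
        w /= w.sum()                       # renormalize to the simplex
    return None, max_queries
```

Because the victim here is close to the surrogate average, the search typically succeeds in very few queries, which mirrors the intuition behind the paper's low query counts.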
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z) - Hard-label based Small Query Black-box Adversarial Attack [2.041108289731398]
We propose a new practical setting of hard-label attack with an optimisation process guided by a pretrained surrogate model.
We find the proposed method achieves an approximately 5x higher attack success rate compared to the benchmarks.
arXiv Detail & Related papers (2024-03-09T21:26:22Z) - Generalizable Black-Box Adversarial Attack with Meta Learning [54.196613395045595]
In a black-box adversarial attack, the target model's parameters are unknown, and the attacker aims to find a successful perturbation based on query feedback under a query budget.
We propose to utilize the feedback information across historical attacks, dubbed example-level adversarial transferability.
The proposed framework with the two types of adversarial transferability can be naturally combined with any off-the-shelf query-based attack methods to boost their performance.
arXiv Detail & Related papers (2023-01-01T07:24:12Z) - Query Efficient Cross-Dataset Transferable Black-Box Attack on Action
Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z) - Distributed Black-box Attack: Do Not Overestimate Black-box Attacks [4.764637544913963]
Black-box adversarial attacks can fool image classifiers into misclassifying images without requiring access to model structure and weights.
Recent studies have reported attack success rates of over 95% with less than 1,000 queries.
This paper applies black-box attacks directly to cloud APIs rather than to local models.
arXiv Detail & Related papers (2022-10-28T19:14:03Z) - A Strong Baseline for Query Efficient Attacks in a Black Box Setting [3.52359746858894]
We propose a query efficient attack strategy to generate plausible adversarial examples on text classification and entailment tasks.
Our attack jointly leverages attention mechanism and locality sensitive hashing (LSH) to reduce the query count.
arXiv Detail & Related papers (2021-09-10T10:46:32Z) - QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval [56.51916317628536]
We study the query-based attack against image retrieval to evaluate its robustness against adversarial examples under the black-box setting.
A new relevance-based loss is designed to quantify the attack effects by measuring the set similarity on the top-k retrieval results before and after attacks.
Experiments show that the proposed attack achieves a high attack success rate with few queries against the image retrieval systems under the black-box setting.
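The relevance-based loss above scores an attack by how much it disturbs the top-k retrieval set. A minimal sketch of that set-similarity idea (the function name and exact formulation are illustrative; the paper's loss may differ):

```python
def topk_overlap(scores_before, scores_after, k):
    """Fraction of the original top-k retrieved items that remain in the
    top-k after the attack (1.0 = retrieval unchanged, 0.0 = fully
    disrupted). An attacker would drive this quantity toward zero."""
    rank = lambda s: set(sorted(range(len(s)), key=lambda i: -s[i])[:k])
    return len(rank(scores_before) & rank(scores_after)) / k
```

For example, if an attack demotes one of the two top-ranked items out of the top-2, the overlap drops from 1.0 to 0.5.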
arXiv Detail & Related papers (2021-03-04T10:18:43Z) - Simple and Efficient Hard Label Black-box Adversarial Attacks in Low
Query Budget Regimes [80.9350052404617]
We propose a simple and efficient Bayesian Optimization (BO) based approach for developing black-box adversarial attacks.
Issues with BO's performance in high dimensions are avoided by searching for adversarial examples in a structured low-dimensional subspace.
Our proposed approach consistently achieves a 2x to 10x higher attack success rate while requiring 10x to 20x fewer queries.
arXiv Detail & Related papers (2020-07-13T04:34:57Z) - RayS: A Ray Searching Method for Hard-label Adversarial Attack [99.72117609513589]
We present the Ray Searching attack (RayS), which greatly improves both the effectiveness and the efficiency of hard-label attacks.
RayS attack can also be used as a sanity check for possible "falsely robust" models.
arXiv Detail & Related papers (2020-06-23T07:01:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.