Attacking deep networks with surrogate-based adversarial black-box
methods is easy
- URL: http://arxiv.org/abs/2203.08725v1
- Date: Wed, 16 Mar 2022 16:17:18 GMT
- Title: Attacking deep networks with surrogate-based adversarial black-box
methods is easy
- Authors: Nicholas A. Lord, Romain Mueller, Luca Bertinetto
- Abstract summary: A recent line of work on black-box adversarial attacks has revived the use of transfer from surrogate models.
Here, we provide a short and simple algorithm which achieves state-of-the-art results through a search guided by the surrogate network's class-score gradients.
The guiding assumption of the algorithm is that the studied networks are in a fundamental sense learning similar functions.
- Score: 7.804269142923776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A recent line of work on black-box adversarial attacks has revived the use of
transfer from surrogate models by integrating it into query-based search.
However, we find that existing approaches of this type underperform their
potential, and can be overly complicated besides. Here, we provide a short and
simple algorithm which achieves state-of-the-art results through a search which
uses the surrogate network's class-score gradients, with no need for other
priors or heuristics. The guiding assumption of the algorithm is that the
studied networks are in a fundamental sense learning similar functions, and
that a transfer attack from one to the other should thus be fairly "easy". This
assumption is validated by the extremely low query counts and failure rates
achieved: e.g. an untargeted attack on a VGG-16 ImageNet network using a
ResNet-152 as the surrogate yields a median query count of 6 at a success rate
of 99.9%. Code is available at https://github.com/fiveai/GFCS.
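The core loop described above — spending one target query per step while walking the input along the surrogate's class-score gradient — can be sketched with toy linear models. The weights, step size, and helper names below are illustrative assumptions only; the authors' actual GFCS algorithm operates on real ImageNet classifiers (see the linked repository).

```python
import numpy as np

# Black-box target: the attacker may only query its class scores.
W_target = np.array([[2.0, 0.0, 0.0, 0.0],
                     [0.0, 2.0, 0.0, 0.0],
                     [0.0, 0.0, 2.0, 0.0]])
# White-box surrogate: similar, but not identical, weights -- the
# "networks learn similar functions" assumption in miniature.
W_surrogate = np.array([[1.9, 0.1, 0.0, 0.1],
                        [0.1, 2.1, 0.0, 0.0],
                        [0.0, 0.1, 1.8, 0.0]])

def target_scores(x):
    # The attacker's only access to the target: one score query per call.
    return W_target @ x

def surrogate_grad(x, cls):
    # Gradient of the surrogate's class-`cls` score w.r.t. the input
    # (closed form here because the toy model is linear).
    return W_surrogate[cls]

def untargeted_attack(x, true_cls, step=0.3, max_queries=100):
    """Step against the surrogate's gradient for the true class until the
    black-box target no longer predicts `true_cls`; count target queries."""
    for queries in range(1, max_queries + 1):
        if int(np.argmax(target_scores(x))) != true_cls:
            return x, queries          # success: target fooled
        x = x - step * surrogate_grad(x, true_cls)
    return None, max_queries           # failure within the query budget

x0 = np.array([1.0, 0.2, 0.1, 0.0])
true_cls = int(np.argmax(target_scores(x0)))   # class 0
adv, n_queries = untargeted_attack(x0, true_cls)
```

Because the surrogate's gradient points in nearly the same direction as the target's, the toy attack succeeds in a handful of queries, mirroring the very low median query counts reported in the abstract.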
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z)
- Using Graph Theory for Improving Machine Learning-based Detection of Cyber Attacks [4.465883551216819]
Early detection of network intrusions and cyber threats is one of the main pillars of cybersecurity.
One of the most effective approaches for this purpose is to analyze network traffic with the help of artificial intelligence algorithms.
arXiv Detail & Related papers (2024-02-12T18:44:02Z)
- Microbial Genetic Algorithm-based Black-box Attack against Interpretable Deep Learning Systems [16.13790238416691]
In white-box environments, interpretable deep learning systems (IDLSes) have been shown to be vulnerable to malicious manipulations.
We propose a Query-efficient Score-based black-box attack against IDLSes, QuScore, which requires no knowledge of the target model and its coupled interpretation model.
arXiv Detail & Related papers (2023-07-13T00:08:52Z)
- Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems.
We propose a new attack on action recognition that addresses these shortcomings by generating transferable perturbations.
Our method achieves 8% and 12% higher deception rates than state-of-the-art query-based and transfer-based attacks, respectively.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
- A Large-scale Multiple-objective Method for Black-box Attack against Object Detection [70.00150794625053]
We propose to minimize the true positive rate and maximize the false positive rate, which can encourage more false positive objects to block the generation of new true positive bounding boxes.
We extend the standard Genetic Algorithm with Random Subset selection and Divide-and-Conquer, called GARSDC, which significantly improves the efficiency.
Compared with state-of-the-art attack methods, GARSDC decreases mAP by an average of 12.0 and reduces the number of queries by about 1000 in extensive experiments.
arXiv Detail & Related papers (2022-09-16T08:36:42Z)
- Blackbox Attacks via Surrogate Ensemble Search [18.413568112132197]
We propose a novel method for black-box attacks via surrogate ensemble search (BASES).
We show that our proposed method achieves a better success rate with at least 30x fewer queries than state-of-the-art methods.
Our method is also effective against the Google Cloud Vision API, achieving a 91% non-targeted attack success rate with 2.9 queries per image.
arXiv Detail & Related papers (2022-08-07T01:24:11Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts combine sparsity with an additional l_infty bound on the perturbation magnitudes.
We propose a homotopy algorithm to tackle the resulting sparse adversarial perturbation framework.
arXiv Detail & Related papers (2021-06-10T20:11:36Z) - Switching Transferable Gradient Directions for Query-Efficient Black-Box
Adversarial Attacks [38.91061792696202]
We propose a simple and highly query-efficient black-box adversarial attack named SWITCH.
SWITCH features a highly efficient and effective utilization of the gradient of a surrogate model.
Experimental results conducted on CIFAR-10, CIFAR-100 and TinyImageNet show that SWITCH achieves a satisfactory attack success rate.
arXiv Detail & Related papers (2020-09-15T15:55:08Z) - ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z) - Targeted Attack for Deep Hashing based Retrieval [57.582221494035856]
We propose a novel method, dubbed deep hashing targeted attack (DHTA), to study the targeted attack on such retrieval.
We first formulate the targeted attack as a point-to-set optimization, which minimizes the average distance between the hash code of an adversarial example and those of a set of objects with the target label.
To balance the performance and perceptibility, we propose to minimize the Hamming distance between the hash code of the adversarial example and the anchor code under an l_infty restriction on the perturbation.
arXiv Detail & Related papers (2020-04-15T08:36:58Z)
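The point-to-set objective and anchor code from the DHTA summary above can be illustrated on raw binary hash codes. The helper names below are hypothetical, and the real attack optimizes the image itself under an l_infty pixel constraint, which this sketch omits:

```python
import numpy as np

def hamming(a, b):
    # Number of differing bits between two binary hash codes.
    return int(np.sum(a != b))

def point_to_set_distance(code, target_codes):
    # Average Hamming distance from one code to a set of codes:
    # the point-to-set quantity the attack minimizes.
    return sum(hamming(code, c) for c in target_codes) / len(target_codes)

def anchor_code(target_codes):
    # Component-wise majority vote (ties resolved to 1): the single code
    # minimizing the average Hamming distance to the set, used as the
    # one fixed target of the optimization.
    votes = np.sum(target_codes, axis=0)
    return (votes * 2 >= len(target_codes)).astype(int)

# Hash codes of three objects sharing the target label.
codes = np.array([[1, 0, 1, 1],
                  [1, 0, 0, 1],
                  [1, 1, 1, 1]])
anc = anchor_code(codes)   # majority vote per bit -> [1, 0, 1, 1]
```

Reducing the point-to-set objective to a distance against a single anchor code is what lets the method trade off performance against perceptibility in one scalar objective.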
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.