AutoAdversary: A Pixel Pruning Method for Sparse Adversarial Attack
- URL: http://arxiv.org/abs/2203.09756v1
- Date: Fri, 18 Mar 2022 06:06:06 GMT
- Title: AutoAdversary: A Pixel Pruning Method for Sparse Adversarial Attack
- Authors: Jinqiao Li, Xiaotao Liu, Jian Zhao, Furao Shen
- Abstract summary: A special branch of adversarial examples, namely sparse adversarial examples, can fool the target DNNs by perturbing only a few pixels.
We propose a novel end-to-end sparse adversarial attack method, namely AutoAdversary, which can find the most important pixels automatically.
Experiments demonstrate the superiority of our proposed method over several state-of-the-art methods.
- Score: 8.926478245654703
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have been proven to be vulnerable to adversarial
examples. A special branch of adversarial examples, namely sparse adversarial
examples, can fool the target DNNs by perturbing only a few pixels. However,
many existing sparse adversarial attacks use heuristic methods to select the
pixels to be perturbed, and regard the pixel selection and the adversarial
attack as two separate steps. From the perspective of neural network pruning,
we propose a novel end-to-end sparse adversarial attack method, namely
AutoAdversary, which can find the most important pixels automatically by
integrating the pixel selection into the adversarial attack. Specifically, our
method utilizes a trainable neural network to generate a binary mask for the
pixel selection. After jointly optimizing the adversarial perturbation and the
neural network, only the pixels corresponding to the value 1 in the mask are
perturbed. Experiments demonstrate the superiority of our proposed method over
several state-of-the-art methods. Furthermore, since AutoAdversary does not
require a heuristic pixel selection process, it does not slow down as severely
as other methods do when the image size increases.
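To make the joint optimization concrete, below is a minimal, hedged sketch in PyTorch of how pixel selection can be folded into the attack itself. The MaskNet architecture, the straight-through binarization, the loss weights, and the optimizer settings are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a mask-guided sparse attack (assumptions: PyTorch, a
# pretrained classifier `model`, inputs `x` in [0, 1] with labels `y`; the
# MaskNet architecture and hyperparameters are illustrative, not the paper's).
import torch
import torch.nn as nn

class MaskNet(nn.Module):
    """Hypothetical tiny network mapping an image to per-pixel mask logits."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)  # (B, 1, H, W) logits

def binarize(logits):
    """Straight-through estimator: hard {0,1} forward, sigmoid gradient backward."""
    soft = torch.sigmoid(logits)
    hard = (soft > 0.5).float()
    return hard + (soft - soft.detach())

def sparse_attack(model, x, y, steps=200, lr=0.01, sparsity_weight=1e-3):
    mask_net = MaskNet(x.shape[1])
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam(list(mask_net.parameters()) + [delta], lr=lr)
    ce = nn.CrossEntropyLoss()

    for _ in range(steps):
        logits = mask_net(x)
        mask = binarize(logits)                      # (B, 1, H, W) in {0, 1}
        x_adv = torch.clamp(x + mask * delta, 0, 1)  # perturb only selected pixels
        # Untargeted objective: push the prediction away from the true label,
        # while an L1-style term on the soft mask keeps the selection sparse.
        loss = -ce(model(x_adv), y) + sparsity_weight * torch.sigmoid(logits).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    with torch.no_grad():
        mask = binarize(mask_net(x))
        return torch.clamp(x + mask * delta, 0, 1)
```

The mask head, sparsity penalty, and binarization scheme in the paper may differ; the sketch only illustrates the key idea that the pixel-selection mask is trained jointly with the perturbation rather than chosen by a separate heuristic.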
Related papers
- Optimizing One-pixel Black-box Adversarial Attacks [0.0]
The output of deep neural networks (DNNs) can be altered by a small perturbation of the input in a black-box setting.
This work seeks to improve one-pixel (few-pixel) black-box adversarial attacks by reducing the number of calls to the network under attack.
arXiv Detail & Related papers (2022-04-30T12:42:14Z)
- Adversarial examples by perturbing high-level features in intermediate decoder layers [0.0]
Instead of perturbing pixels, we use an encoder-decoder representation of the input image and perturb intermediate layers in the decoder.
Our perturbation possesses semantic meaning, such as a longer beak or green tints.
We show that our method modifies key features such as edges and that defence techniques based on adversarial training are vulnerable to our attacks.
arXiv Detail & Related papers (2021-10-14T07:08:15Z)
- Sparse and Imperceptible Adversarial Attack via a Homotopy Algorithm [93.80082636284922]
Sparse adversarial attacks can fool deep neural networks (DNNs) by perturbing only a few pixels.
Recent efforts additionally impose an l_infty imperceptibility bound on the perturbation magnitudes.
We propose a homotopy algorithm to jointly tackle the sparsity and the perturbation bound in one unified framework.
arXiv Detail & Related papers (2021-06-10T20:11:36Z)
- Transferable Sparse Adversarial Attack [62.134905824604104]
We introduce a generator architecture to alleviate the overfitting issue and thus efficiently craft transferable sparse adversarial examples.
Our method achieves superior inference speed, 700× faster than other optimization-based methods.
arXiv Detail & Related papers (2021-05-31T06:44:58Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
We formulate the attack as a binary integer programming (BIP) problem and, by utilizing the latest technique in integer programming, equivalently reformulate it as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Patch-wise++ Perturbation for Adversarial Targeted Attacks [132.58673733817838]
We propose a patch-wise iterative method (PIM) aimed at crafting adversarial examples with high transferability.
Specifically, we introduce an amplification factor to the step size in each iteration, and one pixel's overall gradient overflowing the ε-constraint is properly assigned to its surrounding regions (a rough sketch of this patch-wise update appears after this list).
Compared with the current state-of-the-art attack methods, we significantly improve the success rate by 35.9% for defense models and 32.7% for normally trained models.
arXiv Detail & Related papers (2020-12-31T08:40:42Z)
- GreedyFool: Distortion-Aware Sparse Adversarial Attack [138.55076781355206]
Modern deep neural networks (DNNs) are vulnerable to adversarial samples.
Sparse adversarial samples can fool the target model by only perturbing a few pixels.
We propose a novel two-stage, distortion-aware greedy method dubbed "GreedyFool".
arXiv Detail & Related papers (2020-10-26T17:59:07Z)
- Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm -- a black-box attack against mainstream normally trained and defense models.
We significantly improve the success rate by 9.2% for defense models and 3.7% for normally trained models on average.
arXiv Detail & Related papers (2020-07-14T01:50:22Z)
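For the patch-wise entries above, the following hedged sketch illustrates the mechanism they describe: an amplified step size that overshoots the per-pixel epsilon budget, with the overflowing gradient redistributed to surrounding pixels through a uniform project kernel. The kernel size, amplification factor beta, and projection strength gamma are illustrative assumptions rather than those papers' exact settings.

```python
# Hedged sketch of a patch-wise iterative update: the amplified step overshoots
# the per-pixel epsilon budget, and the overflowing part is redistributed to
# neighbouring pixels via a uniform "project" kernel (all constants illustrative).
import torch
import torch.nn.functional as F

def patch_wise_attack(model, x, y, eps=16/255, steps=10, beta=10.0, kernel_size=3):
    step = beta * eps / steps          # amplified step size
    gamma = step                       # projection strength (assumption)
    # Uniform project kernel with a zero at the centre, one copy per colour channel.
    k = torch.ones(1, 1, kernel_size, kernel_size)
    k[0, 0, kernel_size // 2, kernel_size // 2] = 0
    k = (k / k.sum()).repeat(x.shape[1], 1, 1, 1)

    delta = torch.zeros_like(x)
    amplified = torch.zeros_like(x)
    loss_fn = torch.nn.CrossEntropyLoss()

    for _ in range(steps):
        x_adv = (x + delta).clone().requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)

        amplified = amplified + step * grad.sign()
        # Part of the accumulated update that overflows the epsilon ball.
        overflow = torch.clamp(amplified.abs() - eps, min=0) * amplified.sign()
        # Redistribute the overflow to surrounding pixels (depthwise convolution).
        projected = F.conv2d(overflow, k, padding=kernel_size // 2, groups=x.shape[1])
        delta = torch.clamp(delta + step * grad.sign() + gamma * projected.sign(), -eps, eps)

    return torch.clamp(x + delta, 0, 1)
```

Redistributing the overflow, rather than simply clipping it away, concentrates the perturbation into patch-like regions, which the summaries above credit with improved transferability and success rates.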