Boosting Gradient for White-Box Adversarial Attacks
- URL: http://arxiv.org/abs/2010.10712v1
- Date: Wed, 21 Oct 2020 02:13:26 GMT
- Title: Boosting Gradient for White-Box Adversarial Attacks
- Authors: Hongying Liu, Zhenyu Zhou, Fanhua Shang, Xiaoyu Qi, Yuanyuan Liu,
Licheng Jiao
- Abstract summary: We propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms.
Our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misleading gradients.
- Score: 60.422511092730026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are playing key roles in various artificial
intelligence applications such as image classification and object recognition.
However, a growing number of studies have shown that there exist adversarial
examples in DNNs, which are almost imperceptibly different from original
samples, but can greatly change the network output. Existing white-box attack
algorithms can generate powerful adversarial examples. Nevertheless, most of
the algorithms concentrate on how to iteratively make the best use of gradients
to improve adversarial performance. In contrast, in this paper, we focus on the
properties of the widely-used ReLU activation function, and discover that there
exist two phenomena (i.e., wrong blocking and over-transmission) that mislead the
calculation of gradients through ReLU during backpropagation. Both issues
enlarge the gap between the change in the loss function predicted from the
gradient and the actual change, and mislead the gradients, which results in
larger perturbations. Therefore, we propose a universal adversarial
example generation method, called ADV-ReLU, to enhance the performance of
gradient-based white-box attack algorithms. During the backpropagation of the
network, our approach calculates the gradient of the loss function with respect
to the network input, maps the values to scores, and selects a part of them to update
the misleading gradients. Comprehensive experimental results on \emph{ImageNet}
demonstrate that our ADV-ReLU can be easily integrated into many
state-of-the-art gradient-based white-box attack algorithms, as well as
transferred to black-box attacks, to further decrease perturbations in
the $\ell_2$-norm.
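The abstract gives only the high-level recipe (compute the input gradient, map its values to scores, select a part of them to correct the misleading gradients) without implementation details. Below is a minimal, hypothetical PyTorch sketch of one gradient-correction step in that spirit; the magnitude-based scoring, the keep-top-fraction selection rule, and the names `corrected_gradient_step`, `top_fraction`, and `epsilon` are assumptions for illustration, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def corrected_gradient_step(model, x, y, epsilon=0.01, top_fraction=0.8):
    # Gradient of the loss with respect to the network input.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    g = torch.autograd.grad(loss, x)[0]

    # Map gradient values to scores (here simply their absolute magnitude,
    # an assumption) and keep only the highest-scoring fraction of the
    # components, zeroing the rest as potentially misleading.
    scores = g.abs().flatten(1)
    k = max(1, int(top_fraction * scores.shape[1]))
    kth = scores.topk(k, dim=1).values[:, -1:]
    kth = kth.reshape(-1, *([1] * (g.dim() - 1)))
    g = torch.where(g.abs() >= kth, g, torch.zeros_like(g))

    # One l2-normalized ascent step on the corrected gradient.
    g_norm = g.flatten(1).norm(dim=1).clamp_min(1e-12)
    g_norm = g_norm.reshape(-1, *([1] * (g.dim() - 1)))
    return (x + epsilon * g / g_norm).detach()
```

In the paper itself, ADV-ReLU targets the wrong-blocking and over-transmission behaviour of ReLU during backpropagation; the sketch above only mirrors the generic compute-score-select structure stated in the abstract, as a per-step update for an iterative $\ell_2$ attack loop.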
Related papers
- Rethinking PGD Attack: Is Sign Function Necessary? [131.6894310945647]
We present a theoretical analysis of how such sign-based update algorithm influences step-wise attack performance.
We propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.
The effectiveness of the proposed RGD algorithm has been demonstrated extensively in experiments (a sketch contrasting sign-based and raw-gradient update steps appears after this list).
arXiv Detail & Related papers (2023-12-03T02:26:58Z)
- Dynamics-aware Adversarial Attack of Adaptive Neural Networks [75.50214601278455]
We investigate the dynamics-aware adversarial attack problem of adaptive neural networks.
We propose a Leaded Gradient Method (LGM) and show the significant effects of the lagged gradient.
Our LGM achieves impressive adversarial attack performance compared with the dynamic-unaware attack methods.
arXiv Detail & Related papers (2022-10-15T01:32:08Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Visual Explanations from Deep Networks via Riemann-Stieltjes Integrated Gradient-based Localization [0.24596929878045565]
We introduce a new technique to produce visual explanations for the predictions of a CNN.
Our method can be applied to any layer of the network, and like Integrated Gradients it is not affected by the problem of vanishing gradients.
Compared to Grad-CAM, heatmaps produced by our algorithm are better focused in the areas of interest, and their numerical computation is more stable.
arXiv Detail & Related papers (2022-05-22T18:30:38Z)
- Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks [19.917677500613788]
Gradient-based approaches generally use the $sign$ function to generate perturbations at the end of the process.
We propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM) to improve the transferability of crafted adversarial examples.
arXiv Detail & Related papers (2022-04-06T15:12:20Z)
- Adaptive Perturbation for Adversarial Attack [50.77612889697216]
We propose a new gradient-based attack method for adversarial examples.
We use the exact gradient direction with a scaling factor for generating adversarial perturbations.
Our method exhibits higher transferability and outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-11-27T07:57:41Z)
- IWA: Integrated Gradient based White-box Attacks for Fooling Deep Neural Networks [4.739554342067529]
We propose two gradient-based white-box adversarial example generation algorithms (IWA): IFPA and IUA.
We verify the effectiveness of the proposed algorithms on both structured and unstructured datasets, and we compare them with five baseline generation algorithms.
arXiv Detail & Related papers (2021-02-03T16:10:42Z)
- Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm -- a black-box attack towards mainstream normally trained and defense models.
We significantly improve the success rate by 9.2% for defense models and 3.7% for normally trained models on average.
arXiv Detail & Related papers (2020-07-14T01:50:22Z)
- Towards Sharper First-Order Adversary with Quantized Gradients [43.02047596005796]
Adversarial training has been the most successful defense against adversarial attacks.
In state-of-the-art first-order attacks, adversarial examples with sign gradients retain the sign information of each gradient component but discard the relative magnitude between components.
Gradient quantization not only preserves the sign information, but also keeps the relative magnitude between components.
arXiv Detail & Related papers (2020-02-01T14:33:51Z)
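Several of the entries above, the RGD paper and the quantized-gradients paper in particular, turn on how much of the raw gradient information a first-order attack keeps at each step. The sketch below contrasts the two update rules being discussed: a sign-based step and a raw-gradient step. It is illustrative only; the per-example l2 normalization and the names `sign_step` and `raw_gradient_step` are assumptions, not drawn from either paper.

```python
import torch

def sign_step(x, grad, alpha):
    # FGSM/PGD-style update: only the sign of each gradient component is used,
    # so the relative magnitudes between components are discarded.
    return x + alpha * grad.sign()

def raw_gradient_step(x, grad, alpha, eps=1e-12):
    # Raw-gradient update in the spirit of the RGD entry above: step along the
    # gradient itself (l2-normalized per example, an assumed choice), keeping
    # the relative magnitudes that the sign function drops.
    norm = grad.flatten(1).norm(dim=1).clamp_min(eps)
    norm = norm.reshape(-1, *([1] * (grad.dim() - 1)))
    return x + alpha * grad / norm
```

Gradient quantization, as summarized in the last entry, sits between these two: by discretizing the gradient rather than reducing it to its sign, it keeps both the sign information and an approximation of the relative magnitudes between components.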