Boosting Gradient for White-Box Adversarial Attacks
- URL: http://arxiv.org/abs/2010.10712v1
- Date: Wed, 21 Oct 2020 02:13:26 GMT
- Title: Boosting Gradient for White-Box Adversarial Attacks
- Authors: Hongying Liu, Zhenyu Zhou, Fanhua Shang, Xiaoyu Qi, Yuanyuan Liu,
Licheng Jiao
- Abstract summary: We propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient-based white-box attack algorithms.
Our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misleading gradients.
- Score: 60.422511092730026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) are playing key roles in various artificial
intelligence applications such as image classification and object recognition.
However, a growing number of studies have shown that there exist adversarial
examples in DNNs, which are almost imperceptibly different from original
samples, but can greatly change the network output. Existing white-box attack
algorithms can generate powerful adversarial examples. Nevertheless, most of
the algorithms concentrate on how to iteratively make the best use of gradients
to improve adversarial performance. In contrast, in this paper, we focus on the
properties of the widely-used ReLU activation function, and discover that there
exist two phenomena (i.e., wrong blocking and over-transmission) that mislead the
calculation of gradients through ReLU during backpropagation. Both issues
enlarge the gap between the change in the loss function predicted from the
gradient and the actual change, and mislead the gradients, which results in
larger perturbations. Therefore, we propose a universal adversarial
example generation method, called ADV-ReLU, to enhance the performance of
gradient-based white-box attack algorithms. During the backpropagation of the
network, our approach calculates the gradient of the loss function with respect
to the network input, maps the values to scores, and selects a part of them to update
the misleading gradients. Comprehensive experimental results on \emph{ImageNet}
demonstrate that our ADV-ReLU can be easily integrated into many
state-of-the-art gradient-based white-box attack algorithms, as well as
transferred to black-box attacks, to further decrease perturbations in
the $\ell_2$-norm.
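The abstract gives only the high-level recipe (compute the input gradient, map its values to scores, select a part of them to correct the misleading gradients) without implementation details. Below is a minimal, hypothetical PyTorch sketch of one gradient-correction step in that spirit; the magnitude-based scoring, the keep-top-fraction selection rule, and the names `corrected_gradient_step`, `top_fraction`, and `epsilon` are assumptions for illustration, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def corrected_gradient_step(model, x, y, epsilon=0.01, top_fraction=0.8):
    # Gradient of the loss with respect to the network input.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    g = torch.autograd.grad(loss, x)[0]

    # Map gradient values to scores (here simply their absolute magnitude,
    # an assumption) and keep only the highest-scoring fraction of the
    # components, zeroing the rest as potentially misleading.
    scores = g.abs().flatten(1)
    k = max(1, int(top_fraction * scores.shape[1]))
    kth = scores.topk(k, dim=1).values[:, -1:]
    kth = kth.reshape(-1, *([1] * (g.dim() - 1)))
    g = torch.where(g.abs() >= kth, g, torch.zeros_like(g))

    # One l2-normalized ascent step on the corrected gradient.
    g_norm = g.flatten(1).norm(dim=1).clamp_min(1e-12)
    g_norm = g_norm.reshape(-1, *([1] * (g.dim() - 1)))
    return (x + epsilon * g / g_norm).detach()
```

In the paper itself, ADV-ReLU targets the wrong-blocking and over-transmission behaviour of ReLU during backpropagation; the sketch above only mirrors the generic compute-score-select structure stated in the abstract, as a per-step update for an iterative $\ell_2$ attack loop.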
Related papers
- Rethinking PGD Attack: Is Sign Function Necessary? [131.6894310945647]
We present a theoretical analysis of how such sign-based update algorithm influences step-wise attack performance.
We propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.
The effectiveness of the proposed RGD algorithm has been demonstrated extensively in experiments (a sketch contrasting sign-based and raw-gradient update steps appears after this list).
arXiv Detail & Related papers (2023-12-03T02:26:58Z)
- Dynamics-aware Adversarial Attack of Adaptive Neural Networks [75.50214601278455]
We investigate the dynamics-aware adversarial attack problem of adaptive neural networks.
We propose a Leaded Gradient Method (LGM) and show the significant effects of the lagged gradient.
Our LGM achieves impressive adversarial attack performance compared with the dynamic-unaware attack methods.
arXiv Detail & Related papers (2022-10-15T01:32:08Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- Visual Explanations from Deep Networks via Riemann-Stieltjes Integrated Gradient-based Localization [0.24596929878045565]
We introduce a new technique to produce visual explanations for the predictions of a CNN.
Our method can be applied to any layer of the network, and like Integrated Gradients it is not affected by the problem of vanishing gradients.
Compared to Grad-CAM, heatmaps produced by our algorithm are better focused in the areas of interest, and their numerical computation is more stable.
arXiv Detail & Related papers (2022-05-22T18:30:38Z)
- Sampling-based Fast Gradient Rescaling Method for Highly Transferable Adversarial Attacks [19.917677500613788]
Gradient-based approaches generally use the $sign$ function to generate perturbations at the end of the process.
We propose a Sampling-based Fast Gradient Rescaling Method (S-FGRM) to improve the transferability of crafted adversarial examples.
arXiv Detail & Related papers (2022-04-06T15:12:20Z)
- Adaptive Perturbation for Adversarial Attack [50.77612889697216]
We propose a new gradient-based attack method for adversarial examples.
We use the exact gradient direction with a scaling factor for generating adversarial perturbations.
Our method exhibits higher transferability and outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-11-27T07:57:41Z)
- IWA: Integrated Gradient based White-box Attacks for Fooling Deep Neural Networks [4.739554342067529]
We propose two gradient-based white-box adversarial example generation algorithms (IWA): IFPA and IUA.
We verify the effectiveness of the proposed algorithms on both structured and unstructured datasets, and we compare them with five baseline generation algorithms.
arXiv Detail & Related papers (2021-02-03T16:10:42Z)
- Patch-wise Attack for Fooling Deep Neural Network [153.59832333877543]
We propose a patch-wise iterative algorithm -- a black-box attack towards mainstream normally trained and defense models.
We significantly improve the success rate by 9.2% for defense models and 3.7% for normally trained models on average.
arXiv Detail & Related papers (2020-07-14T01:50:22Z)
- Towards Sharper First-Order Adversary with Quantized Gradients [43.02047596005796]
Adversarial training has been the most successful defense against adversarial attacks.
In state-of-the-art first-order attacks, adversarial examples with sign gradients retain the sign information of each gradient component but discard the relative magnitude between components.
Gradient quantization not only preserves the sign information, but also keeps the relative magnitude between components.
arXiv Detail & Related papers (2020-02-01T14:33:51Z)
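Several of the entries above, the RGD paper and the quantized-gradients paper in particular, turn on how much of the raw gradient information a first-order attack keeps at each step. The sketch below contrasts the two update rules being discussed: a sign-based step and a raw-gradient step. It is illustrative only; the per-example l2 normalization and the names `sign_step` and `raw_gradient_step` are assumptions, not drawn from either paper.

```python
import torch

def sign_step(x, grad, alpha):
    # FGSM/PGD-style update: only the sign of each gradient component is used,
    # so the relative magnitudes between components are discarded.
    return x + alpha * grad.sign()

def raw_gradient_step(x, grad, alpha, eps=1e-12):
    # Raw-gradient update in the spirit of the RGD entry above: step along the
    # gradient itself (l2-normalized per example, an assumed choice), keeping
    # the relative magnitudes that the sign function drops.
    norm = grad.flatten(1).norm(dim=1).clamp_min(eps)
    norm = norm.reshape(-1, *([1] * (grad.dim() - 1)))
    return x + alpha * grad / norm
```

Gradient quantization, as summarized in the last entry, sits between these two: by discretizing the gradient rather than reducing it to its sign, it keeps both the sign information and an approximation of the relative magnitudes between components.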