Improving Adversarial Transferability with Gradient Refining
- URL: http://arxiv.org/abs/2105.04834v1
- Date: Tue, 11 May 2021 07:44:29 GMT
- Title: Improving Adversarial Transferability with Gradient Refining
- Authors: Guoqiu Wang, Huanqian Yan, Ying Guo, Xingxing Wei
- Abstract summary: Deep neural networks are vulnerable to adversarial examples, which are crafted by adding human-imperceptible perturbations to original images.
- Score: 7.045900712659982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are vulnerable to adversarial examples, which are
crafted by adding human-imperceptible perturbations to original images. Most
existing adversarial attack methods achieve nearly 100% attack success rates
under the white-box setting, but only achieve relatively low attack success
rates under the black-box setting. To improve the transferability of
adversarial examples for the black-box setting, several methods have been
proposed, e.g., input diversity, translation-invariant attack, and
momentum-based attack. In this paper, we propose a method named Gradient
Refining, which can further improve the adversarial transferability by
correcting useless gradients introduced by input diversity through multiple
transformations. Our method is generally applicable to many gradient-based
attack methods combined with input diversity. Extensive experiments are
conducted on the ImageNet dataset, and our method achieves an average transfer
success rate of 82.07% for three different models under the single-model
setting, outperforming the other state-of-the-art methods by an average margin
of 6.0%. We also applied the proposed method to the CVPR 2021 Unrestricted
Adversarial Attacks on ImageNet competition organized by Alibaba and won second
place in attack success rate among 1558 teams.
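The abstract does not spell out the algorithm, but the idea it describes (suppressing gradients that are useful only for a particular random transformation by aggregating gradients over several transformed copies of the input) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the resize-and-pad transformation (as in the diverse-input attack), the number of copies, and all hyperparameters are placeholders.

```python
import torch
import torch.nn.functional as F

def input_diversity(x, low=0.9, prob=0.7):
    # Randomly resize the image down to between `low`*H and H, then pad back
    # to the original size at a random offset (applied with probability `prob`).
    if torch.rand(1).item() > prob:
        return x
    _, _, h, w = x.shape
    new_h = int(h * (low + (1.0 - low) * torch.rand(1).item()))
    new_w = int(w * (low + (1.0 - low) * torch.rand(1).item()))
    resized = F.interpolate(x, size=(new_h, new_w), mode="nearest")
    top = int(torch.randint(0, h - new_h + 1, (1,)).item())
    left = int(torch.randint(0, w - new_w + 1, (1,)).item())
    return F.pad(resized, (left, w - new_w - left, top, h - new_h - top))

def averaged_gradient_attack(model, x, y, eps=16 / 255, steps=10, num_copies=5):
    # Iterative sign-gradient attack whose update direction is the average of
    # gradients computed on several randomly transformed copies of the input,
    # so that gradients useful only under one transformation tend to cancel out.
    alpha = eps / steps
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.zeros_like(x)
        for _ in range(num_copies):
            loss = F.cross_entropy(model(input_diversity(x_adv)), y)
            grad = grad + torch.autograd.grad(loss, x_adv)[0]
        grad = grad / num_copies
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
        x_adv = x_adv.detach()
    return x_adv
```

A typical call would be x_adv = averaged_gradient_attack(model, images, labels) on an ImageNet batch scaled to [0, 1]; in practice such averaging is combined with momentum and evaluated by measuring attack success on unseen target models.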
Related papers
- Boosting Adversarial Attacks by Leveraging Decision Boundary Information [68.07365511533675]
Gradients of different models are more similar on the decision boundary than at the original position.
We propose a Boundary Fitting Attack to improve transferability.
Our method obtains an average attack success rate of 58.2%, which is 10.8% higher than other state-of-the-art transfer-based attacks.
arXiv Detail & Related papers (2023-03-10T05:54:11Z)
- Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples [89.85593878754571]
The transferability of adversarial examples across deep neural networks is the crux of many black-box attacks.
We advocate attacking a Bayesian model to achieve desirable transferability.
Our method outperforms recent state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-02-10T07:08:13Z)
- Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based Prior [50.393092185611536]
We consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model.
Previous methods attempted to approximate the true gradient either by using the transfer gradient of a surrogate white-box model or by relying on the feedback of model queries.
We propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging.
arXiv Detail & Related papers (2022-03-13T04:06:27Z)
- Adaptive Perturbation for Adversarial Attack [50.77612889697216]
We propose a new gradient-based attack method for adversarial examples.
We use the exact gradient direction with a scaling factor for generating adversarial perturbations.
Our method exhibits higher transferability and outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-11-27T07:57:41Z)
- Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks [56.96241557830253]
Transfer-based adversarial attacks can effectively evaluate model robustness in the black-box setting.
We propose a conditional generative attacking model, which can generate adversarial examples targeted at different classes.
Our method improves the success rates of targeted black-box attacks by a significant margin over the existing methods.
arXiv Detail & Related papers (2021-07-05T06:17:47Z)
- Enhancing the Transferability of Adversarial Attacks through Variance Tuning [6.5328074334512]
We propose a new method called variance tuning to enhance the class of iterative gradient-based attack methods.
Empirical results on the standard ImageNet dataset demonstrate that our method could significantly improve the transferability of gradient-based adversarial attacks.
arXiv Detail & Related papers (2021-03-29T12:41:55Z)
- Boosting Adversarial Transferability through Enhanced Momentum [50.248076722464184]
Deep learning models are vulnerable to adversarial examples crafted by adding human-imperceptible perturbations to benign images.
Various momentum iterative gradient-based methods have been shown to be effective in improving adversarial transferability.
We propose an enhanced momentum iterative gradient-based method to further improve adversarial transferability (a minimal sketch of the basic momentum update appears after this list).
arXiv Detail & Related papers (2021-03-19T03:10:32Z)
- Random Transformation of Image Brightness for Adversarial Attack [5.405413975396116]
Deep neural networks are vulnerable to adversarial examples, which are crafted by adding small, human-imperceptible perturbations to the original images.
We propose an adversarial example generation method based on this phenomenon, which can be integrated with the Fast Gradient Sign Method.
Our method has a higher success rate for black-box attacks than other attack methods based on data augmentation.
arXiv Detail & Related papers (2021-01-12T07:00:04Z)
- Improving the Transferability of Adversarial Examples with the Adam Optimizer [11.210560572849383]
This study combines an improved Adam gradient descent algorithm with the iterative gradient-based attack method.
Experiments on ImageNet showed that the proposed method offers a higher attack success rate than existing iterative methods.
Our best black-box attack achieved a success rate of 81.9% on a normally trained network and 38.7% on an adversarially trained network.
arXiv Detail & Related papers (2020-12-01T15:18:19Z)
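As referenced in the enhanced-momentum entry above, the momentum-based attacks in this list build on the standard MI-FGSM update, in which normalized gradients are accumulated across iterations before each sign step. The sketch below shows only that basic update, not the enhanced variant proposed in that paper; the decay factor `mu`, step count, and perturbation budget are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mi_fgsm(model, x, y, eps=16 / 255, steps=10, mu=1.0):
    # Basic momentum iterative FGSM: accumulate gradients normalized by their
    # mean absolute value (proportional to the L1 norm) with decay factor `mu`,
    # then take sign steps inside an L-infinity ball of radius `eps`.
    alpha = eps / steps
    x_adv = x.clone().detach()
    momentum = torch.zeros_like(x)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        grad = grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
        momentum = mu * momentum + grad
        with torch.no_grad():
            x_adv = x_adv + alpha * momentum.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0.0, 1.0)
        x_adv = x_adv.detach()
    return x_adv
```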
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.