Fuzziness-tuned: Improving the Transferability of Adversarial Examples
- URL: http://arxiv.org/abs/2303.10078v1
- Date: Fri, 17 Mar 2023 16:00:18 GMT
- Title: Fuzziness-tuned: Improving the Transferability of Adversarial Examples
- Authors: Xiangyuan Yang, Jie Lin, Hanlin Zhang, Xinyu Yang, Peng Zhao
- Abstract summary: Adversarial examples have been widely used to enhance the robustness of models trained on deep neural networks.
The attack success rate of transfer-based attacks on the surrogate model is much higher than that on the victim model under low attack strength.
A fuzziness-tuned method is proposed to ensure that the generated adversarial examples can effectively escape the fuzzy domain.
- Score: 18.880398046794138
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: With the development of adversarial attacks, adversarial examples have been widely used to enhance the robustness of models trained on deep neural networks. Although considerable effort has gone into improving the transferability of adversarial examples, the attack success rate of transfer-based attacks on the surrogate model remains much higher than that on the victim model under low attack strength (e.g., the attack strength $\epsilon=8/255$). In this paper, we first systematically investigate this issue and find that the enormous difference in attack success rates between the surrogate model and the victim model is caused by the existence of a special area (called the fuzzy domain in our paper), in which adversarial examples are classified wrongly by the surrogate model but correctly by the victim model. Then, to eliminate this difference in attack success rates and thereby improve the transferability of the generated adversarial examples, a fuzziness-tuned method consisting of a confidence scaling mechanism and a temperature scaling mechanism is proposed to ensure that the generated adversarial examples can effectively escape the fuzzy domain. The confidence scaling mechanism and the temperature scaling mechanism collaboratively tune the fuzziness of the generated adversarial examples by adjusting the gradient descent weight of the fuzziness and stabilizing the update direction, respectively. Moreover, the proposed fuzziness-tuned method can be integrated with existing adversarial attacks to further improve the transferability of adversarial examples without changing the time complexity. Extensive experiments demonstrate that the fuzziness-tuned method effectively enhances the transferability of adversarial examples in the latest transfer-based attacks.
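To make the two mechanisms concrete, below is a minimal sketch of how temperature scaling and a confidence-scaling weight could be folded into an iterative FGSM-style attack. This is an illustrative reading of the abstract only: the function name `fuzziness_tuned_ifgsm`, the hyper-parameters `temperature` and `conf_scale`, and the way the loss is shaped are assumptions, not the paper's exact formulation.

```python
# Illustrative sketch (PyTorch). Assumes a classifier `model`, inputs `x` in [0, 1],
# and true labels `y`. Hyper-parameter names and defaults are assumptions.
import torch
import torch.nn.functional as F

def fuzziness_tuned_ifgsm(model, x, y, eps=8/255, alpha=2/255, steps=10,
                          temperature=5.0, conf_scale=2.0):
    """Iterative FGSM where the logits are temperature-scaled (intended to
    stabilize the update direction) and the loss is up-weighted by a
    confidence-scaling factor (intended to push examples out of the fuzzy
    domain of the surrogate)."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        # Temperature scaling: a softer softmax gives smoother, more stable gradients.
        loss = F.cross_entropy(logits / temperature, y)
        # Confidence scaling: amplify the gradient weight of the loss so the example
        # is pushed further past the surrogate's decision boundary.
        grad = torch.autograd.grad(conf_scale * loss, x_adv)[0]
        # Standard sign-gradient step with an L_inf projection back into the eps-ball.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```

Because only the loss and its weighting change, this integrates with existing iterative attacks (e.g., MI-FGSM-style momentum updates) without altering their time complexity, which matches the claim in the abstract.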
Related papers
- Boosting the Targeted Transferability of Adversarial Examples via Salient Region & Weighted Feature Drop [2.176586063731861]
A prevalent approach for adversarial attacks relies on the transferability of adversarial examples.
A novel framework based on Salient region & Weighted Feature Drop (SWFD) is designed to enhance the targeted transferability of adversarial examples.
arXiv Detail & Related papers (2024-11-11T08:23:37Z) - Transferable Adversarial Attacks on SAM and Its Downstream Models [87.23908485521439]
This paper explores the feasibility of adversarially attacking various downstream models fine-tuned from the Segment Anything Model (SAM).
To enhance the effectiveness of the adversarial attack towards models fine-tuned on unknown datasets, we propose a universal meta-initialization (UMI) algorithm.
arXiv Detail & Related papers (2024-10-26T15:04:04Z) - Improving Adversarial Transferability by Stable Diffusion [36.97548018603747]
Deep neural networks (DNNs) are susceptible to adversarial examples, which introduce imperceptible perturbations to benign samples, deceiving predictions.
We introduce a novel attack method called Stable Diffusion Attack Method (SDAM), which incorporates samples generated by Stable Diffusion to augment input images.
arXiv Detail & Related papers (2023-11-18T09:10:07Z) - LFAA: Crafting Transferable Targeted Adversarial Examples with
Low-Frequency Perturbations [25.929492841042666]
We present a novel approach to generate transferable targeted adversarial examples.
We exploit the vulnerability of deep neural networks to perturbations on high-frequency components of images.
Our proposed approach significantly outperforms state-of-the-art methods.
arXiv Detail & Related papers (2023-10-31T04:54:55Z) - Generating Adversarial Examples with Better Transferability via Masking
Unimportant Parameters of Surrogate Model [6.737574282249396]
We propose to improve the transferability of adversarial examples in transfer-based attacks via masking unimportant parameters (MUP).
The key idea in MUP is to refine the pretrained surrogate models to boost the transfer-based attack.
arXiv Detail & Related papers (2023-04-14T03:06:43Z) - Improving the Transferability of Adversarial Examples via Direction
Tuning [18.880398046794138]
In transfer-based adversarial attacks, adversarial examples are generated only on the surrogate models and are expected to achieve effective perturbation of the victim models.
A novel transfer-based attack, namely direction tuning attack, is proposed to decrease the update deviation in the large step length.
In addition, a network pruning method is proposed to smooth the decision boundary, thereby further decreasing the update oscillation and enhancing the transferability of the generated adversarial examples.
arXiv Detail & Related papers (2023-03-27T11:26:34Z) - Making Substitute Models More Bayesian Can Enhance Transferability of
Adversarial Examples [89.85593878754571]
The transferability of adversarial examples across deep neural networks is the crux of many black-box attacks.
We advocate attacking a Bayesian model to achieve desirable transferability.
Our method outperforms recent state-of-the-arts by large margins.
arXiv Detail & Related papers (2023-02-10T07:08:13Z) - Improving Adversarial Robustness to Sensitivity and Invariance Attacks
with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against adversarial samples crafted by minimally perturbing clean samples.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial
Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise.
Pre-processing methods may suffer from the robustness degradation effect.
A potential cause of this negative effect is that adversarial training examples are static and independent of the pre-processing model.
We propose a method called Joint Adversarial Training based Pre-processing (JATP) defense.
arXiv Detail & Related papers (2021-06-10T01:45:32Z) - Boosting Black-Box Attack with Partially Transferred Conditional
Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs).
We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases.
Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)