Perturbation Towards Easy Samples Improves Targeted Adversarial Transferability
- URL: http://arxiv.org/abs/2406.05535v1
- Date: Sat, 8 Jun 2024 17:33:23 GMT
- Title: Perturbation Towards Easy Samples Improves Targeted Adversarial Transferability
- Authors: Junqi Gao, Biqing Qi, Yao Li, Zhichang Guo, Dong Li, Yuming Xing, Dazhi Zhang,
- Abstract summary: A generative targeted attack strategy named Easy Sample Matching Attack (ESMA) is proposed.
ESMA has a higher success rate for targeted attacks and outperforms the SOTA generative method.
- Score: 6.631279159122179
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The transferability of adversarial perturbations provides an effective shortcut for black-box attacks. Targeted perturbations have greater practicality but are more difficult to transfer between models. In this paper, we experimentally and theoretically demonstrated that neural networks trained on the same dataset have more consistent performance in High-Sample-Density-Regions (HSDR) of each class instead of low sample density regions. Therefore, in the target setting, adding perturbations towards HSDR of the target class is more effective in improving transferability. However, density estimation is challenging in high-dimensional scenarios. Further theoretical and experimental verification demonstrates that easy samples with low loss are more likely to be located in HSDR. Perturbations towards such easy samples in the target class can avoid density estimation for HSDR location. Based on the above facts, we verified that adding perturbations to easy samples in the target class improves targeted adversarial transferability of existing attack methods. A generative targeted attack strategy named Easy Sample Matching Attack (ESMA) is proposed, which has a higher success rate for targeted attacks and outperforms the SOTA generative method. Moreover, ESMA requires only 5% of the storage space and much less computation time comparing to the current SOTA, as ESMA attacks all classes with only one model instead of seperate models for each class. Our code is available at https://github.com/gjq100/ESMA.
Related papers
- Improving Transferable Targeted Adversarial Attack via Normalized Logit Calibration and Truncated Feature Mixing [26.159434438078968]
We propose two techniques for improving the targeted transferability from the loss and feature aspects.
In previous approaches, logit calibrations primarily focus on the logit margin between the targeted class and the untargeted classes among samples.
We introduce a new normalized logit calibration method that jointly considers the logit margin and the standard deviation of logits.
arXiv Detail & Related papers (2024-05-10T09:13:57Z) - Exploring Frequencies via Feature Mixing and Meta-Learning for Improving Adversarial Transferability [26.159434438078968]
We introduce a frequency decomposition-based feature mixing method to exploit frequency characteristics in both clean and adversarial samples.
Our findings suggest that incorporating features of clean samples into adversarial features extracted from adversarial examples is more effective in attacking normally-trained models.
We propose a cross-frequency meta-optimization approach comprising the meta-train step, meta-test step, and final update.
arXiv Detail & Related papers (2024-05-06T06:32:58Z) - SWAP: Exploiting Second-Ranked Logits for Adversarial Attacks on Time
Series [11.356275885051442]
Time series classification (TSC) has emerged as a critical task in various domains.
Deep neural models have shown superior performance in TSC tasks.
TSC models are vulnerable to adversarial attacks.
We propose SWAP, a novel attacking method for TSC models.
arXiv Detail & Related papers (2023-09-06T06:17:35Z) - Transferable Attack for Semantic Segmentation [59.17710830038692]
adversarial attacks, and observe that the adversarial examples generated from a source model fail to attack the target models.
We propose an ensemble attack for semantic segmentation to achieve more effective attacks with higher transferability.
arXiv Detail & Related papers (2023-07-31T11:05:55Z) - Logit Margin Matters: Improving Transferable Targeted Adversarial Attack
by Logit Calibration [85.71545080119026]
Cross-Entropy (CE) loss function is insufficient to learn transferable targeted adversarial examples.
We propose two simple and effective logit calibration methods, which are achieved by downscaling the logits with a temperature factor and an adaptive margin.
Experiments conducted on the ImageNet dataset validate the effectiveness of the proposed methods.
arXiv Detail & Related papers (2023-03-07T06:42:52Z) - Improving Adversarial Robustness to Sensitivity and Invariance Attacks
with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against samples crafted by minimally perturbing a sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z) - Scale-Equivalent Distillation for Semi-Supervised Object Detection [57.59525453301374]
Recent Semi-Supervised Object Detection (SS-OD) methods are mainly based on self-training, generating hard pseudo-labels by a teacher model on unlabeled data as supervisory signals.
We analyze the challenges these methods meet with the empirical experiment results.
We introduce a novel approach, Scale-Equivalent Distillation (SED), which is a simple yet effective end-to-end knowledge distillation framework robust to large object size variance and class imbalance.
arXiv Detail & Related papers (2022-03-23T07:33:37Z) - Boosting Transferability of Targeted Adversarial Examples via
Hierarchical Generative Networks [56.96241557830253]
Transfer-based adversarial attacks can effectively evaluate model robustness in the black-box setting.
We propose a conditional generative attacking model, which can generate the adversarial examples targeted at different classes.
Our method improves the success rates of targeted black-box attacks by a significant margin over the existing methods.
arXiv Detail & Related papers (2021-07-05T06:17:47Z) - Self-Damaging Contrastive Learning [92.34124578823977]
Unlabeled data in reality is commonly imbalanced and shows a long-tail distribution.
This paper proposes a principled framework called Self-Damaging Contrastive Learning to automatically balance the representation learning without knowing the classes.
Our experiments show that SDCLR significantly improves not only overall accuracies but also balancedness.
arXiv Detail & Related papers (2021-06-06T00:04:49Z) - DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of
Ensembles [20.46399318111058]
Adversarial attacks can mislead CNN models with small perturbations, which can effectively transfer between different models trained on the same dataset.
We propose DVERGE, which isolates the adversarial vulnerability in each sub-model by distilling non-robust features.
The novel diversity metric and training procedure enables DVERGE to achieve higher robustness against transfer attacks.
arXiv Detail & Related papers (2020-09-30T14:57:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.