Logit Margin Matters: Improving Transferable Targeted Adversarial Attack
by Logit Calibration
- URL: http://arxiv.org/abs/2303.03680v1
- Date: Tue, 7 Mar 2023 06:42:52 GMT
- Title: Logit Margin Matters: Improving Transferable Targeted Adversarial Attack
by Logit Calibration
- Authors: Juanjuan Weng, Zhiming Luo, Zhun Zhong, Shaozi Li, Nicu Sebe
- Abstract summary: The Cross-Entropy (CE) loss function is insufficient for learning transferable targeted adversarial examples.
We propose two simple and effective logit calibration methods, achieved by downscaling the logits with a temperature factor and with an adaptive margin, respectively.
Experiments conducted on the ImageNet dataset validate the effectiveness of the proposed methods.
- Score: 85.71545080119026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous works have extensively studied the transferability of
adversarial samples in untargeted black-box scenarios. However, it remains
challenging to craft targeted adversarial examples with higher transferability
than untargeted ones. Recent studies reveal that the traditional Cross-Entropy
(CE) loss function is insufficient to learn transferable targeted adversarial
examples due to vanishing gradients. In this work, we provide a comprehensive
investigation of the CE loss and find that the logit margin between the
targeted and untargeted classes quickly saturates under CE, which largely
limits transferability. To address this saturation, we aim to continually
increase the logit margin throughout optimization and propose two simple and
effective logit calibration methods, achieved by downscaling the logits with a
temperature factor and with an adaptive margin, respectively. Both effectively
encourage optimization to produce a larger logit margin and lead to higher
transferability. In addition, we show that minimizing the cosine distance
between the adversarial examples and the classifier weights of the target
class can further improve transferability, which benefits from downscaling the
logits via L2-normalization. Experiments on the ImageNet dataset validate the
effectiveness of the proposed methods, which outperform state-of-the-art
methods in black-box targeted attacks. The source code is available at
https://github.com/WJJLL/Target-Attack/
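As a concrete illustration of the three losses sketched in the abstract, here is a minimal PyTorch-style reading. The function names, the temperature value, the top-2 margin definition, and the stop-gradient on the margin are our assumptions for illustration; the authors' actual implementation lives in the linked repository.

```python
import torch
import torch.nn.functional as F

def temperature_ce_loss(logits, target, T=5.0):
    # CE on temperature-downscaled logits: dividing by T > 1 keeps the
    # softmax away from saturation, so the gradient on the logit margin
    # does not vanish as the attack optimization proceeds. T=5.0 is an
    # arbitrary illustrative choice, not the paper's tuned value.
    return F.cross_entropy(logits / T, target)

def adaptive_margin_ce_loss(logits, target):
    # CE on logits downscaled by an adaptive margin, here read as the gap
    # between the two largest logits (an assumption, not necessarily the
    # paper's exact definition). Rescaling by the current margin keeps the
    # loss pushing the margin up instead of saturating.
    top2 = logits.topk(2, dim=1).values            # (B, 2)
    margin = (top2[:, 0] - top2[:, 1]).detach()    # treat the scale as constant
    scaled = logits / margin.clamp(min=1e-6).unsqueeze(1)
    return F.cross_entropy(scaled, target)

def cosine_logit_loss(features, classifier_weight, target):
    # Minimize the cosine distance between the adversarial example's
    # penultimate features and the target class's classifier weights;
    # L2-normalizing both sides is itself a form of logit downscaling.
    f = F.normalize(features, dim=1)                   # (B, D)
    w = F.normalize(classifier_weight[target], dim=1)  # (B, D)
    return 1.0 - (f * w).sum(dim=1).mean()
```

In an iterative transfer attack (e.g., an MI-FGSM-style loop), any one of these would replace the plain CE term as the objective minimized with respect to the input image.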
Related papers
- Perturbation Towards Easy Samples Improves Targeted Adversarial Transferability [6.631279159122179] (2024-06-08)
  A generative targeted attack strategy named Easy Sample Matching Attack (ESMA) is proposed.
  ESMA achieves a higher success rate for targeted attacks and outperforms the SOTA generative method.
- Improving Transferable Targeted Adversarial Attack via Normalized Logit Calibration and Truncated Feature Mixing [26.159434438078968] (2024-05-10)
  We propose two techniques for improving targeted transferability, from the loss and the feature aspects.
  Previous logit calibrations primarily focus on the logit margin between the targeted class and the untargeted classes.
  We introduce a new normalized logit calibration method that jointly considers the logit margin and the standard deviation of the logits (see the sketch after this list).
- Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based Prior [50.393092185611536] (2022-03-13)
  We consider the black-box adversarial setting, where the adversary must craft adversarial examples without access to the gradients of the target model.
  Previous methods approximated the true gradient either via the transfer gradient of a surrogate white-box model or via the feedback of model queries.
  We propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging.
- Towards Transferable Unrestricted Adversarial Examples with Minimum Changes [13.75751221823941] (2022-01-04)
  Transfer-based adversarial examples are one of the most important classes of black-box attacks.
  There is a trade-off between the transferability and the imperceptibility of the adversarial perturbation.
  We propose a geometry-aware framework to generate transferable adversarial examples with minimum changes.
- Adaptive Perturbation for Adversarial Attack [50.77612889697216] (2021-11-27)
  We propose a new gradient-based attack method for adversarial examples.
  We use the exact gradient direction with a scaling factor to generate adversarial perturbations.
  Our method exhibits higher transferability and outperforms the state-of-the-art methods.
- Boosting Transferability of Targeted Adversarial Examples via Hierarchical Generative Networks [56.96241557830253] (2021-07-05)
  Transfer-based adversarial attacks can effectively evaluate model robustness in the black-box setting.
  We propose a conditional generative attacking model that can generate adversarial examples targeted at different classes.
  Our method improves the success rates of targeted black-box attacks by a significant margin over existing methods.
- On Generating Transferable Targeted Perturbations [102.3506210331038] (2021-03-26)
  We propose a new generative approach for highly transferable targeted perturbations.
  Our approach matches the perturbed image distribution with that of the target class, leading to high targeted transferability rates.
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692] (2020-10-16)
  We propose a novel transfer learning algorithm introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
  TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer when fine-tuning the target model.
  Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
- Continuous Transfer Learning with Label-informed Distribution Alignment [42.34180707803632] (2020-06-05)
  We study a novel continuous transfer learning setting with a time-evolving target domain.
  One major challenge in continuous transfer learning is the potential occurrence of negative transfer.
  We propose a generic adversarial Variational Auto-encoder framework named TransLATE.
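For the normalized logit calibration entry above, one plausible, purely illustrative reading is to standardize each sample's logits by their standard deviation before applying CE, so the margin is measured in units of logit spread rather than raw scale. The function name and details below are our assumptions, not that paper's code.

```python
import torch
import torch.nn.functional as F

def normalized_logit_ce_loss(logits, target, eps=1e-6):
    # Hypothetical sketch: divide each sample's logits by their standard
    # deviation so the CE loss sees a scale-free logit margin.
    std = logits.std(dim=1, keepdim=True)
    return F.cross_entropy(logits / (std + eps), target)
```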