On Success and Simplicity: A Second Look at Transferable Targeted Attacks
- URL: http://arxiv.org/abs/2012.11207v2
- Date: Sat, 6 Feb 2021 15:18:35 GMT
- Title: On Success and Simplicity: A Second Look at Transferable Targeted Attacks
- Authors: Zhengyu Zhao, Zhuoran Liu, Martha Larson
- Abstract summary: We show that transferable targeted attacks converge slowly to optimal transferability and improve considerably when given more iterations.
An attack that simply maximizes the target logit performs surprisingly well, surpassing more complex losses and even achieving performance comparable to the state of the art.
- Score: 6.276791657895803
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: There is broad consensus among researchers studying adversarial examples that
it is extremely difficult to achieve transferable targeted attacks. Currently,
existing research strives for transferable targeted attacks by resorting to
complex losses and even massive training. In this paper, we take a second look
at transferable targeted attacks and show that their difficulty has been
overestimated due to a blind spot in the conventional evaluation procedures.
Specifically, current work has unreasonably restricted attack optimization to a
few iterations. Here, we show that targeted attacks converge slowly to optimal
transferability and improve considerably when given more iterations. We also
demonstrate that an attack that simply maximizes the target logit performs
surprisingly well, remarkably surpassing more complex losses and even achieving
performance comparable to the state of the art, which requires massive training
with a sophisticated multi-term loss. We provide further validation of our
logit attack in a realistic ensemble setting and in a real-world attack against
the Google Cloud Vision API. The logit attack produces perturbations that
reflect the target semantics, which we demonstrate allows us to create targeted
universal adversarial perturbations without additional training images.
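The central technique is simple enough to sketch in a few lines: an ordinary iterative L-infinity attack whose loss is just the negative target logit, run for enough iterations to converge. The PyTorch sketch below illustrates this idea under our own assumptions; the function name, step size, iteration budget, and the omission of transfer-enhancing components such as momentum and input diversity are illustrative simplifications, not the paper's exact configuration.

```python
import torch

def logit_targeted_attack(model, x, target, eps=16/255, alpha=2/255, iters=300):
    """Sketch of a targeted attack that maximizes the target-class logit.

    Assumes `model` maps images in [0, 1] to raw logits (no softmax),
    `x` is a batch of clean images, and `target` holds the target labels.
    Hyperparameters are illustrative defaults, not the paper's settings.
    """
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        logits = model(x_adv)
        # The "logit" loss: the negative logit of the target class.
        loss = -logits.gather(1, target.unsqueeze(1)).sum()
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Signed-gradient step (descending this loss raises the target logit),
        # then project back into the eps-ball around the clean input.
        x_adv = x_adv.detach() - alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```

Consistent with the abstract's main point, the iteration budget matters: transferability in this setting is expected to keep improving well beyond the handful of iterations used in many earlier evaluations.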
Related papers
- Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z)
- Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z)
- Guidance Through Surrogate: Towards a Generic Diagnostic Attack [101.36906370355435]
We develop a guided mechanism to avoid local minima during attack optimization, leading to a novel attack dubbed Guided Projected Gradient Attack (G-PGA).
Our modified attack does not require random restarts, large number of attack iterations or search for an optimal step-size.
More than an effective attack, G-PGA can be used as a diagnostic tool to reveal elusive robustness due to gradient masking in adversarial defenses.
arXiv Detail & Related papers (2022-12-30T18:45:23Z)
- Improving Adversarial Robustness with Self-Paced Hard-Class Pair Reweighting [5.084323778393556]
Adversarial training with untargeted attacks is one of the most recognized defense methods.
We find that the naturally imbalanced inter-class semantic similarity makes hard-class pairs become virtual targets of each other.
We propose to upweight hard-class pair loss in model optimization, which prompts learning discriminative features from hard classes.
arXiv Detail & Related papers (2022-10-26T22:51:36Z)
- Reward Delay Attacks on Deep Reinforcement Learning [26.563537078924835]
We present novel attacks targeting Q-learning that exploit the implicit assumption that rewards are observed without delay.
We consider two types of attack goals: targeted attacks, which aim to cause a target policy to be learned, and untargeted attacks, which simply aim to induce a policy with a low reward.
arXiv Detail & Related papers (2022-09-08T02:40:44Z)
- Understanding Adversarial Attacks on Observations in Deep Reinforcement Learning [32.12283927682007]
Deep reinforcement learning models are vulnerable to adversarial attacks which can decrease the victim's total reward by manipulating the observations.
We reformulate the problem of adversarial attacks in function space and separate the previous gradient based attacks into several subspaces.
In the first stage, we train a deceptive policy by hacking the environment and discover a set of trajectories leading to the lowest reward.
Our method provides a tighter theoretical upper bound for the attacked agent's performance than the existing approaches.
arXiv Detail & Related papers (2021-06-30T07:41:51Z)
- Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss, that finds more suitable gradient-directions, increases attack efficacy and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
arXiv Detail & Related papers (2020-11-30T16:39:39Z)
- Learning to Attack with Fewer Pixels: A Probabilistic Post-hoc Framework for Refining Arbitrary Dense Adversarial Attacks [21.349059923635515]
Deep neural network image classifiers are reported to be susceptible to adversarial evasion attacks.
We propose a probabilistic post-hoc framework that refines given dense attacks by significantly reducing the number of perturbed pixels.
Our framework performs adversarial attacks much faster than existing sparse attacks.
arXiv Detail & Related papers (2020-10-13T02:51:10Z)
- Robust Tracking against Adversarial Attacks [69.59717023941126]
We first attempt to generate adversarial examples on top of video sequences to improve the tracking robustness against adversarial attacks.
We apply the proposed adversarial attack and defense approaches to state-of-the-art deep tracking algorithms.
arXiv Detail & Related papers (2020-07-20T08:05:55Z)
- Deflecting Adversarial Attacks [94.85315681223702]
We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that resembles the attack's target class.
We first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance.
arXiv Detail & Related papers (2020-02-18T06:59:13Z)