Investigating Top-$k$ White-Box and Transferable Black-box Attack
- URL: http://arxiv.org/abs/2204.00089v1
- Date: Wed, 30 Mar 2022 15:02:27 GMT
- Title: Investigating Top-$k$ White-Box and Transferable Black-box Attack
- Authors: Chaoning Zhang, Philipp Benz, Adil Karjauv, Jae Won Cho, Kang Zhang,
In So Kweon
- Abstract summary: We show that a stronger attack actually transfers better for the general top-$k$ ASR, as indicated by the interest class rank (ICR) after the attack.
We propose a new normalized CE loss that guides the logit to be updated in the direction of implicitly maximizing its rank distance from the ground-truth class.
- Score: 75.13902066331356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing works have identified the limitation of the top-$1$ attack success rate (ASR) as a metric for evaluating attack strength, but have investigated it exclusively in the white-box setting; our work extends it to the more practical black-box setting of transferable attacks. It is widely reported that the stronger I-FGSM transfers worse than the simple FGSM, leading to a popular belief that transferability is at odds with white-box attack strength. Our work challenges this belief with the empirical finding that a stronger attack actually transfers better for the general top-$k$ ASR, as indicated by the interest class rank (ICR) after the attack. To increase the attack strength, guided by an intuitive geometric interpretation of the logit gradient, we identify that the weakness of the commonly used losses lies in prioritizing the speed of fooling the network over maximizing the attack strength. To this end, we propose a new normalized CE loss that guides the logit to be updated in the direction of implicitly maximizing its rank distance from the ground-truth class. Extensive results in various settings verify that our proposed loss is simple yet effective for the top-$k$ attack. Code is available at:
\url{https://bit.ly/3uCiomP}
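As a concrete illustration of the proposed loss, below is a minimal PyTorch sketch. It assumes one plausible form of the normalization, namely the CE toward the ground-truth class minus the CE averaged over all classes; the function names, the I-FGSM driver, and all hyperparameters are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn.functional as F

def normalized_ce_loss(logits, target):
    # Cross-entropy toward the ground-truth class minus the cross-entropy
    # averaged over all classes. Subtracting the class-averaged term cancels
    # the gradient component shared by every class, so ascending this loss
    # pushes the ground-truth logit down *relative to* the other logits,
    # implicitly growing its rank distance from the ground-truth class.
    log_probs = F.log_softmax(logits, dim=-1)                    # (B, K)
    ce_true = -log_probs.gather(1, target.unsqueeze(1)).squeeze(1)
    ce_mean = -log_probs.mean(dim=-1)
    return (ce_true - ce_mean).mean()

def ifgsm(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    # Plain untargeted I-FGSM ascending the loss above (illustrative defaults).
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        grad = torch.autograd.grad(normalized_ce_loss(model(x_adv), y), x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)   # project to eps-ball
    return x_adv.detach()
```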
Related papers
- Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks attract significant interest for black-box applications.
Existing works essentially optimize a single-level objective directly w.r.t. the surrogate model.
We propose a bilevel optimization paradigm that explicitly reformulates the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker.
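To make the nesting concrete, here is a schematic, heavily hedged PyTorch sketch: the lower level refines the perturbation against the surrogate, and the upper level takes an outer step using a held-out pseudo-victim model. The update rules, step counts, and first-order approximation are assumptions, not the paper's algorithm.

```python
import torch
import torch.nn.functional as F

def bilevel_transfer_step(delta, x, y, surrogate, pseudo_victim,
                          inner_steps=5, inner_lr=1e-2, outer_lr=1e-2, eps=8 / 255):
    # Lower level (LL): refine the perturbation against the surrogate model.
    d = delta.clone().detach()
    for _ in range(inner_steps):
        d.requires_grad_(True)
        ll_loss = F.cross_entropy(surrogate(x + d), y)
        g = torch.autograd.grad(ll_loss, d)[0]
        d = (d.detach() + inner_lr * g.sign()).clamp(-eps, eps)
    # Upper level (UL): score the refined perturbation on the pseudo-victim
    # and step the *original* delta toward directions that transfer
    # (a first-order approximation of the nested objective).
    d.requires_grad_(True)
    ul_loss = F.cross_entropy(pseudo_victim(x + d), y)
    g_ul = torch.autograd.grad(ul_loss, d)[0]
    return (delta + outer_lr * g_ul.sign()).clamp(-eps, eps).detach()
```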
arXiv Detail & Related papers (2024-06-04T07:45:27Z)
- Universal Black-Box Reward Poisoning Attack against Offline Reinforcement Learning [4.629358641630161]
We study the problem of universal black-box reward poisoning attacks against general offline reinforcement learning with deep neural networks.
We propose the first universal black-box reward poisoning attack in the general offline RL setting.
arXiv Detail & Related papers (2024-02-15T04:08:49Z)
- Improving transferability of 3D adversarial attacks with scale and shear transformations [34.07511992559102]
This paper proposes Scale and Shear (SS) Attack to generate 3D adversarial examples with strong transferability.
Specifically, we randomly scale or shear the input point cloud, so that the attack will not overfit the white-box model.
Experiments show that the SS attack can be seamlessly combined with the existing state-of-the-art (SOTA) 3D point cloud attack methods.
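A minimal sketch of this style of random input transformation follows; the paper randomly scales or shears, while this sketch composes both for brevity, and the parameter ranges are illustrative assumptions rather than the paper's settings.

```python
import torch

def random_scale_shear(points, scale_range=(0.8, 1.2), shear_max=0.2):
    # points: (B, N, 3) point clouds. Randomly scale, then apply a small
    # random shear (x' = x + s * y) before each gradient step, so the
    # attack does not overfit the white-box surrogate model.
    b = points.shape[0]
    scale = torch.empty(b, 1, 1, device=points.device).uniform_(*scale_range)
    shear = torch.eye(3, device=points.device).repeat(b, 1, 1)
    shear[:, 1, 0] = torch.empty(b, device=points.device).uniform_(-shear_max, shear_max)
    return torch.bmm(points * scale, shear)
```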
arXiv Detail & Related papers (2022-11-02T13:09:38Z)
- Transfer Attacks Revisited: A Large-Scale Empirical Study in Real Computer Vision Settings [64.37621685052571]
We conduct the first systematic empirical study of transfer attacks against major cloud-based ML platforms.
The study leads to a number of interesting findings that are inconsistent with the existing ones.
We believe this work sheds light on the vulnerabilities of popular ML platforms and points to a few promising research directions.
arXiv Detail & Related papers (2022-04-07T12:16:24Z)
- Constrained Gradient Descent: A Powerful and Principled Evasion Attack Against Neural Networks [19.443306494201334]
We introduce several innovations that make white-box targeted attacks follow the intuition of the attacker's goal.
First, we propose a new loss function that explicitly captures the goal of targeted attacks.
Second, we propose a new attack method that uses a further developed version of our loss function, capturing both the misclassification objective and the $L_\infty$ distance limit.
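The exact CGD formulation is in the paper; as a hedged sketch of the general idea, a loss can pair a margin-style targeted-misclassification term with a soft penalty on perturbation mass that exceeds the $L_\infty$ budget. All names and coefficients below are assumptions.

```python
import torch
import torch.nn.functional as F

def cgd_style_loss(logits, target_class, delta, eps=8 / 255, lam=10.0):
    # Misclassification term: drive the target-class logit above the best
    # competing logit (a margin/CW-style targeted objective, minimized).
    onehot = F.one_hot(target_class, logits.shape[-1]).bool()
    best_other = logits.masked_fill(onehot, float('-inf')).amax(dim=-1)
    target_logit = logits.gather(1, target_class.unsqueeze(1)).squeeze(1)
    misclass = (best_other - target_logit).clamp(min=0)
    # Soft constraint term: penalize only the perturbation mass beyond the
    # L_inf budget, so in-budget perturbations are not discouraged.
    overflow = (delta.abs() - eps).clamp(min=0).flatten(1).sum(dim=-1)
    return (misclass + lam * overflow).mean()
```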
arXiv Detail & Related papers (2021-12-28T17:36:58Z)
- Adversarial Attacks on Gaussian Process Bandits [47.84198626686564]
We propose various adversarial attack methods with differing assumptions on the attacker's strength and prior information.
Our goal is to understand adversarial attacks on GP bandits from both a theoretical and practical perspective.
We demonstrate that adversarial attacks on GP bandits can succeed in forcing the algorithm towards $\mathcal{R}_{\rm target}$ even with a low attack budget.
arXiv Detail & Related papers (2021-10-16T02:39:10Z)
- Meta Gradient Adversarial Attack [64.5070788261061]
This paper proposes a novel architecture called Meta Gradient Adversarial Attack (MGAA), which is plug-and-play and can be integrated with any existing gradient-based attack method.
Specifically, we randomly sample multiple models from a model zoo to compose different tasks and iteratively simulate a white-box attack and a black-box attack in each task.
By narrowing the gap between the gradient directions of white-box and black-box attacks, the transferability of adversarial examples in the black-box setting can be improved.
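A schematic of one meta-iteration in this spirit follows; the sampling sizes, step sizes, and the way the simulated white-box and black-box updates are combined are assumptions for illustration, not the paper's algorithm.

```python
import random
import torch
import torch.nn.functional as F

def mgaa_style_step(x_adv, x, y, model_zoo, alpha=2 / 255, beta=2 / 255,
                    eps=8 / 255, k=3):
    # Simulated white-box task: an ensemble of sampled support models.
    support = random.sample(model_zoo, k)
    query = random.choice([m for m in model_zoo if m not in support])
    xa = x_adv.clone().detach().requires_grad_(True)
    white_loss = sum(F.cross_entropy(m(xa), y) for m in support)
    g_white = torch.autograd.grad(white_loss, xa)[0]
    x_white = (xa.detach() + alpha * g_white.sign()).requires_grad_(True)
    # Simulated black-box task: a held-out query model evaluates the
    # white-box update; its gradient nudges the perturbation toward
    # directions that also work beyond the support ensemble.
    g_black = torch.autograd.grad(F.cross_entropy(query(x_white), y), x_white)[0]
    x_new = x_adv + alpha * g_white.sign() + beta * g_black.sign()
    return (x + (x_new - x).clamp(-eps, eps)).clamp(0, 1).detach()
```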
arXiv Detail & Related papers (2021-08-09T17:44:19Z)
- Switching Transferable Gradient Directions for Query-Efficient Black-Box Adversarial Attacks [38.91061792696202]
We propose a simple and highly query-efficient black-box adversarial attack named SWITCH.
SWITCH features a highly efficient and effective utilization of the gradient of a surrogate model.
Experimental results conducted on CIFAR-10, CIFAR-100 and TinyImageNet show that SWITCH achieves a satisfactory attack success rate.
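The summary leaves the mechanism implicit; as a hedged reading of the name, one query-efficient trick in this spirit is to follow the surrogate gradient direction and switch to its opposite when a target query shows the loss did not increase. Everything below, including the black-box `query_loss` callable, is an assumption.

```python
import torch
import torch.nn.functional as F

def switch_style_step(x_adv, y, surrogate, query_loss, alpha=2 / 255):
    # Surrogate gradient direction: one backward pass, no target queries.
    xa = x_adv.clone().detach().requires_grad_(True)
    g = torch.autograd.grad(F.cross_entropy(surrogate(xa), y), xa)[0].sign()
    # Two target queries: does following the surrogate direction help?
    base = query_loss(x_adv)
    if query_loss(x_adv + alpha * g) >= base:
        return x_adv + alpha * g      # keep the surrogate direction
    return x_adv - alpha * g          # otherwise, switch to the opposite
```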
arXiv Detail & Related papers (2020-09-15T15:55:08Z)
- RayS: A Ray Searching Method for Hard-label Adversarial Attack [99.72117609513589]
We present the Ray Searching attack (RayS), which greatly improves the hard-label attack effectiveness as well as efficiency.
RayS attack can also be used as a sanity check for possible "falsely robust" models.
arXiv Detail & Related papers (2020-06-23T07:01:50Z)