Who's Afraid of Adversarial Transferability?
- URL: http://arxiv.org/abs/2105.00433v2
- Date: Wed, 5 May 2021 11:16:59 GMT
- Title: Who's Afraid of Adversarial Transferability?
- Authors: Ziv Katzir, Yuval Elovici
- Abstract summary: Adversarial transferability has long been the "big bad wolf" of adversarial machine learning.
We show that it is practically impossible to predict whether a given adversarial example is transferable to a specific target model in a black-box setting.
- Score: 43.80151929320557
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial transferability, namely the ability of adversarial perturbations
to simultaneously fool multiple learning models, has long been the "big bad
wolf" of adversarial machine learning. Successful transferability-based attacks
requiring no prior knowledge of the attacked model's parameters or training
data have been demonstrated numerous times in the past, implying that machine
learning models pose an inherent security threat to real-life systems. However,
all of the research performed in this area regarded transferability as a
probabilistic property and attempted to estimate the percentage of adversarial
examples that are likely to mislead a target model given some predefined
evaluation set. As a result, those studies ignored the fact that real-life
adversaries are often highly sensitive to the cost of a failed attack. We argue
that overlooking this sensitivity has led to an exaggerated perception of the
transferability threat, when in fact real-life transferability-based attacks
are quite unlikely. By combining theoretical reasoning with a series of
empirical results, we show that it is practically impossible to predict whether
a given adversarial example is transferable to a specific target model in a
black-box setting, hence questioning the validity of adversarial
transferability as a real-life attack tool for adversaries that are sensitive
to the cost of a failed attack.
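The abstract describes transferability as a rate measured over a predefined evaluation set. Below is a minimal, illustrative sketch (not the authors' code) of how such a rate is typically estimated: adversarial examples are crafted with single-step FGSM on a surrogate model, and the fraction that also fools an independently trained target model is counted. The torchvision models, the FGSM attack, and the `loader` of (image, label) batches in the [0, 1] pixel range are all assumptions made for illustration.

```python
# Minimal sketch (not from the paper): estimating an aggregate transfer rate.
# Assumptions: PyTorch + torchvision >= 0.13, images scaled to [0, 1],
# and a `loader` yielding (images, labels) batches.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18, vgg16

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1).to(DEVICE)
STD = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1).to(DEVICE)


class Normalized(torch.nn.Module):
    """Wrap a classifier so the attack can operate in [0, 1] pixel space."""

    def __init__(self, net):
        super().__init__()
        self.net = net

    def forward(self, x):
        return self.net((x - MEAN) / STD)


surrogate = Normalized(resnet18(weights="IMAGENET1K_V1")).to(DEVICE).eval()  # attacker's model
target = Normalized(vgg16(weights="IMAGENET1K_V1")).to(DEVICE).eval()        # victim model


def fgsm(model, x, y, eps=4 / 255):
    """Single-step FGSM perturbation crafted on the surrogate only."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()


@torch.no_grad()
def transfer_rate(loader):
    """Fraction of surrogate-crafted adversarial examples that also fool the target."""
    fooled, total = 0, 0
    for x, y in loader:
        x, y = x.to(DEVICE), y.to(DEVICE)
        with torch.enable_grad():  # gradients are needed only while crafting
            x_adv = fgsm(surrogate, x, y)
        fooled += (target(x_adv).argmax(dim=1) != y).sum().item()
        total += y.numel()
    # This aggregate rate says nothing about whether any *specific* example
    # transfers, which is exactly the gap the paper argues matters to attackers.
    return fooled / max(total, 1)
```

In this setup the target's weights are never used to craft the perturbation, matching the black-box threat model discussed in the abstract.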
Related papers
- Black-box Adversarial Transferability: An Empirical Study in Cybersecurity Perspective [0.0]
In adversarial machine learning, malicious users try to fool deep learning models by feeding adversarially perturbed inputs to the model during its training or testing phase.
We empirically test the black-box adversarial transferability phenomena in cyber attack detection systems.
The results indicate that any deep learning model is highly susceptible to adversarial attacks, even if the attacker does not have access to the internal details of the target model.
arXiv Detail & Related papers (2024-04-15T06:56:28Z) - Your Attack Is Too DUMB: Formalizing Attacker Scenarios for Adversarial Transferability [17.899587145780817]
Evasion attacks are a threat to machine learning models, where adversaries attempt to affect classifiers by injecting malicious samples.
We propose the DUMB attacker model, which allows analyzing whether evasion attacks fail to transfer when the training conditions of the surrogate and victim models differ.
Our analysis, which generated 13K tests over 14 distinct attacks, led to numerous novel findings in the scope of transferable attacks with surrogate models.
arXiv Detail & Related papers (2023-06-27T10:21:27Z) - Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack [53.032801921915436]
Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g., self-driving cars.
Recently, the robustness of skeleton-based HAR methods has been questioned due to their vulnerability to adversarial attacks.
We show such threats exist, even when the attacker only has access to the input/output of the model.
We propose the first black-box adversarial attack approach for skeleton-based HAR, called BASAR.
arXiv Detail & Related papers (2022-11-21T09:51:28Z) - Robust Transferable Feature Extractors: Learning to Defend Pre-Trained Networks Against White Box Adversaries [69.53730499849023]
We show that adversarial examples can be successfully transferred to another independently trained model to induce prediction errors.
We propose a deep learning-based pre-processing mechanism, which we refer to as a robust transferable feature extractor (RTFE).
arXiv Detail & Related papers (2022-09-14T21:09:34Z) - Resisting Deep Learning Models Against Adversarial Attack Transferability via Feature Randomization [17.756085566366167]
We propose a feature randomization-based approach that resists eight adversarial attacks targeting deep learning models.
Our methodology can secure the target network and resist adversarial attack transferability by over 60%.
arXiv Detail & Related papers (2022-09-11T20:14:12Z) - Rethinking Textual Adversarial Defense for Pre-trained Language Models [79.18455635071817]
A literature review shows that pre-trained language models (PrLMs) are vulnerable to adversarial attacks.
We propose a novel metric (Degree of Anomaly) to enable current adversarial attack approaches to generate more natural and imperceptible adversarial examples.
We show that our universal defense framework achieves comparable or even higher after-attack accuracy than other specific defenses.
arXiv Detail & Related papers (2022-07-21T07:51:45Z) - Learning to Learn Transferable Attack [77.67399621530052]
A transfer adversarial attack is a non-trivial black-box attack that crafts adversarial perturbations on a surrogate model and then applies those perturbations to the victim model.
We propose a Learning to Learn Transferable Attack (LLTA) method, which makes the adversarial perturbations more generalized via learning from both data and model augmentation.
Empirical results on the widely used dataset demonstrate the effectiveness of our attack method, achieving a 12.85% higher transfer attack success rate than state-of-the-art methods.
arXiv Detail & Related papers (2021-12-10T07:24:21Z) - Adversarial Transfer Attacks With Unknown Data and Class Overlap [19.901933940805684]
Current transfer attack research grants the attacker an unrealistic advantage.
We present the first study of transferring adversarial attacks that focuses on the data available to the attacker and victim under imperfect settings.
This threat model is relevant to applications in medicine, malware, and others.
arXiv Detail & Related papers (2021-09-23T03:41:34Z) - Localized Uncertainty Attacks [9.36341602283533]
We present localized uncertainty attacks against deep learning models.
We create adversarial examples by perturbing only regions in the inputs where a classifier is uncertain.
Unlike $\ell_p$-ball or functional attacks, which perturb inputs indiscriminately, our targeted changes can be less perceptible (a rough sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-06-17T03:07:22Z) - Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacks aim to fool deep neural networks with adversarial examples.
We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
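As referenced in the Localized Uncertainty Attacks entry above, here is a hedged sketch of the "perturb only where the classifier is uncertain" idea. It is not the authors' algorithm: uncertainty is approximated here by the gradient of the predictive entropy with respect to the input, the perturbation is restricted to the top 10% most uncertainty-sensitive pixels, and the model, `eps`, and `keep_frac` values are illustrative placeholders.

```python
# Hedged sketch of a localized, uncertainty-guided perturbation (illustrative
# only; not the exact method from the paper above). `model` is any
# differentiable classifier taking images in [0, 1].
import torch
import torch.nn.functional as F


def localized_uncertainty_attack(model, x, y, eps=8 / 255, keep_frac=0.10):
    # 1) Score each pixel by how strongly it influences the predictive entropy.
    x = x.clone().detach().requires_grad_(True)
    probs = F.softmax(model(x), dim=1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()
    ent_grad = torch.autograd.grad(entropy, x)[0]

    # 2) Keep only the top `keep_frac` fraction of pixels per example.
    flat = ent_grad.abs().flatten(1)
    k = max(1, int(keep_frac * flat.shape[1]))
    thresh = flat.topk(k, dim=1).values[:, -1].view(-1, 1, 1, 1)
    mask = (ent_grad.abs() >= thresh).float()

    # 3) Apply an FGSM-style step restricted to those uncertain regions.
    x = x.detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * mask * grad.sign()).clamp(0, 1).detach()
```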