Delving into Adversarial Transferability on Image Classification: Review, Benchmark, and Evaluation
- URL: http://arxiv.org/abs/2602.23117v1
- Date: Thu, 26 Feb 2026 15:30:36 GMT
- Title: Delving into Adversarial Transferability on Image Classification: Review, Benchmark, and Evaluation
- Authors: Xiaosen Wang, Zhijin Ge, Bohan Liu, Zheng Fang, Fengfan Zhou, Ruixuan Zhang, Shaokang Wang, Yuyang Luo,
- Abstract summary: Adversarial transferability refers to the capacity of adversarial examples generated on the surrogate model to deceive alternate, unexposed victim models. In this work, we discern a lack of a standardized framework and criteria for evaluating transfer-based attacks. We propose a comprehensive framework designed to serve as a benchmark for evaluating these attacks.
- Score: 12.423783318201778
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial transferability refers to the capacity of adversarial examples generated on the surrogate model to deceive alternate, unexposed victim models. This property eliminates the need for direct access to the victim model during an attack, thereby raising considerable security concerns in practical applications and attracting substantial research attention recently. In this work, we discern a lack of a standardized framework and criteria for evaluating transfer-based attacks, leading to potentially biased assessments of existing approaches. To rectify this gap, we have conducted an exhaustive review of hundreds of related works, organizing various transfer-based attacks into six distinct categories. Subsequently, we propose a comprehensive framework designed to serve as a benchmark for evaluating these attacks. In addition, we delineate common strategies that enhance adversarial transferability and highlight prevalent issues that could lead to unfair comparisons. Finally, we provide a brief review of transfer-based attacks beyond image classification.
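The transfer protocol the abstract describes (craft a perturbation with full access to a surrogate, then evaluate it on an unseen victim) can be sketched with toy linear classifiers. Everything below — the FGSM-style step, the linear "models", and all parameters — is an illustrative assumption, not the paper's benchmark:

```python
import numpy as np

# Toy sketch of a transfer-based attack: perturbations are crafted with
# gradient access to a surrogate model, then evaluated on an unseen victim.
# Both "models" are linear classifiers sign(w @ x); real attacks use DNNs.
rng = np.random.default_rng(0)
d, n, eps = 50, 200, 0.5

w_true = rng.normal(size=d)                       # data-generating direction
w_surrogate = w_true + 0.1 * rng.normal(size=d)   # attacker's model
w_victim = w_true + 0.1 * rng.normal(size=d)      # unseen target model

X = rng.normal(size=(n, d))
y = np.sign(X @ w_true)                           # labels in {-1, +1}

# FGSM-style step on the surrogate: for margin loss L = -y * (w @ x),
# the input gradient is -y * w, so x_adv = x + eps * sign(-y * w).
X_adv = X + eps * np.sign(-y[:, None] * w_surrogate[None, :])

clean_acc = np.mean(np.sign(X @ w_victim) == y)
adv_acc = np.mean(np.sign(X_adv @ w_victim) == y)
print(f"victim accuracy: clean={clean_acc:.2f}, adversarial={adv_acc:.2f}")
```

Because the surrogate and victim are trained on similar signal, a perturbation found on one degrades the other — the transferability phenomenon the survey organizes.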
Related papers
- Quantifying the Risk of Transferred Black Box Attacks [0.0]
Neural networks have become pervasive across various applications, including security-related products. This paper investigates the complexities involved in resilience testing against transferred adversarial attacks. We propose a targeted resilience testing framework that employs surrogate models strategically selected based on Centered Kernel Alignment (CKA) similarity.
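CKA is a concrete representation-similarity measure. A minimal linear-CKA sketch (the unbiased and kernel variants, and the surrogate-selection logic built on top of it, are beyond this toy example) might look like:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two feature matrices
    computed on the same n inputs: X is (n, d1), Y is (n, d2)."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    numerator = np.linalg.norm(Y.T @ X, "fro") ** 2
    denominator = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return numerator / denominator

# CKA is invariant to rotations of feature space: a rotated copy of the
# same representation scores ~1, while unrelated features score far lower.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 20))
Q, _ = np.linalg.qr(rng.normal(size=(20, 20)))   # random orthogonal matrix
rotated = A @ Q
unrelated = rng.normal(size=(100, 20))
print(linear_cka(A, rotated), linear_cka(A, unrelated))
```

The rotation invariance is what makes CKA useful for comparing models whose feature spaces are not aligned coordinate-by-coordinate.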
arXiv Detail & Related papers (2025-11-07T09:34:43Z) - On Transfer-based Universal Attacks in Pure Black-box Setting [94.92884394009288]
We study the role of prior knowledge of the target model data and number of classes in attack performance. We also provide several interesting insights based on our analysis, and demonstrate that priors cause overestimation in transferability scores.
arXiv Detail & Related papers (2025-04-11T10:41:20Z) - Benchmarking Transferable Adversarial Attacks [6.898135768312255]
The robustness of deep learning models against adversarial attacks remains a pivotal concern.
This study systematically categorizes and critically evaluates various methodologies developed to augment the transferability of adversarial attacks.
arXiv Detail & Related papers (2024-02-01T08:36:16Z) - Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z) - Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly [79.07074710460012]
The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z) - Revisiting Transferable Adversarial Images: Systemization, Evaluation, and New Insights [33.09769747941402]
Transferable adversarial images raise critical security concerns for computer vision systems in real-world, black-box attack scenarios. In this paper, we systemize transfer attacks into five categories around the general machine learning pipeline. We provide the first comprehensive evaluation, with 23 representative attacks against 11 representative defenses.
arXiv Detail & Related papers (2023-10-18T10:06:42Z) - Towards Good Practices in Evaluating Transfer Adversarial Attacks [23.40245805066479]
We present the first comprehensive evaluation of transfer attacks, covering 23 representative attacks against 9 defenses on ImageNet.
In particular, we propose to categorize existing attacks into five categories, which enables our systematic category-wise analyses.
We also pay particular attention to stealthiness, by adopting diverse imperceptibility metrics and looking into new, finer-grained characteristics.
arXiv Detail & Related papers (2022-11-17T14:40:31Z) - Towards Fair Classification against Poisoning Attacks [52.57443558122475]
We study the poisoning scenario where the attacker can insert a small fraction of samples into training data.
We propose a general and theoretically guaranteed framework which accommodates traditional defense methods to fair classification against poisoning attacks.
arXiv Detail & Related papers (2022-10-18T00:49:58Z) - Transferability Ranking of Adversarial Examples [20.41013432717447]
This paper introduces a ranking strategy that refines the transfer attack process.
By leveraging a set of diverse surrogate models, our method can predict transferability of adversarial examples.
Using our strategy, we were able to raise the transferability of adversarial examples from a mere 20% (akin to random selection) up to near upper-bound levels.
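The ranking idea can be sketched with toy linear models: score each adversarial example by how many held-out surrogate models it fools, then keep only the top-scoring ones. All the models, the crafting step, and the thresholds below are illustrative assumptions, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n, eps = 50, 200, 0.12

w_true = rng.normal(size=d)
w_attack = w_true + 0.3 * rng.normal(size=d)             # crafting surrogate
rankers = [w_true + 0.3 * rng.normal(size=d) for _ in range(5)]
w_victim = w_true + 0.3 * rng.normal(size=d)             # held-out target

X = rng.normal(size=(n, d))
y = np.sign(X @ w_true)
# FGSM-style examples crafted on the attack surrogate; eps is moderate,
# so only some examples actually transfer.
X_adv = X + eps * np.sign(-y[:, None] * w_attack[None, :])

def fooled(w):  # True where the model's prediction no longer matches y
    return np.sign(X_adv @ w) != y

score = sum(fooled(w).astype(int) for w in rankers)      # 0..5 per example
top = np.argsort(-score)[: n // 5]                       # top 20% by score

rate_all = fooled(w_victim).mean()
rate_top = fooled(w_victim)[top].mean()
print(f"victim fooling rate: all={rate_all:.2f}, top-ranked={rate_top:.2f}")
```

Examples that fool many diverse surrogates tend to fool the unseen victim as well, which is what makes ranking by surrogate agreement a useful selection signal.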
arXiv Detail & Related papers (2022-08-23T11:25:16Z) - An Intermediate-level Attack Framework on The Basis of Linear Regression [89.85593878754571]
This paper substantially extends our work published at ECCV, in which an intermediate-level attack was proposed to improve the transferability of some baseline adversarial examples.
We advocate to establish a direct linear mapping from the intermediate-level discrepancies (between adversarial features and benign features) to classification prediction loss of the adversarial example.
We show that 1) a variety of linear regression models can all be considered to establish the mapping, 2) the magnitude of the finally obtained intermediate-level discrepancy is linearly correlated with adversarial transferability, and 3) performance can be further boosted by performing multiple runs of the baseline attack with …
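The linear mapping this summary describes can be sketched with synthetic data: collect (intermediate-level discrepancy, prediction loss) pairs from multiple runs, fit ordinary least squares, and use the fitted direction as guidance. The data-generating model below is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(2)
d, n = 64, 500                  # feature dim, number of baseline-attack runs

g = rng.normal(size=d)          # unknown loss-sensitive feature direction
F = rng.normal(size=(n, d))     # intermediate-level discrepancies, one per run
loss = F @ g + 0.1 * rng.normal(size=n)   # prediction loss observed per run

# Ordinary least squares: w approximates the direction along which a larger
# intermediate-level discrepancy predicts a larger adversarial loss.
w, *_ = np.linalg.lstsq(F, loss, rcond=None)

cosine = (w @ g) / (np.linalg.norm(w) * np.linalg.norm(g))
print(f"cosine(fitted direction, true direction) = {cosine:.3f}")
```

With enough runs the regression recovers the loss-sensitive direction almost exactly, which is why pushing the intermediate-level discrepancy along it increases the adversarial loss, and with it, transferability.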
arXiv Detail & Related papers (2022-03-21T03:54:53Z) - Attack Transferability Characterization for Adversarially Robust Multi-label Classification [37.00606062677375]
This study focuses on non-targeted evasion attack against multi-label classifiers.
The goal of the attack is to cause misclassification with respect to as many labels as possible.
We unveil how the transferability level of the attack determines the attackability of the classifier.
arXiv Detail & Related papers (2021-06-29T12:50:20Z) - Towards Robust Fine-grained Recognition by Maximal Separation of Discriminative Features [72.72840552588134]
We identify the proximity of the latent representations of different classes in fine-grained recognition networks as a key factor to the success of adversarial attacks.
We introduce an attention-based regularization mechanism that maximally separates the discriminative latent features of different classes.
arXiv Detail & Related papers (2020-06-10T18:34:45Z) - Protecting Classifiers From Attacks [0.41942958779358663]
In multiple domains such as malware detection, automated driving systems, or fraud detection, classification algorithms are susceptible to being attacked. We present an alternative Bayesian decision theoretic framework that accounts for the uncertainty about the attacker's behavior. Globally, we are able to robustify statistical classification algorithms against malicious attacks.
arXiv Detail & Related papers (2020-04-18T21:21:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.