Related papers: On Transfer-based Universal Attacks in Pure Black-box Setting

On Transfer-based Universal Attacks in Pure Black-box Setting

URL: http://arxiv.org/abs/2504.08866v1
Date: Fri, 11 Apr 2025 10:41:20 GMT
Title: On Transfer-based Universal Attacks in Pure Black-box Setting
Authors: Mohammad A. A. K. Jalwana, Naveed Akhtar, Ajmal Mian, Nazanin Rahnavard, Mubarak Shah,
Abstract summary: We study the role of prior knowledge of the target model data and number of classes in attack performance.<n>We also provide several interesting insights based on our analysis, and demonstrate that priors cause overestimation in transferability scores.
Score: 94.92884394009288
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Despite their impressive performance, deep visual models are susceptible to transferable black-box adversarial attacks. Principally, these attacks craft perturbations in a target model-agnostic manner. However, surprisingly, we find that existing methods in this domain inadvertently take help from various priors that violate the black-box assumption such as the availability of the dataset used to train the target model, and the knowledge of the number of classes in the target model. Consequently, the literature fails to articulate the true potency of transferable black-box attacks. We provide an empirical study of these biases and propose a framework that aids in a prior-free transparent study of this paradigm. Using our framework, we analyze the role of prior knowledge of the target model data and number of classes in attack performance. We also provide several interesting insights based on our analysis, and demonstrate that priors cause overestimation in transferability scores. Finally, we extend our framework to query-based attacks. This extension inspires a novel image-blending technique to prepare data for effective surrogate model training.

Related papers

Membership Inference Attacks for Unseen Classes [34.7456021961386]
We show that attacks relying on shadow models can fail catastrophically on critical AI safety applications.<n>Although this problem seems intractable within the shadow model paradigm, we find that quantile regression attacks are a promising approach.
arXiv Detail & Related papers (2025-06-06T19:27:52Z)
Model Mimic Attack: Knowledge Distillation for Provably Transferable Adversarial Examples [1.1820990818670631]
This work is the first to provide provable guarantees on the success of knowledge distillation-based attack on classification neural networks. We prove that if the student model has enough learning capabilities, the attack on the teacher model is guaranteed to be found within the finite number of distillation iterations.
arXiv Detail & Related papers (2024-10-21T11:06:56Z)
Understanding the Robustness of Randomized Feature Defense Against Query-Based Adversarial Attacks [23.010308600769545]
Deep neural networks are vulnerable to adversarial examples that find samples close to the original image but can make the model misclassify. We propose a simple and lightweight defense against black-box attacks by adding random noise to hidden features at intermediate layers of the model at inference time. Our method effectively enhances the model's resilience against both score-based and decision-based black-box attacks.
arXiv Detail & Related papers (2023-10-01T03:53:23Z)
Query Efficient Cross-Dataset Transferable Black-Box Attack on Action Recognition [99.29804193431823]
Black-box adversarial attacks present a realistic threat to action recognition systems. We propose a new attack on action recognition that addresses these shortcomings by generating perturbations. Our method achieves 8% and higher 12% deception rates compared to state-of-the-art query-based and transfer-based attacks.
arXiv Detail & Related papers (2022-11-23T17:47:49Z)
Query-Efficient Black-box Adversarial Attacks Guided by a Transfer-based Prior [50.393092185611536]
We consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model. Previous methods attempted to approximate the true gradient either by using the transfer gradient of a surrogate white-box model or based on the feedback of model queries. We propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging.
arXiv Detail & Related papers (2022-03-13T04:06:27Z)
Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective substitute training that focuses on designing the distribution of data used in the knowledge stealing process. The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
Two Sides of the Same Coin: White-box and Black-box Attacks for Transfer Learning [60.784641458579124]
We show that fine-tuning effectively enhances model robustness under white-box FGSM attacks. We also propose a black-box attack method for transfer learning models which attacks the target model with the adversarial examples produced by its source model. To systematically measure the effect of both white-box and black-box attacks, we propose a new metric to evaluate how transferable are the adversarial examples produced by a source model to a target model.
arXiv Detail & Related papers (2020-08-25T15:04:32Z)
Boosting Black-Box Attack with Partially Transferred Conditional Adversarial Distribution [83.02632136860976]
We study black-box adversarial attacks against deep neural networks (DNNs) We develop a novel mechanism of adversarial transferability, which is robust to the surrogate biases. Experiments on benchmark datasets and attacking against real-world API demonstrate the superior attack performance of the proposed method.
arXiv Detail & Related papers (2020-06-15T16:45:27Z)
Luring of transferable adversarial perturbations in the black-box paradigm [0.0]
We present a new approach to improve the robustness of a model against black-box transfer attacks. A removable additional neural network is included in the target model, and is designed to induce the textitluring effect. Our deception-based method only needs to have access to the predictions of the target model and does not require a labeled data set.
arXiv Detail & Related papers (2020-04-10T06:48:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.