Towards Good Practices in Evaluating Transfer Adversarial Attacks
- URL: http://arxiv.org/abs/2211.09565v3
- Date: Sat, 28 Oct 2023 04:26:57 GMT
- Title: Towards Good Practices in Evaluating Transfer Adversarial Attacks
- Authors: Zhengyu Zhao, Hanwei Zhang, Renjue Li, Ronan Sicre, Laurent Amsaleg,
Michael Backes
- Abstract summary: We present the first comprehensive evaluation of transfer attacks, covering 23 representative attacks against 9 defenses on ImageNet.
In particular, we propose to categorize existing attacks into five categories, which enables our systematic category-wise analyses.
We also pay particular attention to stealthiness, by adopting diverse imperceptibility metrics and looking into new, finer-grained characteristics.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Transfer adversarial attacks raise critical security concerns in real-world,
black-box scenarios. However, the actual progress of this field is difficult to
assess due to two common limitations in existing evaluations. First, different
methods are often not systematically and fairly evaluated in a one-to-one
comparison. Second, only transferability is evaluated but another key attack
property, stealthiness, is largely overlooked. In this work, we design good
practices to address these limitations, and we present the first comprehensive
evaluation of transfer attacks, covering 23 representative attacks against 9
defenses on ImageNet. In particular, we propose to categorize existing attacks
into five categories, which enables our systematic category-wise analyses.
These analyses lead to new findings that even challenge existing knowledge and
also help determine the optimal attack hyperparameters for our attack-wise
comprehensive evaluation. We also pay particular attention to stealthiness, by
adopting diverse imperceptibility metrics and looking into new, finer-grained
characteristics. Overall, our new insights into transferability and
stealthiness lead to actionable good practices for future evaluations.
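The evaluation the abstract describes pairs transferability (does a perturbation crafted on a white-box surrogate fool a separate black-box target?) with stealthiness (an imperceptibility metric on the perturbation). A minimal sketch of that pipeline, using toy linear classifiers and a one-step L-infinity attack; all models, dimensions, and the epsilon value are illustrative, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "models" (logits = x @ W) standing in for the white-box
# surrogate and the black-box target in a transfer attack.
W_surrogate = rng.normal(size=(16, 3))
W_target = W_surrogate + 0.1 * rng.normal(size=(16, 3))  # correlated target

def fgsm(x, label, W, eps):
    """One-step L-infinity attack on a linear softmax classifier."""
    logits = x @ W
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    grad_logits = probs.copy()
    grad_logits[label] -= 1.0        # d(cross-entropy)/d(logits)
    grad_x = W @ grad_logits         # chain rule back to the input
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

x = rng.uniform(size=16)
label = int(np.argmax(x @ W_surrogate))
x_adv = fgsm(x, label, W_surrogate, eps=0.1)

# Transferability: does the perturbation crafted on the surrogate also
# change the target model's prediction?
transfers = int(np.argmax(x_adv @ W_target)) != int(np.argmax(x @ W_target))

# Stealthiness: report an imperceptibility metric, here the L-infinity norm.
linf = float(np.abs(x_adv - x).max())
print(transfers, round(linf, 3))
```

A full evaluation in the paper's spirit would repeat this over many images and attacks, reporting transfer success rates alongside several imperceptibility metrics rather than a single norm.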
Related papers
- AttackEval: How to Evaluate the Effectiveness of Jailbreak Attacking on Large Language Models [28.722683266039763]
We pioneer a novel approach to evaluating the effectiveness of jailbreak attacks on Large Language Models (LLMs).
Our study introduces two distinct evaluation frameworks: a coarse-grained evaluation and a fine-grained evaluation.
We have developed a comprehensive ground truth dataset specifically tailored for jailbreak tasks.
arXiv Detail & Related papers (2024-01-17T06:42:44Z)
- Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly [79.07074710460012]
The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z) - Revisiting Transferable Adversarial Image Examples: Attack
Categorization, Evaluation Guidelines, and New Insights [30.14129637790446]
Transferable adversarial examples raise critical security concerns in real-world, black-box attack scenarios.
In this work, we identify two main problems in common evaluation practices.
We provide the first large-scale evaluation of transferable adversarial examples on ImageNet.
arXiv Detail & Related papers (2023-10-18T10:06:42Z)
- Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey [114.17568992164303]
Adversarial attacks and defenses in machine learning and deep neural networks have been gaining significant attention.
This survey provides a comprehensive overview of the recent advancements in the field of adversarial attack and defense techniques.
New avenues of attack are also explored, including search-based, decision-based, drop-based, and physical-world attacks.
arXiv Detail & Related papers (2023-03-11T04:19:31Z)
- Deep-Attack over the Deep Reinforcement Learning [26.272161868927004]
Adversarial attack developments have made reinforcement learning more vulnerable.
We propose a reinforcement learning-based attacking framework that considers effectiveness and stealthiness simultaneously.
We also propose a new metric to evaluate the performance of the attack model in these two aspects.
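The summary does not give the metric's definition. A generic way to fold the two aspects into one score, purely as an illustration of the idea and not the paper's formula, might look like:

```python
def attack_score(success_rate, perturbation_norm, norm_budget, alpha=0.5):
    """Hypothetical combined metric: trade off effectiveness (attack success
    rate in [0, 1]) against stealthiness (fraction of the perturbation budget
    left unused). Not the paper's definition; an illustration only."""
    stealth = max(0.0, 1.0 - perturbation_norm / norm_budget)
    return alpha * success_rate + (1.0 - alpha) * stealth

# An attack that succeeds 80% of the time using 40% of its budget:
print(attack_score(0.8, 0.02, 0.05))  # 0.7
```

Weighting `alpha` toward 1.0 rewards raw effectiveness; toward 0.0 it rewards subtle perturbations.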
arXiv Detail & Related papers (2022-05-02T10:58:19Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
- Random Projections for Adversarial Attack Detection [8.684378639046644]
Adversarial attack detection remains a fundamentally challenging problem from two perspectives.
We present a technique that makes use of special properties of random projections, whereby we can characterize the behavior of clean and adversarial examples.
Performance evaluation demonstrates that our technique outperforms ($>0.92$ AUC) competing state of the art (SOTA) attack strategies.
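A sketch of the random-projection idea, assuming the detector compares simple statistics of inputs under a bank of random low-dimensional projections; the dimensions, signature function, and perturbation below are illustrative, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(1)
d, k = 64, 8

# A bank of random projection matrices; the statistics of inputs in these
# low-dimensional views are used to characterize clean vs. adversarial behavior.
projections = [rng.normal(size=(d, k)) / np.sqrt(k) for _ in range(10)]

def projection_signature(x):
    """Mean absolute activation of x in each random low-dimensional view."""
    return np.array([np.abs(x @ P).mean() for P in projections])

clean = rng.normal(size=d)
adv = clean + 0.5 * np.sign(rng.normal(size=d))   # toy L-infinity perturbation

sig_clean = projection_signature(clean)
sig_adv = projection_signature(adv)

# A real detector would threshold a distance between such signatures (or feed
# them to a classifier) and report AUC over many clean/adversarial pairs.
score = float(np.linalg.norm(sig_adv - sig_clean))
print(score > 0.0)
```

The appeal of random projections is that they are model-agnostic and cheap: no gradients through the defended network are needed to compute the signatures.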
arXiv Detail & Related papers (2020-12-11T15:02:28Z)
- Guided Adversarial Attack for Evaluating and Enhancing Adversarial Defenses [59.58128343334556]
We introduce a relaxation term to the standard loss that finds more suitable gradient directions, increases attack efficacy, and leads to more efficient adversarial training.
We propose Guided Adversarial Margin Attack (GAMA), which utilizes function mapping of the clean image to guide the generation of adversaries.
We also propose Guided Adversarial Training (GAT), which achieves state-of-the-art performance amongst single-step defenses.
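The structure of such a guided objective can be sketched as a margin term plus a term measuring how far the adversarial output has drifted from the clean mapping. The signs, weighting, and schedule of `lam` below are illustrative only, not GAMA's exact formulation:

```python
import numpy as np

def gama_like_loss(logits_adv, logits_clean, label, lam=0.5):
    """Hypothetical sketch of a guided margin objective: a margin term that
    rewards misclassification plus a guidance term tied to the clean image's
    function mapping. Not the paper's exact objective."""
    others = np.delete(logits_adv, label)
    margin = float(others.max() - logits_adv[label])   # > 0 once misclassified
    guidance = float(((logits_adv - logits_clean) ** 2).sum())
    return margin + lam * guidance                     # the attack maximizes this

print(gama_like_loss(np.array([1.0, 2.0, 0.0]),
                     np.array([2.0, 1.0, 0.0]), label=0))  # 2.0
```

The guidance term supplies useful gradient signal even in flat regions where the margin term alone saturates, which is the motivation the abstract hints at.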
arXiv Detail & Related papers (2020-11-30T16:39:39Z)
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD attack that overcome failures caused by suboptimal step sizes and problems with the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
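The flavor of the step-size fix can be shown on a toy 1-D objective: shrink the step whenever the attack loss stops improving, and keep the best iterate seen. This is a simplified illustration, not the paper's actual schedule:

```python
import numpy as np

def loss(x):
    # Toy concave objective standing in for the attack loss to be maximized;
    # its optimum is at x = 2.
    return -(x - 2.0) ** 2

def pgd_adaptive(x0, eps, steps=50, step=0.5):
    """Projected gradient ascent with a halving step size (toy sketch)."""
    x, best, best_loss = x0, x0, loss(x0)
    for _ in range(steps):
        grad = -2.0 * (x - 2.0)                        # analytic toy gradient
        x = np.clip(x + step * np.sign(grad), x0 - eps, x0 + eps)
        if loss(x) <= best_loss:
            step *= 0.5                                # no progress: shrink step
        else:
            best, best_loss = x, loss(x)               # track the best iterate
    return best

print(round(float(pgd_adaptive(0.0, eps=3.0)), 2))  # 2.0
```

A fixed large step would oscillate around the optimum; adapting it lets the same budget of iterations converge, which is the failure mode the parameter-free ensemble is designed to avoid.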
arXiv Detail & Related papers (2020-03-03T18:15:55Z)
- On Adaptive Attacks to Adversarial Example Defenses [123.32678153377915]
This paper lays out the methodology and the approach necessary to perform an adaptive attack against defenses to adversarial examples.
We hope that these analyses will serve as guidance on how to properly perform adaptive attacks against defenses to adversarial examples.
arXiv Detail & Related papers (2020-02-19T18:50:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.