Related papers: AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples

AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples

URL: http://arxiv.org/abs/2404.19460v1
Date: Tue, 30 Apr 2024 11:19:05 GMT
Title: AttackBench: Evaluating Gradient-based Attacks for Adversarial Examples
Authors: Antonio Emanuele Cinà, Jérôme Rony, Maura Pintor, Luca Demetrio, Ambra Demontis, Battista Biggio, Ismail Ben Ayed, Fabio Roli,
Abstract summary: Adrial examples are typically optimized with gradient-based attacks. Each is shown to outperform its predecessors using different experimental setups. This provides overly-optimistic and even biased evaluations.
Score: 26.37278338032268
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adversarial examples are typically optimized with gradient-based attacks. While novel attacks are continuously proposed, each is shown to outperform its predecessors using different experimental setups, hyperparameter settings, and number of forward and backward calls to the target models. This provides overly-optimistic and even biased evaluations that may unfairly favor one particular attack over the others. In this work, we aim to overcome these limitations by proposing AttackBench, i.e., the first evaluation framework that enables a fair comparison among different attacks. To this end, we first propose a categorization of gradient-based attacks, identifying their main components and differences. We then introduce our framework, which evaluates their effectiveness and efficiency. We measure these characteristics by (i) defining an optimality metric that quantifies how close an attack is to the optimal solution, and (ii) limiting the number of forward and backward queries to the model, such that all attacks are compared within a given maximum query budget. Our extensive experimental analysis compares more than 100 attack implementations with a total of over 800 different configurations against CIFAR-10 and ImageNet models, highlighting that only very few attacks outperform all the competing approaches. Within this analysis, we shed light on several implementation issues that prevent many attacks from finding better solutions or running at all. We release AttackBench as a publicly available benchmark, aiming to continuously update it to include and evaluate novel gradient-based attacks for optimizing adversarial examples.

Related papers

Can Targeted Clean-Label Poisoning Attacks Generalize? [11.499065606209925]
We study whether targeted poisoning attacks can generalize to unknown variations of those targets. In particular, we explore diverse target variations, such as an object with varied viewpoints and an animal species with distinct appearances. Our method outperforms the cosine similarity-based attack by 20.95% in attack success rate with similar overall accuracy.
arXiv Detail & Related papers (2024-12-05T06:27:14Z)
Adversarial Attack Based on Prediction-Correction [8.467466998915018]
Deep neural networks (DNNs) are vulnerable to adversarial examples obtained by adding small perturbations to original examples. In this paper, a new prediction-correction (PC) based adversarial attack is proposed. In our proposed PC-based attack, some existing attack can be selected to produce a predicted example first, and then the predicted example and the current example are combined together to determine the added perturbations.
arXiv Detail & Related papers (2023-06-02T03:11:32Z)
GLOW: Global Layout Aware Attacks for Object Detection [27.46902978168904]
Adversarial attacks aim to perturb images such that a predictor outputs incorrect results. We present first approach that copes with various attack requests by generating global layout-aware adversarial attacks. In experiment, we design multiple types of attack requests and validate our ideas on MS validation set.
arXiv Detail & Related papers (2023-02-27T22:01:34Z)
MultiRobustBench: Benchmarking Robustness Against Multiple Attacks [86.70417016955459]
We present the first unified framework for considering multiple attacks against machine learning (ML) models. Our framework is able to model different levels of learner's knowledge about the test-time adversary. We evaluate the performance of 16 defended models for robustness against a set of 9 different attack types.
arXiv Detail & Related papers (2023-02-21T20:26:39Z)
Alternating Objectives Generates Stronger PGD-Based Adversarial Attacks [78.2700757742992]
Projected Gradient Descent (PGD) is one of the most effective and conceptually simple algorithms to generate such adversaries. We experimentally verify this assertion on a synthetic-data example and by evaluating our proposed method across 25 different $ell_infty$-robust models and 3 datasets. Our strongest adversarial attack outperforms all of the white-box components of AutoAttack ensemble.
arXiv Detail & Related papers (2022-12-15T17:44:31Z)
Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage. Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack. We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA)
arXiv Detail & Related papers (2022-07-25T03:24:58Z)
Query-Efficient and Scalable Black-Box Adversarial Attacks on Discrete Sequential Data via Bayesian Optimization [10.246596695310176]
We focus on the problem of adversarial attacks against models on discrete sequential data in the black-box setting. We propose a query-efficient black-box attack using Bayesian optimization, which dynamically computes important positions. We develop a post-optimization algorithm that finds adversarial examples with smaller perturbation size.
arXiv Detail & Related papers (2022-06-17T06:11:36Z)
Practical Evaluation of Adversarial Robustness via Adaptive Auto Attack [96.50202709922698]
A practical evaluation method should be convenient (i.e., parameter-free), efficient (i.e., fewer iterations) and reliable. We propose a parameter-free Adaptive Auto Attack (A$3$) evaluation method which addresses the efficiency and reliability in a test-time-training fashion.
arXiv Detail & Related papers (2022-03-10T04:53:54Z)
Stochastic Variance Reduced Ensemble Adversarial Attack for Boosting the Adversarial Transferability [20.255708227671573]
Black-box adversarial attacks can be transferred from one model to another. In this work, we propose a novel ensemble attack method called the variance reduced ensemble attack. Empirical results on the standard ImageNet demonstrate that the proposed method could boost the adversarial transferability and outperforms existing ensemble attacks significantly.
arXiv Detail & Related papers (2021-11-21T06:33:27Z)
Towards A Conceptually Simple Defensive Approach for Few-shot classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks. We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering. Our evaluation on the miniImagenet (MI) and CUB datasets exhibit good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function. We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
arXiv Detail & Related papers (2020-03-03T18:15:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.