Indicators of Attack Failure: Debugging and Improving Optimization of
Adversarial Examples
- URL: http://arxiv.org/abs/2106.09947v1
- Date: Fri, 18 Jun 2021 06:57:58 GMT
- Title: Indicators of Attack Failure: Debugging and Improving Optimization of
Adversarial Examples
- Authors: Maura Pintor, Luca Demetrio, Angelo Sotgiu, Giovanni Manca, Ambra
Demontis, Nicholas Carlini, Battista Biggio, Fabio Roli
- Abstract summary: Evaluating the robustness of machine-learning models to adversarial examples is a challenging problem.
We define a set of quantitative indicators which unveil common failures in the optimization of gradient-based attacks.
Our experimental analysis shows that the proposed indicators of failure can be used to visualize, debug and improve current adversarial robustness evaluations.
- Score: 29.385242714424624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Evaluating robustness of machine-learning models to adversarial examples is a
challenging problem. Many defenses have been shown to provide a false sense of
security by causing gradient-based attacks to fail, and they have been broken
under more rigorous evaluations. Although guidelines and best practices have
been suggested to improve current adversarial robustness evaluations, the lack
of automatic testing and debugging tools makes it difficult to apply these
recommendations in a systematic manner. In this work, we overcome these
limitations by (i) defining a set of quantitative indicators which unveil
common failures in the optimization of gradient-based attacks, and (ii)
proposing specific mitigation strategies within a systematic evaluation
protocol. Our extensive experimental analysis shows that the proposed
indicators of failure can be used to visualize, debug and improve current
adversarial robustness evaluations, providing a first concrete step towards
automating and systematizing them. Our
open-source code is available at:
https://github.com/pralab/IndicatorsOfAttackFailure.
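As a rough illustration of what such failure indicators can look like in practice (a minimal sketch, not the API of the linked repository), the snippet below runs a plain PGD attack while recording the loss curve and gradient norms, then flags two common failure patterns: a loss that never improves and gradients that are numerically zero, a typical symptom of gradient masking. In the paper's protocol each indicator is paired with a specific mitigation; this sketch only shows the detection side.

    import torch
    import torch.nn.functional as F

    def pgd_with_diagnostics(model, x, y, eps=0.3, step=0.05, iters=40):
        # Basic untargeted PGD that also records the loss curve and gradient norms.
        x_adv = x.clone().detach().requires_grad_(True)
        losses, grad_norms = [], []
        for _ in range(iters):
            loss = F.cross_entropy(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            losses.append(loss.item())
            grad_norms.append(grad.norm().item())
            with torch.no_grad():
                x_adv += step * grad.sign()                               # untargeted: ascend the loss
                x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)    # project into the eps-ball
            x_adv.requires_grad_(True)
        return x_adv.detach(), losses, grad_norms

    def failure_indicators(losses, grad_norms, tol=1e-12):
        # Two simplified failure flags in the spirit of the paper's indicators.
        return {
            # the loss never increased: the attack made no progress at all
            "no_progress": max(losses) <= losses[0] + 1e-6,
            # all gradients numerically zero: a classic symptom of gradient masking
            "zero_gradients": all(g < tol for g in grad_norms),
        }

    if __name__ == "__main__":
        torch.manual_seed(0)
        model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(784, 10))
        x, y = torch.rand(8, 1, 28, 28), torch.randint(0, 10, (8,))
        _, losses, grads = pgd_with_diagnostics(model, x, y)
        print(failure_indicators(losses, grads))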
Related papers
- A practical approach to evaluating the adversarial distance for machine learning classifiers [2.2120851074630177]
This paper investigates the estimation of the more informative adversarial distance using iterative adversarial attacks and a certification approach.
We find that our adversarial attack approach is effective compared to related implementations, while the certification method falls short of expectations.
arXiv Detail & Related papers (2024-09-05T14:57:01Z)
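To make the idea concrete, here is a hedged sketch (not the paper's code) of how an iterative attack yields an upper bound on the adversarial distance: binary-search the smallest perturbation budget at which the attack still flips the prediction. The attack_succeeds callable is a hypothetical wrapper, e.g. around the PGD routine sketched above; a certification method would complement this empirical upper bound with a lower bound.

    def adversarial_distance_upper_bound(attack_succeeds, eps_max=1.0, steps=20, tol=1e-3):
        # attack_succeeds(eps) -> bool: whether an iterative attack finds an adversarial
        # example within an eps-sized perturbation budget (hypothetical helper).
        lo, hi = 0.0, eps_max
        if not attack_succeeds(hi):
            return float("inf")      # no adversarial example found even at the largest budget
        for _ in range(steps):
            mid = (lo + hi) / 2
            if attack_succeeds(mid):
                hi = mid             # success: the empirical distance is at most mid
            else:
                lo = mid             # failure below mid (the attack may simply be suboptimal)
            if hi - lo < tol:
                break
        return hi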
- A Survey and Evaluation of Adversarial Attacks for Object Detection [11.48212060875543]
Deep learning models excel in various computer vision tasks but are susceptible to adversarial examples: subtle perturbations in input data that lead to incorrect predictions.
This vulnerability poses significant risks in safety-critical applications such as autonomous vehicles, security surveillance, and aircraft health monitoring.
arXiv Detail & Related papers (2024-08-04T05:22:08Z)
- STBA: Towards Evaluating the Robustness of DNNs for Query-Limited Black-box Scenario [50.37501379058119]
We propose the Spatial Transform Black-box Attack (STBA) to craft formidable adversarial examples in the query-limited scenario.
We show that STBA could effectively improve the imperceptibility of the adversarial examples and remarkably boost the attack success rate under query-limited settings.
arXiv Detail & Related papers (2024-03-30T13:28:53Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
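The formal definition of the Adversarial Rate is not given in the summary above; as a loose, assumed interpretation only, the sketch below measures the fraction of sampled states for which some perturbation within a budget changes the policy's chosen action, with a brute-force random search standing in for the formal verification used in the paper. The policy argument is a hypothetical callable mapping a state to a discrete action.

    import numpy as np

    def random_search_flip(policy, state, eps, trials=200, rng=np.random.default_rng(0)):
        # Brute-force stand-in for formal verification: sample perturbations in the
        # eps-ball and check whether any of them changes the selected action.
        base_action = policy(state)
        for _ in range(trials):
            delta = rng.uniform(-eps, eps, size=np.shape(state))
            if policy(state + delta) != base_action:
                return True
        return False

    def adversarial_rate(policy, states, eps=0.05):
        # Fraction of sampled states that admit an action-changing perturbation.
        flipped = sum(random_search_flip(policy, s, eps) for s in states)
        return flipped / max(len(states), 1)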
- Towards Evaluating Transfer-based Attacks Systematically, Practically, and Fairly [79.07074710460012]
The adversarial vulnerability of deep neural networks (DNNs) has drawn great attention.
An increasing number of transfer-based methods have been developed to fool black-box DNN models.
We establish a transfer-based attack benchmark (TA-Bench) which implements 30+ methods.
arXiv Detail & Related papers (2023-11-02T15:35:58Z)
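A transfer-based attack benchmark essentially automates one evaluation loop: craft adversarial examples on a white-box surrogate model and measure how often they fool each held-out black-box target. The sketch below illustrates that loop only; the names and structure are assumptions, not TA-Bench's actual API.

    import torch

    @torch.no_grad()
    def transfer_success_rate(target_model, x_adv, y_true):
        # Fraction of adversarial examples misclassified by a black-box target model.
        preds = target_model(x_adv).argmax(dim=1)
        return (preds != y_true).float().mean().item()

    def evaluate_transfer(attack, surrogate, targets, loader):
        # attack(surrogate, x, y) -> x_adv is an assumed callable (e.g. a PGD variant);
        # targets maps model names to held-out black-box models.
        rates = {name: [] for name in targets}
        for x, y in loader:
            x_adv = attack(surrogate, x, y)
            for name, model in targets.items():
                rates[name].append(transfer_success_rate(model, x_adv, y))
        return {name: sum(r) / len(r) for name, r in rates.items()}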
- From Adversarial Arms Race to Model-centric Evaluation: Motivating a Unified Automatic Robustness Evaluation Framework [91.94389491920309]
Textual adversarial attacks can discover models' weaknesses by adding semantic-preserving but misleading perturbations to the inputs.
The existing practice of robustness evaluation may exhibit issues of incomprehensive evaluation, impractical evaluation protocol, and invalid adversarial samples.
We set up a unified automatic robustness evaluation framework, shifting towards model-centric evaluation to exploit the advantages of adversarial attacks.
arXiv Detail & Related papers (2023-05-29T14:55:20Z)
- MEAD: A Multi-Armed Approach for Evaluation of Adversarial Examples Detectors [24.296350262025552]
We propose a novel framework, called MEAD, for evaluating detectors based on several attack strategies.
Among them, we make use of three new objectives to generate attacks.
The proposed performance metric is based on the worst-case scenario.
arXiv Detail & Related papers (2022-06-30T17:05:45Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns an optimizer for adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
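As a highly simplified sketch of the general idea of a learned attack optimizer (not the MAMA implementation, and with the meta-training over defenses omitted), a small recurrent network can map the current input gradient to an update direction, replacing the fixed sign(grad) rule of PGD. For an input of shape (batch, 1, 28, 28) one would construct LearnedOptimizer(dim=784).

    import torch
    import torch.nn.functional as F

    class LearnedOptimizer(torch.nn.Module):
        # Tiny recurrent network mapping the current input gradient to an update direction.
        def __init__(self, dim, hidden=32):
            super().__init__()
            self.rnn = torch.nn.GRUCell(dim, hidden)
            self.out = torch.nn.Linear(hidden, dim)

        def forward(self, grad_flat, h):
            h = self.rnn(grad_flat, h)
            return torch.tanh(self.out(h)), h   # bounded update direction

    def learned_attack(model, opt_net, x, y, eps=0.3, step=0.05, iters=10):
        x_adv = x.clone().detach().requires_grad_(True)
        h = torch.zeros(x.size(0), opt_net.rnn.hidden_size)
        for _ in range(iters):
            loss = F.cross_entropy(model(x_adv), y)
            grad, = torch.autograd.grad(loss, x_adv)
            direction, h = opt_net(grad.flatten(1), h)
            with torch.no_grad():
                x_adv += step * direction.view_as(x_adv)     # learned step replaces sign(grad)
                x_adv = x + (x_adv - x).clamp(-eps, eps)     # keep the perturbation in the eps-ball
            x_adv.requires_grad_(True)
        return x_adv.detach()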
- Balancing detectability and performance of attacks on the control channel of Markov Decision Processes [77.66954176188426]
We investigate the problem of designing optimal stealthy poisoning attacks on the control channel of Markov decision processes (MDPs).
This research is motivated by the research community's recent interest in adversarial and poisoning attacks applied to MDPs and reinforcement learning (RL) methods.
arXiv Detail & Related papers (2021-09-15T09:13:10Z)
- Unknown Presentation Attack Detection against Rational Attackers [6.351869353952288]
Presentation attack detection and multimedia forensics are still vulnerable to attacks in real-life settings.
Some of the challenges for existing solutions are the detection of unknown attacks, the ability to perform in adversarial settings, few-shot learning, and explainability.
A new optimization criterion is proposed and a set of requirements is defined for improving the performance of these systems in real-life settings.
arXiv Detail & Related papers (2020-10-04T14:37:10Z)