On the Efficacy of Metrics to Describe Adversarial Attacks
- URL: http://arxiv.org/abs/2301.13028v1
- Date: Mon, 30 Jan 2023 16:15:40 GMT
- Title: On the Efficacy of Metrics to Describe Adversarial Attacks
- Authors: Tommaso Puccetti, Tommaso Zoppi, Andrea Ceccarelli
- Abstract summary: Adversarial defenses are naturally evaluated on their ability to tolerate adversarial attacks.
To test defenses, diverse adversarial attacks are crafted, usually described in terms of their evading capability and the L0, L1, L2, and Linf norms.
We question whether the evading capability and L-norms are the most effective information to claim that defenses have been tested against a representative attack set.
- Score: 3.867363075280544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial defenses are naturally evaluated on their ability to tolerate
adversarial attacks. To test defenses, diverse adversarial attacks are crafted,
usually described in terms of their evading capability and the L0, L1, L2, and
Linf norms. We question whether the evading capability and L-norms are the most
effective information to claim that defenses have been tested against a
representative attack set. To this end, we select image quality metrics from the
state of the art and search for correlations between image perturbation and
detectability. We observe that computing L-norms alone is rarely the preferable
solution. We observe a strong correlation between the identified metrics
computed on an adversarial image and the output of a detector on that image, to
the extent that they can predict the response of a detector with approximately
0.94 accuracy. Further, we observe that the metrics can group attacks with
similar perturbations and similar detectability. This suggests a possible
revision of the approach used to evaluate detectors, in which additional metrics
are included to ensure that a representative attack dataset is selected.
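As a rough, hedged illustration of the kind of analysis the abstract describes, the sketch below computes the L0, L1, L2, and Linf norms of a perturbation together with two simple image quality metrics (MSE and PSNR), then fits an off-the-shelf logistic-regression model to check how well such metrics predict a detector's binary response. This is a minimal sketch under stated assumptions, not the paper's code: images are NumPy arrays in [0, 1], and both the attacked images and the detector labels are synthetic placeholders rather than the detectors, attacks, and quality metrics actually evaluated in the paper.

```python
# Minimal sketch (not the paper's code): describe each adversarial example by
# L-norms of its perturbation plus simple image quality metrics, then check how
# well those metrics predict a detector's verdict. Images are assumed to be
# NumPy arrays in [0, 1]; images and detector labels below are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def attack_metrics(clean, adv):
    """Return [L0, L1, L2, Linf, MSE, PSNR] for one clean/adversarial pair."""
    delta = (adv - clean).ravel()
    l0 = np.count_nonzero(delta)
    l1 = np.abs(delta).sum()
    l2 = np.linalg.norm(delta)
    linf = np.abs(delta).max()
    mse = np.mean(delta ** 2)
    psnr = 10.0 * np.log10(1.0 / mse) if mse > 0 else np.inf  # data range = 1.0
    return np.array([l0, l1, l2, linf, mse, psnr])

# Synthetic stand-ins for a real set of clean/adversarial pairs and a detector.
rng = np.random.default_rng(0)
n, shape = 500, (32, 32, 3)
clean = rng.random((n, *shape))
adv = np.clip(clean + rng.normal(0.0, 0.03, (n, *shape)), 0.0, 1.0)
detector_flag = rng.integers(0, 2, n)  # placeholder for a real detector's output

X = np.stack([attack_metrics(c, a) for c, a in zip(clean, adv)])
X_tr, X_te, y_tr, y_te = train_test_split(X, detector_flag, random_state=0)

# The paper reports ~0.94 accuracy when predicting a detector's response from
# such metrics; with random placeholder labels this score is meaningless.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
clf.fit(X_tr, y_tr)
print("held-out accuracy from metrics alone:", clf.score(X_te, y_te))
```

In practice, the placeholder arrays would be replaced by the clean/adversarial image pairs of the attack set under study and by the verdicts of the detector being evaluated.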
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with a >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
arXiv Detail & Related papers (2024-08-04T09:53:50Z)
- Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors [0.0]
An adaptive attack is one where the attacker is aware of the defenses and adapts their strategy accordingly.
Our proposed method leverages adversarial training to reinforce the ability to detect attacks, without compromising clean accuracy.
Experimental evaluations on the CIFAR-10 and SVHN datasets demonstrate that our proposed algorithm significantly improves a detector's ability to accurately identify adaptive adversarial attacks.
arXiv Detail & Related papers (2024-04-18T12:13:09Z)
- Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z)
- Identifying Adversarially Attackable and Robust Samples [1.4213973379473654]
Adversarial attacks insert small, imperceptible perturbations into input samples that cause large, undesired changes to the output of deep learning models.
This work introduces the notion of sample attackability, where we aim to identify samples that are most susceptible to adversarial attacks.
We propose a deep-learning-based detector to identify the adversarially attackable and robust samples in an unseen dataset for an unseen target model.
arXiv Detail & Related papers (2023-01-30T13:58:14Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against adversarial samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Illusory Attacks: Information-Theoretic Detectability Matters in Adversarial Attacks [76.35478518372692]
We introduce epsilon-illusory, a novel form of adversarial attack on sequential decision-makers.
Compared to existing attacks, we empirically find epsilon-illusory to be significantly harder to detect with automated methods.
Our findings suggest the need for better anomaly detectors, as well as effective hardware- and system-level defenses.
arXiv Detail & Related papers (2022-07-20T19:49:09Z)
- Attack-Agnostic Adversarial Detection [13.268960384729088]
We quantify the statistical deviation caused by adversarial perturbations in two aspects.
We show that our method can achieve an overall ROC AUC of 94.9%, 89.7%, and 94.6% on CIFAR10, CIFAR100, and SVHN, respectively, and has comparable performance to adversarial detectors trained with adversarial examples on most of the attacks.
arXiv Detail & Related papers (2022-06-01T13:41:40Z)
- On Trace of PGD-Like Adversarial Attacks [77.75152218980605]
Adversarial attacks pose safety and security concerns for deep learning applications.
We construct Adversarial Response Characteristics (ARC) features to reflect the model's gradient consistency.
Our method is intuitive, lightweight, non-intrusive, and data-undemanding.
arXiv Detail & Related papers (2022-05-19T14:26:50Z)
- Zero-Query Transfer Attacks on Context-Aware Object Detectors [95.18656036716972]
Adversarial attacks perturb images such that a deep neural network produces incorrect classification results.
A promising approach to defend against adversarial attacks on natural multi-object scenes is to impose a context-consistency check.
We present the first approach for generating context-consistent adversarial attacks that can evade the context-consistency check.
arXiv Detail & Related papers (2022-03-29T04:33:06Z)
- Adversarial Detection and Correction by Matching Prediction Distributions [0.0]
The detector almost completely neutralises powerful attacks like Carlini-Wagner or SLIDE on MNIST and Fashion-MNIST.
We show that our method is still able to detect the adversarial examples in the case of a white-box attack where the attacker has full knowledge of both the model and the defence.
arXiv Detail & Related papers (2020-02-21T15:45:42Z)
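As a loose illustration of what "matching prediction distributions" in the last entry means, the sketch below scores an input by the KL divergence between the model's class probabilities on the input and on a reconstructed version of it, and flags the input when that divergence is large. The `predict_proba` and `reconstruct` callables are hypothetical placeholders (a classifier's softmax output and an autoencoder-style reconstruction), and the simple thresholding shown here is a generic reading of the idea, not that paper's training scheme or its correction step.

```python
# Minimal sketch of detection by comparing prediction distributions (generic
# illustration, not the referenced paper's implementation). `predict_proba` and
# `reconstruct` are hypothetical placeholders for a classifier's softmax output
# and an autoencoder-style reconstruction of the input.
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete probability distributions."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def adversarial_score(x, predict_proba, reconstruct):
    """Higher score means the predictions on x and on its reconstruction disagree more."""
    p_original = predict_proba(x)
    p_reconstructed = predict_proba(reconstruct(x))
    return kl_divergence(p_original, p_reconstructed)

def is_adversarial(x, predict_proba, reconstruct, threshold):
    # The threshold would be calibrated on clean validation data, e.g. as a
    # high percentile of the scores observed on known-clean inputs.
    return adversarial_score(x, predict_proba, reconstruct) > threshold
```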