Attack-Agnostic Adversarial Detection
- URL: http://arxiv.org/abs/2206.00489v1
- Date: Wed, 1 Jun 2022 13:41:40 GMT
- Title: Attack-Agnostic Adversarial Detection
- Authors: Jiaxin Cheng, Mohamed Hussein, Jay Billa and Wael AbdAlmageed
- Abstract summary: We quantify the statistical deviation caused by adversarial perturbations in two aspects.
We show that our method can achieve an overall ROC AUC of 94.9%, 89.7%, and 94.6% on CIFAR10, CIFAR100, and SVHN, respectively, and has comparable performance to adversarial detectors trained with adversarial examples on most of the attacks.
- Score: 13.268960384729088
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The growing number of adversarial attacks in recent years gives attackers an
advantage over defenders, as defenders must train detectors after knowing the
types of attacks, and many models need to be maintained to ensure good
performance in detecting any upcoming attacks. We propose a way to end the
tug-of-war between attackers and defenders by treating adversarial attack
detection as an anomaly detection problem so that the detector is agnostic to
the attack. We quantify the statistical deviation caused by adversarial
perturbations in two aspects. The Least Significant Component Feature (LSCF)
quantifies the deviation of adversarial examples from the statistics of benign
samples and Hessian Feature (HF) reflects how adversarial examples distort the
landscape of the model's optima by measuring the local loss curvature.
Empirical results show that our method can achieve an overall ROC AUC of 94.9%,
89.7%, and 94.6% on CIFAR10, CIFAR100, and SVHN, respectively, and has
comparable performance to adversarial detectors trained with adversarial
examples on most of the attacks.
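
The two ideas named in the abstract, deviation from benign statistics along low-variance directions and local loss curvature, can be illustrated with a toy sketch. This is not the authors' implementation: this summary does not give the actual definitions of LSCF and HF, so the function names, the PCA-style scoring, the finite-difference curvature estimate, and the synthetic data below are all assumptions chosen for illustration.

```python
import numpy as np


def fit_benign_statistics(benign_features, num_components):
    """Return the mean, the lowest-variance principal directions, and their std devs."""
    mean = benign_features.mean(axis=0)
    centered = benign_features - mean
    covariance = centered.T @ centered / (len(benign_features) - 1)
    # eigh returns eigenvalues in ascending order, so the first columns
    # are the least significant principal directions.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    directions = eigenvectors[:, :num_components]
    scales = np.sqrt(np.maximum(eigenvalues[:num_components], 1e-12))
    return mean, directions, scales


def least_significant_score(features, mean, directions, scales):
    """Norm of each sample's standardized projection onto the low-variance directions."""
    projected = (features - mean) @ directions
    return np.linalg.norm(projected / scales, axis=1)


def directional_curvature(loss_fn, x, direction, step=1e-3):
    """Central finite-difference estimate of the loss's second derivative along a direction."""
    v = direction / np.linalg.norm(direction)
    return (loss_fn(x + step * v) - 2.0 * loss_fn(x) + loss_fn(x - step * v)) / step**2


# Synthetic stand-in for benign network features (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
true_scales = np.array([5.0, 5.0, 5.0, 5.0, 1.0, 1.0, 0.1, 0.1])
benign = rng.normal(size=(2000, 8)) * true_scales
# Small isotropic noise stands in for an adversarial perturbation (an assumption).
perturbed = benign[:200] + 0.5 * rng.normal(size=(200, 8))

mean, directions, scales = fit_benign_statistics(benign, num_components=2)
benign_scores = least_significant_score(benign, mean, directions, scales)
perturbed_scores = least_significant_score(perturbed, mean, directions, scales)
print("mean benign score:", benign_scores.mean())
print("mean perturbed score:", perturbed_scores.mean())

# Toy quadratic loss; its second derivative along the second axis is 4.
A = np.diag([1.0, 4.0])
quadratic_loss = lambda x: 0.5 * x @ A @ x
curvature = directional_curvature(quadratic_loss, np.array([1.0, 2.0]), np.array([0.0, 1.0]))
print("curvature estimate:", curvature)
```

In this toy setting a perturbation that is small relative to the dominant feature directions is still large relative to the low-variance ones, which is why the score separates the two groups without seeing any attack at fit time.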
Related papers
- Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors [0.0]
An adaptive attack is one where the attacker is aware of the defenses and adapts their strategy accordingly.
Our proposed method leverages adversarial training to reinforce the ability to detect attacks, without compromising clean accuracy.
Experimental evaluations on the CIFAR-10 and SVHN datasets demonstrate that our proposed algorithm significantly improves a detector's ability to accurately identify adaptive adversarial attacks.
arXiv Detail & Related papers (2024-04-18T12:13:09Z)
- Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification [24.501269108193412]
Adversarial examples have proven to threaten speaker identification systems.
We propose a method to detect the presence of adversarial examples.
We also introduce a method for identifying the victim model on which the adversarial attack is carried out.
arXiv Detail & Related papers (2024-02-29T17:06:52Z)
- Malicious Agent Detection for Robust Multi-Agent Collaborative Perception [52.261231738242266]
Multi-agent collaborative (MAC) perception is more vulnerable to adversarial attacks than single-agent perception.
We propose Malicious Agent Detection (MADE), a reactive defense specific to MAC perception.
We conduct comprehensive evaluations on a benchmark 3D dataset V2X-sim and a real-road dataset DAIR-V2X.
arXiv Detail & Related papers (2023-10-18T11:36:42Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against samples crafted by minimally perturbing a sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- On Trace of PGD-Like Adversarial Attacks [77.75152218980605]
Adversarial attacks pose safety and security concerns for deep learning applications.
We construct Adversarial Response Characteristics (ARC) features to reflect the model's gradient consistency.
Our method is intuitive, lightweight, non-intrusive, and data-undemanding.
arXiv Detail & Related papers (2022-05-19T14:26:50Z)
- Adversarial Robustness of Deep Reinforcement Learning based Dynamic Recommender Systems [50.758281304737444]
We propose to explore adversarial examples and attack detection on reinforcement learning-based interactive recommendation systems.
We first craft different types of adversarial examples by adding perturbations to the input and intervening on the causal factors.
Then, we augment recommendation systems by detecting potential attacks with a deep learning-based classifier based on the crafted data.
arXiv Detail & Related papers (2021-12-02T04:12:24Z)
- Using Anomaly Feature Vectors for Detecting, Classifying and Warning of Outlier Adversarial Examples [4.096598295525345]
We present DeClaW, a system for detecting, classifying, and warning of adversarial inputs presented to a classification neural network.
Preliminary findings suggest that AFVs can help distinguish among several types of adversarial attacks with close to 93% accuracy on the CIFAR-10 dataset.
arXiv Detail & Related papers (2021-07-01T16:00:09Z)
- Random Projections for Adversarial Attack Detection [8.684378639046644]
Adversarial attack detection remains a fundamentally challenging problem from two perspectives.
We present a technique that makes use of special properties of random projections, whereby we can characterize the behavior of clean and adversarial examples.
Performance evaluation demonstrates that our technique outperforms ($>0.92$ AUC) competing state of the art (SOTA) attack strategies.
arXiv Detail & Related papers (2020-12-11T15:02:28Z)
- Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification [78.51092318750102]
This work proposes to defend ASV systems against adversarial attacks with a separate detection network.
A VGG-like binary classification detector is introduced and demonstrated to be effective at detecting adversarial samples.
arXiv Detail & Related papers (2020-06-11T04:31:56Z)
- Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks [65.20660287833537]
In this paper we propose two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function.
We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.
arXiv Detail & Related papers (2020-03-03T18:15:55Z)
- Adversarial Detection and Correction by Matching Prediction Distributions [0.0]
The detector almost completely neutralises powerful attacks like Carlini-Wagner or SLIDE on MNIST and Fashion-MNIST.
We show that our method is still able to detect the adversarial examples in the case of a white-box attack where the attacker has full knowledge of both the model and the defence.
arXiv Detail & Related papers (2020-02-21T15:45:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.