UnMask: Adversarial Detection and Defense Through Robust Feature
Alignment
- URL: http://arxiv.org/abs/2002.09576v2
- Date: Sat, 14 Nov 2020 20:21:11 GMT
- Title: UnMask: Adversarial Detection and Defense Through Robust Feature
Alignment
- Authors: Scott Freitas, Shang-Tse Chen, Zijie J. Wang, Duen Horng Chau
- Abstract summary: Deep learning models are being integrated into a wide range of high-impact, security-critical systems, from self-driving cars to medical diagnosis.
Recent research has demonstrated that many of these deep learning architectures are vulnerable to adversarial attacks.
We develop UnMask, an adversarial detection and defense framework based on robust feature alignment.
- Score: 12.245288683492255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models are being integrated into a wide range of high-impact,
security-critical systems, from self-driving cars to medical diagnosis.
However, recent research has demonstrated that many of these deep learning
architectures are vulnerable to adversarial attacks--highlighting the vital
need for defensive techniques to detect and mitigate these attacks before they
occur. To combat these adversarial attacks, we developed UnMask, an adversarial
detection and defense framework based on robust feature alignment. The core
idea behind UnMask is to protect these models by verifying that an image's
predicted class ("bird") contains the expected robust features (e.g., beak,
wings, eyes). For example, if an image is classified as "bird", but the
extracted features are wheel, saddle and frame, the model may be under attack.
UnMask detects such attacks and defends the model by rectifying the
misclassification, re-classifying the image based on its robust features. Our
extensive evaluation shows that UnMask (1) detects up to 96.75% of attacks, and
(2) defends the model by correctly classifying up to 93% of adversarial images
produced by the current strongest attack, Projected Gradient Descent, in the
gray-box setting. UnMask provides significantly better protection than
adversarial training across 8 attack vectors, averaging 31.18% higher accuracy.
We open source the code repository and data with this paper:
https://github.com/safreita1/unmask.
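
To make the core idea concrete, below is a minimal sketch of the detect-and-rectify loop described in the abstract. It assumes a separate robust feature extractor (not shown) that returns a set of part names for an image; the CLASS_FEATURES map, the unmask_check function, the Jaccard scoring, and the 0.5 threshold are illustrative stand-ins rather than the paper's actual API or parameters. See the linked repository for the authors' implementation.

```python
# Minimal sketch of UnMask-style robust feature alignment (illustrative only;
# names, scoring, and threshold are assumptions, not the paper's exact method).
from typing import Dict, Set, Tuple

# Hypothetical expected robust features per class.
CLASS_FEATURES: Dict[str, Set[str]] = {
    "bird": {"beak", "wings", "eyes", "tail"},
    "bicycle": {"wheel", "saddle", "frame", "handlebar"},
}

def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap score between extracted and expected feature sets."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def unmask_check(predicted_class: str,
                 extracted: Set[str],
                 threshold: float = 0.5) -> Tuple[bool, str]:
    """Return (is_adversarial, corrected_class).

    Detection: flag the input when the extracted features align poorly with
    those expected for the predicted class.
    Defense: re-classify by picking the class whose expected features best
    match what was actually extracted.
    """
    score = jaccard(extracted, CLASS_FEATURES.get(predicted_class, set()))
    is_adversarial = score < threshold
    best_match = max(CLASS_FEATURES,
                     key=lambda c: jaccard(extracted, CLASS_FEATURES[c]))
    return is_adversarial, (best_match if is_adversarial else predicted_class)

# Example from the abstract: the classifier says "bird", but the feature
# extractor finds bicycle parts -> flagged as adversarial and re-classified.
print(unmask_check("bird", {"wheel", "saddle", "frame"}))  # (True, 'bicycle')
```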
Related papers
- MASKDROID: Robust Android Malware Detection with Masked Graph Representations [56.09270390096083]
We propose MASKDROID, a powerful detector with a strong discriminative ability to identify malware.
We introduce a masking mechanism into the Graph Neural Network-based framework, forcing MASKDROID to recover the whole input graph.
This strategy enables the model to understand the malicious semantics and learn more stable representations, enhancing its robustness against adversarial attacks.
arXiv Detail & Related papers (2024-09-29T07:22:47Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Improving behavior based authentication against adversarial attack using XAI [3.340314613771868]
We propose an eXplainable AI (XAI) based defense strategy against adversarial attacks in such scenarios.
A feature selector, trained with our method, can be used as a filter in front of the original authenticator.
We demonstrate that our XAI based defense strategy is effective against adversarial attacks and outperforms other defense strategies.
arXiv Detail & Related papers (2024-02-26T09:29:05Z)
- The Best Defense is a Good Offense: Adversarial Augmentation against Adversarial Attacks [91.56314751983133]
$A5$ is a framework that crafts a defensive perturbation to guarantee that any attack on the input at hand will fail.
We show effective on-the-fly defensive augmentation with a robustifier network that ignores the ground truth label.
We also show how to apply $A5$ to create certifiably robust physical objects.
arXiv Detail & Related papers (2023-05-23T16:07:58Z)
- Mask and Restore: Blind Backdoor Defense at Test Time with Masked Autoencoder [57.739693628523]
We propose a framework for blind backdoor defense with a Masked AutoEncoder (BDMAE).
BDMAE detects possible triggers in the token space using image structural similarity and label consistency between the test image and MAE restorations.
Our approach is blind to the model restorations, trigger patterns and image benignity.
arXiv Detail & Related papers (2023-03-27T19:23:33Z)
- MultiRobustBench: Benchmarking Robustness Against Multiple Attacks [86.70417016955459]
We present the first unified framework for considering multiple attacks against machine learning (ML) models.
Our framework is able to model different levels of the learner's knowledge about the test-time adversary.
We evaluate the performance of 16 defended models for robustness against a set of 9 different attack types.
arXiv Detail & Related papers (2023-02-21T20:26:39Z)
- Btech thesis report on adversarial attack detection and purification of adverserially attacked images [0.0]
This thesis report covers the detection and purification of adversarially attacked images.
A deep learning model is trained on training examples for various tasks such as classification and regression.
arXiv Detail & Related papers (2022-05-09T09:24:11Z)
- Automating Defense Against Adversarial Attacks: Discovery of Vulnerabilities and Application of Multi-INT Imagery to Protect Deployed Models [0.0]
We evaluate the use of multi-spectral image arrays and ensemble learners to combat adversarial attacks.
In rough analogy to defending cyber-networks, we combine techniques from both offensive ("red team") and defensive ("blue team") approaches.
arXiv Detail & Related papers (2021-03-29T19:07:55Z)
- Robust SleepNets [7.23389716633927]
In this study, we investigate eye closedness detection to prevent vehicle accidents related to driver disengagements and driver drowsiness.
We develop two models to detect eye closedness: the first on eye images and the second on face images.
We adversarially attack the models with the Projected Gradient Descent, Fast Gradient Sign and DeepFool methods and report the adversarial success rate.
arXiv Detail & Related papers (2021-02-24T20:48:13Z)
- An Empirical Review of Adversarial Defenses [0.913755431537592]
Deep neural networks, which form the basis of such systems, are highly susceptible to a specific class of attacks called adversarial attacks.
Even with minimal computation, an attacker can generate adversarial examples (images or data points that belong to another class but consistently fool the model into misclassifying them as genuine) and undermine the basis of such algorithms.
We present two effective techniques, Dropout and Denoising Autoencoders, and demonstrate their success in preventing such attacks from fooling the model.
arXiv Detail & Related papers (2020-12-10T09:34:41Z)
- Attack Agnostic Adversarial Defense via Visual Imperceptible Bound [70.72413095698961]
This research aims to design a defense model that is robust within a certain bound against both seen and unseen adversarial attacks.
The proposed defense model is evaluated on the MNIST, CIFAR-10, and Tiny ImageNet databases.
The proposed algorithm is attack agnostic, i.e. it does not require any knowledge of the attack algorithm.
arXiv Detail & Related papers (2020-10-25T23:14:26Z)