ODDR: Outlier Detection & Dimension Reduction Based Defense Against
Adversarial Patches
- URL: http://arxiv.org/abs/2311.12084v1
- Date: Mon, 20 Nov 2023 11:08:06 GMT
- Title: ODDR: Outlier Detection & Dimension Reduction Based Defense Against
Adversarial Patches
- Authors: Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif, Bassem
Ouni, Muhammad Shafique
- Abstract summary: Adversarial attacks are a major deterrent to the reliable use of machine learning models.
We introduce Outlier Detection and Dimension Reduction (ODDR), a holistic defense mechanism designed to effectively mitigate patch-based adversarial attacks.
ODDR employs a three-stage pipeline: Fragmentation, Segregation, and Neutralization, providing a model-agnostic solution applicable to both image classification and object detection tasks.
- Score: 4.672978217020929
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks are a major deterrent to the reliable use of machine
learning models. A powerful type of adversarial attack is the patch-based
attack, wherein the adversarial perturbations modify localized patches or
specific areas within an image to deceive the trained machine learning model.
In this paper, we introduce Outlier Detection and Dimension Reduction (ODDR), a
holistic defense mechanism designed to effectively mitigate patch-based
adversarial attacks. In our approach, we posit that input features
corresponding to adversarial patches, whether naturalistic or otherwise,
deviate from the inherent distribution of the remaining image sample and can be
identified as outliers or anomalies. ODDR employs a three-stage pipeline:
Fragmentation, Segregation, and Neutralization, providing a model-agnostic
solution applicable to both image classification and object detection tasks.
The Fragmentation stage parses the samples into chunks for the subsequent
Segregation process. Here, outlier detection techniques identify and segregate
the anomalous features associated with adversarial perturbations. The
Neutralization stage utilizes dimension reduction methods on the outliers to
mitigate the impact of adversarial perturbations without sacrificing pertinent
information necessary for the machine learning task. Extensive testing on
benchmark datasets and state-of-the-art adversarial patches demonstrates the
effectiveness of ODDR. Results show robust accuracies within a small margin of
the clean accuracies (1%-3% for classification and 3%-5% for object detection),
with only a marginal 1%-2% drop in performance on clean samples, thereby
significantly outperforming other defenses.
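The three-stage pipeline described above can be illustrated with a minimal, hypothetical sketch (this is not the authors' implementation: the function names, the variance-based outlier score, and the use of a low-rank SVD reconstruction as the dimension-reduction step are all assumptions for illustration):

```python
import numpy as np

def fragment(image, size):
    """Fragmentation: split an HxW image into non-overlapping size x size chunks."""
    h, w = image.shape[:2]
    return [((i, j), image[i:i + size, j:j + size])
            for i in range(0, h - size + 1, size)
            for j in range(0, w - size + 1, size)]

def segregate(chunks, z_thresh=2.5):
    """Segregation: flag chunks whose summary statistic is a distributional outlier.

    Here the per-chunk standard deviation stands in for a real outlier-detection
    technique; chunks with an extreme z-score are treated as anomalous.
    """
    stats = np.array([c.std() for _, c in chunks])
    z = (stats - stats.mean()) / (stats.std() + 1e-8)
    return [k for k, zk in enumerate(z) if abs(zk) > z_thresh]

def neutralize(image, chunks, outliers, rank=1):
    """Neutralization: replace outlier chunks with a low-rank (SVD) reconstruction,
    suppressing high-frequency perturbation energy while keeping coarse content."""
    out = image.astype(float).copy()
    for k in outliers:
        (i, j), c = chunks[k]
        u, s, vt = np.linalg.svd(c.astype(float), full_matrices=False)
        s[rank:] = 0.0  # keep only the top singular value(s)
        out[i:i + c.shape[0], j:j + c.shape[1]] = u @ np.diag(s) @ vt
    return out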
Related papers
- Anomaly Unveiled: Securing Image Classification against Adversarial
Patch Attacks [3.6275442368775512]
Adversarial patch attacks pose a significant threat to the practical deployment of deep learning systems.
In this paper, we investigate the behavior of adversarial patches as anomalies within the distribution of image information.
Our proposed defense mechanism utilizes a clustering-based technique called DBSCAN to isolate anomalous image segments.
arXiv Detail & Related papers (2024-02-09T08:52:47Z)
- Adversarial Purification of Information Masking
Adversarial attacks generate minuscule, imperceptible perturbations to images to deceive neural networks.
Counteracting these, adversarial purification methods seek to transform adversarial input samples into clean output images to defend against adversarial attacks.
We propose a novel adversarial purification approach named Information Mask Purification (IMPure) to extensively eliminate adversarial perturbations.
arXiv Detail & Related papers (2023-11-26T15:50:19Z)
- Uncertainty-based Detection of Adversarial Attacks in Semantic
Segmentation [16.109860499330562]
We introduce an uncertainty-based approach for the detection of adversarial attacks in semantic segmentation.
We demonstrate the ability of our approach to detect perturbed images across multiple types of adversarial attacks.
arXiv Detail & Related papers (2023-05-22T08:36:35Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks
with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness defends against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Exploring Robustness of Unsupervised Domain Adaptation in Semantic
Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision tasks (e.g., rotation and jigsaw) benefit image tasks such as classification and recognition, they fail to provide the critical supervision signals needed to learn discriminative representations for segmentation tasks.
arXiv Detail & Related papers (2021-05-23T01:50:44Z)
- Towards Adversarial Patch Analysis and Certified Defense against Crowd
Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
arXiv Detail & Related papers (2021-04-22T05:10:55Z)
- Detection of Adversarial Supports in Few-shot Classifiers Using Feature
Preserving Autoencoders and Self-Similarity [89.26308254637702]
We propose a detection strategy to highlight adversarial support sets.
We make use of feature preserving autoencoder filtering and also the concept of self-similarity of a support set to perform this detection.
Our method is attack-agnostic and also the first to explore detection for few-shot classifiers to the best of our knowledge.
arXiv Detail & Related papers (2020-12-09T14:13:41Z)
- Learning to Separate Clusters of Adversarial Representations for Robust
Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by a recently introduced non-robust feature.
In this paper, we consider the non-robust features as a common property of adversarial examples, and we deduce it is possible to find a cluster in representation space corresponding to the property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster and to leverage that distribution for a likelihood-based adversarial detector.
arXiv Detail & Related papers (2020-12-07T07:21:18Z)
- ATRO: Adversarial Training with a Rejection Option [10.36668157679368]
This paper proposes a classification framework with a rejection option to mitigate the performance deterioration caused by adversarial examples.
Applying the adversarial training objective to both a classifier and a rejection function simultaneously, we can choose to abstain from classification when the model has insufficient confidence to classify a test data point.
arXiv Detail & Related papers (2020-10-24T14:05:03Z)
- FADER: Fast Adversarial Example Rejection [19.305796826768425]
Recent defenses have been shown to improve adversarial robustness by detecting anomalous deviations from legitimate training samples at different layer representations.
We introduce FADER, a novel technique for speeding up detection-based methods.
Our experiments show up to a 73x reduction in prototypes compared to the analyzed detectors on MNIST, and up to 50x on CIFAR10.
arXiv Detail & Related papers (2020-10-18T22:00:11Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.