Anomaly Unveiled: Securing Image Classification against Adversarial
Patch Attacks
- URL: http://arxiv.org/abs/2402.06249v1
- Date: Fri, 9 Feb 2024 08:52:47 GMT
- Title: Anomaly Unveiled: Securing Image Classification against Adversarial
Patch Attacks
- Authors: Nandish Chattopadhyay, Amira Guesmi, and Muhammad Shafique
- Abstract summary: Adversarial patch attacks pose a significant threat to the practical deployment of deep learning systems.
In this paper, we investigate the behavior of adversarial patches as anomalies within the distribution of image information.
Our proposed defense mechanism utilizes a clustering-based technique called DBSCAN to isolate anomalous image segments.
- Score: 3.6275442368775512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial patch attacks pose a significant threat to the practical
deployment of deep learning systems. However, existing research primarily
focuses on image pre-processing defenses, which often result in reduced
classification accuracy for clean images and fail to effectively counter
physically feasible attacks. In this paper, we investigate the behavior of
adversarial patches as anomalies within the distribution of image information
and leverage this insight to develop a robust defense strategy. Our proposed
defense mechanism utilizes a clustering-based technique called DBSCAN to
isolate anomalous image segments; this isolation is carried out by a three-stage
pipeline consisting of Segmenting, Isolating, and Blocking phases that identify
and mitigate adversarial noise. Upon identifying adversarial components, we
neutralize them by replacing them with the mean pixel value, surpassing
alternative replacement options. Our model-agnostic defense mechanism is
evaluated across multiple models and datasets, demonstrating its effectiveness
in countering various adversarial patch attacks in image classification tasks.
Our proposed approach significantly improves accuracy, increasing from 38.8%
without the defense to 67.1% with the defense against LaVAN and GoogleAp
attacks, surpassing prominent state-of-the-art methods such as LGS (53.86%)
and Jujutsu (60%).
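As a rough illustration of the segment-isolate-block idea described above, the sketch below clusters per-pixel features with scikit-learn's DBSCAN, treats the smallest dense cluster as the anomalous patch region, and overwrites it with the image's mean pixel value. This is a minimal, hypothetical sketch rather than the authors' implementation; the feature construction, the eps/min_samples values, and the rule for picking the anomalous cluster are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def block_patch(image, eps=0.5, min_samples=50):
    """Hedged sketch of a segment-isolate-block style defense.

    `image` is an HxWx3 float array in [0, 1]. Pixels are embedded as
    (row, col, R, G, B) feature vectors, clustered with DBSCAN, and the
    smallest dense cluster (treated here as the anomalous patch region)
    is overwritten with the image's mean pixel value. These choices are
    illustrative assumptions, not the paper's exact procedure.
    """
    h, w, _ = image.shape
    rows, cols = np.mgrid[0:h, 0:w]
    # Segment: build per-pixel features mixing normalized position and colour.
    feats = np.column_stack([
        rows.ravel() / h,
        cols.ravel() / w,
        image.reshape(-1, 3),
    ])
    # Isolate: density-based clustering; label -1 marks noise points.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(feats)
    cluster_ids = [c for c in np.unique(labels) if c != -1]
    if len(cluster_ids) < 2:
        return image  # nothing stands out from the background
    # Pick the smallest non-noise cluster as the candidate patch.
    # A real defense would apply an anomaly criterion before blocking,
    # so that clean images are left untouched.
    sizes = {c: np.sum(labels == c) for c in cluster_ids}
    patch_id = min(sizes, key=sizes.get)
    mask = (labels == patch_id).reshape(h, w)
    # Block: replace the isolated segment with the mean pixel value.
    cleaned = image.copy()
    cleaned[mask] = image.reshape(-1, 3).mean(axis=0)
    return cleaned
```

Replacing the isolated segment with the mean pixel value mirrors the blocking step described in the abstract, which reports that this choice outperformed alternative replacement options.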
Related papers
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
arXiv Detail & Related papers (2024-06-13T15:55:04Z) - DefensiveDR: Defending against Adversarial Patches using Dimensionality Reduction [4.4100683691177816]
Adversarial patch-based attacks have been shown to be a major deterrent to the reliable use of machine learning models.
We propose DefensiveDR, a practical mechanism that uses a dimensionality reduction technique to thwart such patch-based attacks.
arXiv Detail & Related papers (2023-11-20T22:01:31Z) - ODDR: Outlier Detection & Dimension Reduction Based Defense Against Adversarial Patches [4.4100683691177816]
Adversarial attacks present a significant challenge to the dependable deployment of machine learning models.
We propose Outlier Detection and Dimension Reduction (ODDR), a comprehensive defense strategy to counteract patch-based adversarial attacks.
Our approach is based on the observation that input features corresponding to adversarial patches can be identified as outliers.
arXiv Detail & Related papers (2023-11-20T11:08:06Z) - Uncertainty-based Detection of Adversarial Attacks in Semantic
Segmentation [16.109860499330562]
We introduce an uncertainty-based approach for the detection of adversarial attacks in semantic segmentation.
We demonstrate the ability of our approach to detect perturbed images across multiple types of adversarial attacks.
arXiv Detail & Related papers (2023-05-22T08:36:35Z) - Diffusion Models for Adversarial Purification [69.1882221038846]
Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model.
We propose DiffPure, which uses diffusion models for adversarial purification.
Our method achieves state-of-the-art results, outperforming current adversarial training and adversarial purification methods.
arXiv Detail & Related papers (2022-05-16T06:03:00Z) - ScaleCert: Scalable Certified Defense against Adversarial Patches with
Sparse Superficial Layers [29.658969173796645]
We propose a certified defense methodology that achieves high provable robustness for high-resolution images.
We leverage SIN-based compression techniques to significantly improve certified accuracy.
Our experimental results show that the certified accuracy is increased from 36.3% to 60.4% on the ImageNet dataset.
arXiv Detail & Related papers (2021-10-27T02:05:00Z) - Towards Adversarial Patch Analysis and Certified Defense against Crowd
Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
arXiv Detail & Related papers (2021-04-22T05:10:55Z) - Adversarial Examples Detection beyond Image Space [88.7651422751216]
We find that perturbations and prediction confidence are correlated, which guides us to detect few-perturbation attacks from the perspective of prediction confidence.
We propose a method beyond image space by a two-stream architecture, in which the image stream focuses on the pixel artifacts and the gradient stream copes with the confidence artifacts.
arXiv Detail & Related papers (2021-02-23T09:55:03Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z) - Detecting Patch Adversarial Attacks with Image Residuals [9.169947558498535]
A discriminator is trained to distinguish between clean and adversarial samples.
We show that the obtained residuals act as a digital fingerprint for adversarial attacks.
Results show that the proposed detection method generalizes to previously unseen, stronger attacks.
arXiv Detail & Related papers (2020-02-28T01:28:22Z) - (De)Randomized Smoothing for Certifiable Defense against Patch Attacks [136.79415677706612]
We introduce a certifiable defense against patch attacks that guarantees, for a given image and patch attack size, that no patch attack can change the classification.
Our method is related to the broad class of randomized smoothing robustness schemes.
Our results effectively establish a new state-of-the-art of certifiable defense against patch attacks on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2020-02-25T08:39:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.