SAD: Saliency-based Defenses Against Adversarial Examples
- URL: http://arxiv.org/abs/2003.04820v1
- Date: Tue, 10 Mar 2020 15:55:23 GMT
- Title: SAD: Saliency-based Defenses Against Adversarial Examples
- Authors: Richard Tran, David Patrick, Michael Geyer, Amanda Fernandez
- Abstract summary: Adversarial examples drift model predictions away from the original intent of the network.
In this work, we propose a visual saliency-based approach to cleaning data affected by an adversarial attack.
- Score: 0.9786690381850356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rise in popularity of machine and deep learning models, there is an
increased focus on their vulnerability to malicious inputs. These adversarial
examples drift model predictions away from the original intent of the network
and are a growing concern in practical security. In order to combat these
attacks, neural networks can leverage traditional image processing approaches
or state-of-the-art defensive models to reduce perturbations in the data.
Defensive approaches that take a global approach to noise reduction are
effective against adversarial attacks; however, their lossy approach often
distorts important data within the image. In this work, we propose a visual
saliency-based approach to cleaning data affected by an adversarial attack. Our
model leverages the salient regions of an adversarial image in order to provide
a targeted countermeasure while comparatively reducing loss within the cleaned
images. We measure the accuracy of our model by evaluating the effectiveness of
state-of-the-art saliency methods prior to attack, under attack, and after
application of cleaning methods. We demonstrate the effectiveness of our
proposed approach in comparison with related defenses and against established
adversarial attack methods, across two saliency datasets. Our targeted approach
shows significant improvements in a range of standard statistical and distance
saliency metrics, in comparison with both traditional and state-of-the-art
approaches.
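To make the targeted countermeasure concrete, here is a minimal sketch of saliency-guided cleaning; the saliency map, the threshold, and the choice of a median filter are illustrative assumptions rather than the authors' exact SAD pipeline.

```python
# Minimal sketch (one plausible reading of the abstract, not the authors'
# implementation): denoising is restricted to the region a saliency model
# marks as salient, so the rest of the image is left untouched and overall
# distortion stays lower than with a global filter.
import numpy as np
from scipy.ndimage import median_filter

def saliency_guided_clean(image, saliency, threshold=0.5, size=3):
    """image: (H, W, C) floats in [0, 1]; saliency: (H, W) floats in [0, 1]."""
    denoised = median_filter(image, size=(size, size, 1))  # local noise removal
    salient = (saliency >= threshold)[..., None]           # region to clean
    return np.where(salient, denoised, image)

# Toy usage with random arrays standing in for an attacked image and its map.
rng = np.random.default_rng(0)
cleaned = saliency_guided_clean(rng.random((64, 64, 3)), rng.random((64, 64)))
print(cleaned.shape)  # (64, 64, 3)
```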
Related papers
- MirrorCheck: Efficient Adversarial Defense for Vision-Language Models [55.73581212134293]
We propose a novel, yet elegantly simple approach for detecting adversarial samples in Vision-Language Models.
Our method leverages Text-to-Image (T2I) models to generate images based on captions produced by target VLMs.
Empirical evaluations conducted on different datasets validate the efficacy of our approach.
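A hedged sketch of the detection idea summarized above; caption_fn, t2i_fn, embed_fn, and the threshold are hypothetical stand-ins for the target VLM's captioner, a text-to-image generator, and an image encoder, not the authors' API.

```python
# Hedged sketch of the MirrorCheck idea (not the authors' code): caption the
# input with the target VLM, regenerate an image from that caption with a T2I
# model, and flag the input when the two images are dissimilar in embedding space.
import numpy as np

def mirrorcheck_detect(image, caption_fn, t2i_fn, embed_fn, threshold=0.8):
    caption = caption_fn(image)            # caption produced by the target VLM
    reconstruction = t2i_fn(caption)       # image regenerated from the caption
    a, b = embed_fn(image), embed_fn(reconstruction)
    cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return cosine < threshold              # low similarity -> likely adversarial
```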
arXiv Detail & Related papers (2024-06-13T15:55:04Z)
- Mutual-modality Adversarial Attack with Semantic Perturbation [81.66172089175346]
We propose a novel approach that generates adversarial attacks in a mutual-modality optimization scheme.
Our approach outperforms state-of-the-art attack methods and can be readily deployed as a plug-and-play solution.
arXiv Detail & Related papers (2023-12-20T05:06:01Z)
- Enhancing Adversarial Attacks: The Similar Target Method [6.293148047652131]
Deep neural networks are vulnerable to adversarial examples, posing a threat to the models' applications and raising security concerns.
We propose a similar targeted attack method named Similar Target (ST).
arXiv Detail & Related papers (2023-08-21T14:16:36Z)
- Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z)
- Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise.
Pre-processing methods may suffer from the robustness degradation effect.
A potential cause of this negative effect is that adversarial training examples are static and independent of the pre-processing model.
We propose a method called Joint Adversarial Training based Pre-processing (JATP) defense.
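A minimal sketch of joint adversarial training for a pre-processing defense in the spirit of the JATP summary above; the PGD inner loop, the step sizes, and the assumption that `optimizer` holds only the pre-processor's parameters (with the classifier frozen) are mine, not the paper's.

```python
# Hedged sketch of a JATP-style training step (not the authors' procedure):
# adversarial examples are re-crafted against the current pre-processor +
# classifier pipeline at every step instead of being generated once and reused.
import torch

def jatp_step(preprocessor, classifier, x, y, optimizer,
              eps=8 / 255, alpha=2 / 255, steps=5):
    loss_fn = torch.nn.CrossEntropyLoss()
    # Inner loop: craft adversarial inputs through the full pipeline.
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(classifier(preprocessor(x_adv)), y)
        g, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * g.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    # Outer step: update the pre-processor so the cleaned adversarial input
    # is classified correctly (the classifier is assumed frozen).
    optimizer.zero_grad()
    loss_fn(classifier(preprocessor(x_adv)), y).backward()
    optimizer.step()
```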
arXiv Detail & Related papers (2021-06-10T01:45:32Z)
- Adversarial Examples Detection beyond Image Space [88.7651422751216]
We find that there exists compliance between perturbations and prediction confidence, which guides us to detect few-perturbation attacks from the aspect of prediction confidence.
We propose a method beyond image space by a two-stream architecture, in which the image stream focuses on the pixel artifacts and the gradient stream copes with the confidence artifacts.
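A hedged sketch of a two-stream detector in the spirit of the entry above; the layer sizes and the use of a per-pixel gradient map as the second input are illustrative guesses, not the authors' architecture.

```python
# Hedged sketch (not the authors' model): one branch consumes the raw image,
# the other a gradient/confidence map, and the fused features feed a binary
# clean-vs-adversarial classifier.
import torch
import torch.nn as nn

class TwoStreamDetector(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.image_stream = branch(channels)     # pixel artifacts
        self.gradient_stream = branch(channels)  # confidence/gradient artifacts
        self.head = nn.Linear(64, 2)             # clean vs. adversarial logits

    def forward(self, image, gradient_map):
        feats = torch.cat([self.image_stream(image),
                           self.gradient_stream(gradient_map)], dim=1)
        return self.head(feats)
```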
arXiv Detail & Related papers (2021-02-23T09:55:03Z)
- Adversarial example generation with AdaBelief Optimizer and Crop Invariance [8.404340557720436]
Adversarial attacks can be an important method to evaluate and select robust models in safety-critical applications.
We propose AdaBelief Iterative Fast Gradient Method (ABI-FGM) and Crop-Invariant attack Method (CIM) to improve the transferability of adversarial examples.
Our method has higher success rates than state-of-the-art gradient-based attack methods.
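A hedged reconstruction of an AdaBelief-style iterative fast gradient attack, based only on the name and summary above; the hyperparameters and the exact use of the moment estimates are assumptions, not the authors' algorithm.

```python
# Hedged sketch (not the authors' code): the raw gradient is replaced by
# AdaBelief first- and second-moment estimates before the usual sign step
# inside an L-infinity ball around the original input.
import torch

def abi_fgm(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10,
            beta1=0.9, beta2=0.999, delta=1e-8):
    loss_fn = torch.nn.CrossEntropyLoss()
    x_adv = x.clone().detach()
    m = torch.zeros_like(x)  # first moment of the gradient
    s = torch.zeros_like(x)  # AdaBelief "belief": variance of (g - m)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        g, = torch.autograd.grad(loss, x_adv)
        m = beta1 * m + (1 - beta1) * g
        s = beta2 * s + (1 - beta2) * (g - m) ** 2
        x_adv = x_adv.detach() + alpha * (m / (s.sqrt() + delta)).sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```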
arXiv Detail & Related papers (2021-02-07T06:00:36Z)
- Detection Defense Against Adversarial Attacks with Saliency Map [7.736844355705379]
It is well established that neural networks are vulnerable to adversarial examples, which are almost imperceptible to human vision.
Existing defenses tend to harden the robustness of models against adversarial attacks.
We propose a novel method combined with additional noises and utilize the inconsistency strategy to detect adversarial examples.
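A minimal sketch of an inconsistency-based detector in the spirit of the entry above; the noise scale, trial count, and voting rule are illustrative choices, not the paper's method.

```python
# Hedged sketch (not the authors' method): add small random noise several times
# and flag inputs whose predicted label flips often, since adversarial examples
# tend to be less stable under such perturbations than clean ones.
import torch

def inconsistency_detect(model, x, sigma=0.03, trials=8):
    with torch.no_grad():
        base = model(x).argmax(dim=1)
        flips = torch.zeros(x.size(0), device=x.device)
        for _ in range(trials):
            noisy = (x + sigma * torch.randn_like(x)).clamp(0, 1)
            flips += (model(noisy).argmax(dim=1) != base).float()
    return flips / trials > 0.5  # True -> flagged as adversarial
```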
arXiv Detail & Related papers (2020-09-06T13:57:17Z)
- Temporal Sparse Adversarial Attack on Sequence-based Gait Recognition [56.844587127848854]
We demonstrate that the state-of-the-art gait recognition model is vulnerable to such attacks.
We employ a generative adversarial network based architecture to semantically generate adversarial high-quality gait silhouettes or video frames.
The experimental results show that if only one-fortieth of the frames are attacked, the accuracy of the target model drops dramatically.
arXiv Detail & Related papers (2020-02-22T10:08:42Z)
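For the gait-recognition entry above, a simplified sketch of a temporally sparse perturbation; plain gradient-sign noise on a random subset of frames stands in for the paper's GAN-based generator, so this illustrates the sparsity idea only.

```python
# Hedged sketch (not the authors' GAN-based method): only roughly one-fortieth
# of the frames in a gait sequence receive any perturbation.
import torch

def sparse_frame_attack(model, seq, y, frac=1 / 40, eps=8 / 255):
    """seq: (B, T, C, H, W) gait frames in [0, 1]; y: (B,) labels."""
    T = seq.size(1)
    k = max(1, round(T * frac))
    frames = torch.randperm(T)[:k]                  # frames chosen for attack
    seq_adv = seq.clone().detach().requires_grad_(True)
    loss = torch.nn.functional.cross_entropy(model(seq_adv), y)
    g, = torch.autograd.grad(loss, seq_adv)
    mask = torch.zeros_like(seq)
    mask[:, frames] = 1.0                           # perturb only selected frames
    return (seq.detach() + eps * g.sign() * mask).clamp(0, 1)
```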
This list is automatically generated from the titles and abstracts of the papers on this site.