Detecting Localized Adversarial Examples: A Generic Approach using
Critical Region Analysis
- URL: http://arxiv.org/abs/2102.05241v1
- Date: Wed, 10 Feb 2021 03:31:16 GMT
- Title: Detecting Localized Adversarial Examples: A Generic Approach using
Critical Region Analysis
- Authors: Fengting Li, Xuankai Liu, Xiaoli Zhang, Qi Li, Kun Sun, Kang Li
- Abstract summary: We propose a generic defense system called TaintRadar to accurately detect localized adversarial examples.
Compared with existing defense solutions, TaintRadar can effectively capture sophisticated localized partial attacks.
Comprehensive experiments have been conducted in both digital and physical worlds to verify the effectiveness and robustness of our defense.
- Score: 19.352676977713966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have been applied in a wide range of
applications, e.g., face recognition and image classification; however, they are
vulnerable to adversarial examples. By adding a small amount of imperceptible
perturbations, an attacker can easily manipulate the outputs of a DNN.
Particularly, the localized adversarial examples only perturb a small and
contiguous region of the target object, so that they are robust and effective in
both digital and physical worlds. Although the localized adversarial examples
have more severe real-world impacts than traditional pixel attacks, they have
not been well addressed in the literature. In this paper, we propose a generic
defense system called TaintRadar to accurately detect localized adversarial
examples via analyzing critical regions that have been manipulated by
attackers. The main idea is that when removing critical regions from input
images, the ranking changes of adversarial labels will be larger than those of
benign labels. Compared with existing defense solutions, TaintRadar can
effectively capture sophisticated localized partial attacks, e.g., the
eye-glasses attack, while not requiring additional training or fine-tuning of
the original model's structure. Comprehensive experiments have been conducted in
both digital and physical worlds to verify the effectiveness and robustness of
our defense.
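The ranking-change idea in the abstract can be sketched as follows. This is a minimal illustration under assumed details (a placeholder candidate-region mask and an arbitrary rank-change threshold), not the authors' released implementation.

```python
# Minimal sketch of the ranking-change idea described in the abstract: mask a
# candidate critical region, re-run the classifier, and flag the input when the
# originally predicted label drops far in the ranking. The region proposal and
# the threshold are placeholders, not the paper's method.
import torch

def rank_of_label(logits: torch.Tensor, label: int) -> int:
    """Position of `label` when classes are sorted by descending score."""
    order = torch.argsort(logits, descending=True)
    return int((order == label).nonzero(as_tuple=True)[0])

def looks_adversarial(model, image: torch.Tensor, region_mask: torch.Tensor,
                      rank_drop_threshold: int = 5) -> bool:
    """image: (1, C, H, W); region_mask: (1, 1, H, W) with 1s on the candidate
    critical region. Returns True if masking the region changes the rank of the
    originally predicted label by more than `rank_drop_threshold`."""
    model.eval()
    with torch.no_grad():
        logits = model(image)[0]
        top_label = int(torch.argmax(logits))
        masked = image * (1.0 - region_mask)   # remove the candidate region
        masked_logits = model(masked)[0]
    rank_change = rank_of_label(masked_logits, top_label) - rank_of_label(logits, top_label)
    return rank_change > rank_drop_threshold
```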
Related papers
- Detecting Adversarial Examples [24.585379549997743]
We propose a novel method to detect adversarial examples by analyzing the layer outputs of Deep Neural Networks.
Our method is highly effective, compatible with any DNN architecture, and applicable across different domains, such as image, video, and audio.
arXiv Detail & Related papers (2024-10-22T21:42:59Z)
- Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds [62.94859179323329]
Adversarial attack methods based on point manipulation for 3D point cloud classification have revealed the fragility of 3D models.
We propose a novel shape-based adversarial attack method, HiT-ADV, which conducts a two-stage search for attack regions based on saliency and imperceptibility perturbation scores.
We propose that by employing benign resampling and benign rigid transformations, we can further enhance physical adversarial strength with little sacrifice to imperceptibility.
arXiv Detail & Related papers (2024-03-08T12:08:06Z)
- Unfolding Local Growth Rate Estimates for (Almost) Perfect Adversarial Detection [22.99930028876662]
Convolutional neural networks (CNNs) define the state-of-the-art solution on many perceptual tasks.
Current CNN approaches largely remain vulnerable to adversarial perturbations of the input that have been crafted specifically to fool the system.
We propose a simple and light-weight detector, which leverages recent findings on the relation between networks' local intrinsic dimensionality (LID) and adversarial attacks.
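The LID-based detector summarized above builds on estimates of local intrinsic dimensionality from nearest-neighbour distances. The routine below is a common maximum-likelihood LID estimate, included as an assumed stand-in rather than the paper's exact feature extraction.

```python
# Hedged sketch of a standard maximum-likelihood LID estimate from k-nearest-
# neighbour distances; detectors of this kind build features on top of such
# estimates per layer, but this exact routine is an assumption.
import numpy as np

def lid_mle(x: np.ndarray, reference: np.ndarray, k: int = 20) -> float:
    """x: (d,) activation vector; reference: (n, d) batch of activations."""
    dists = np.linalg.norm(reference - x, axis=1)
    dists = np.sort(dists)
    dists = dists[dists > 1e-12][:k]      # drop x itself if present in the batch
    r_max = dists[-1]                     # distance to the k-th neighbour
    return -1.0 / np.mean(np.log(dists / r_max))
```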
arXiv Detail & Related papers (2022-12-13T17:51:32Z)
- On Trace of PGD-Like Adversarial Attacks [77.75152218980605]
Adversarial attacks pose safety and security concerns for deep learning applications.
We construct Adversarial Response Characteristics (ARC) features to reflect the model's gradient consistency.
Our method is intuitive, lightweight, non-intrusive, and data-undemanding.
arXiv Detail & Related papers (2022-05-19T14:26:50Z)
- On the Real-World Adversarial Robustness of Real-Time Semantic Segmentation Models for Autonomous Driving [59.33715889581687]
The existence of real-world adversarial examples (commonly in the form of patches) poses a serious threat for the use of deep learning models in safety-critical computer vision tasks.
This paper presents an evaluation of the robustness of semantic segmentation models when attacked with different types of adversarial patches.
A novel loss function is proposed to improve the capabilities of attackers in inducing a misclassification of pixels.
arXiv Detail & Related papers (2022-01-05T22:33:43Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Detecting Adversarial Examples by Input Transformations, Defense Perturbations, and Voting [71.57324258813674]
Convolutional neural networks (CNNs) have proved to reach super-human performance in visual recognition tasks.
CNNs can easily be fooled by adversarial examples, i.e., maliciously-crafted images that force the networks to predict an incorrect output.
This paper extensively explores the detection of adversarial examples via image transformations and proposes a novel methodology.
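As a rough illustration of detection by input transformations and voting, the sketch below re-classifies a few mildly transformed copies of the input and flags it when too many predictions disagree with the original; the specific transformations and the voting threshold are assumptions, not the paper's pipeline.

```python
# Rough sketch of detection via input transformations and voting: apply several
# mild transformations, re-classify each variant, and flag the input when the
# transformed predictions disagree with the original one too often.
import torch
import torchvision.transforms.functional as TF

def transformed_variants(image: torch.Tensor):
    """image: (1, C, H, W) in [0, 1]. Yields transformed copies."""
    yield TF.hflip(image)
    yield TF.adjust_brightness(image, 1.1)
    yield TF.gaussian_blur(image, kernel_size=3)
    yield TF.rotate(image, angle=5.0)

def flagged_by_voting(model, image: torch.Tensor, max_disagreements: int = 1) -> bool:
    model.eval()
    with torch.no_grad():
        original_label = int(torch.argmax(model(image)[0]))
        disagreements = sum(
            int(torch.argmax(model(variant)[0])) != original_label
            for variant in transformed_variants(image)
        )
    return disagreements > max_disagreements
```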
arXiv Detail & Related papers (2021-01-27T14:50:41Z)
- Miss the Point: Targeted Adversarial Attack on Multiple Landmark Detection [29.83857022733448]
This paper is the first to study how fragile a CNN-based model for multiple landmark detection is to adversarial perturbations.
We propose a novel Adaptive Targeted Iterative FGSM attack against the state-of-the-art models in multiple landmark detection.
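For background, a generic targeted iterative FGSM loop is sketched below; the paper's adaptive components are not reproduced, and the use of an MSE loss toward a target output is an assumed stand-in for a landmark model's objective.

```python
# Generic targeted iterative FGSM loop as background for the attack named above;
# the "adaptive" step-size/target logic of the paper is not reproduced, and MSE
# toward a target output is an assumed stand-in for a landmark model's loss.
import torch

def targeted_iterative_fgsm(model, image, target_output, epsilon=8 / 255,
                            step_size=1 / 255, num_steps=10):
    """Moves `image` so the model's output approaches `target_output`, keeping
    the perturbation within an L-infinity ball of radius `epsilon`."""
    adv = image.clone().detach()
    for _ in range(num_steps):
        adv.requires_grad_(True)
        loss = torch.nn.functional.mse_loss(model(adv), target_output)
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv - step_size * grad.sign()           # descend toward the target
            adv = image + (adv - image).clamp(-epsilon, epsilon)
            adv = adv.clamp(0.0, 1.0)
    return adv.detach()
```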
arXiv Detail & Related papers (2020-07-10T07:58:35Z)
- Regional Image Perturbation Reduces $L_p$ Norms of Adversarial Examples While Maintaining Model-to-model Transferability [3.578666449629947]
We show that effective regional perturbations can be generated without resorting to complex methods.
We develop a very simple regional adversarial perturbation attack method using cross-entropy sign.
Our experiments on ImageNet with multiple models reveal that, on average, $76\%$ of the generated adversarial examples maintain model-to-model transferability.
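The "cross-entropy sign" regional perturbation can be illustrated as a single sign-gradient step applied only inside a region mask; the mask and step size below are placeholder choices, not the paper's settings.

```python
# Rough illustration of a regional sign-gradient perturbation: take the sign of
# the cross-entropy gradient but apply it only inside a region mask, leaving the
# rest of the image untouched. The mask and step size are placeholder choices.
import torch
import torch.nn.functional as F

def regional_sign_perturbation(model, image, label, region_mask, epsilon=16 / 255):
    """image: (1, C, H, W); label: (1,) true class index; region_mask:
    (1, 1, H, W) with 1s inside the perturbed region."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    grad = torch.autograd.grad(loss, image)[0]
    adv = image + epsilon * grad.sign() * region_mask   # perturb only the region
    return adv.clamp(0.0, 1.0).detach()
```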
arXiv Detail & Related papers (2020-07-07T04:33:16Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)