Detecting Adversarial Examples
- URL: http://arxiv.org/abs/2410.17442v1
- Date: Tue, 22 Oct 2024 21:42:59 GMT
- Title: Detecting Adversarial Examples
- Authors: Furkan Mumcu, Yasin Yilmaz
- Abstract summary: We propose a novel method to detect adversarial examples by analyzing the layer outputs of Deep Neural Networks.
Our method is highly effective, compatible with any DNN architecture, and applicable across different domains, such as image, video, and audio.
- Score: 24.585379549997743
- Abstract: Deep Neural Networks (DNNs) have been shown to be vulnerable to adversarial examples. While numerous successful adversarial attacks have been proposed, defenses against these attacks remain relatively understudied. Existing defense approaches either focus on negating the effects of perturbations caused by the attacks to restore the DNNs' original predictions or use a secondary model to detect adversarial examples. However, these methods often become ineffective due to the continuous advancements in attack techniques. We propose a novel universal and lightweight method to detect adversarial examples by analyzing the layer outputs of DNNs. Through theoretical justification and extensive experiments, we demonstrate that our detection method is highly effective, compatible with any DNN architecture, and applicable across different domains, such as image, video, and audio.
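The abstract describes the detection idea only at a high level (analyzing the layer outputs of DNNs), so the following is a minimal sketch of that general idea rather than the authors' actual algorithm: the hook-based feature extraction, the per-layer z-score of activation norms, and the quantile-based threshold are illustrative assumptions.
```python
# Minimal sketch (PyTorch): flag an input as adversarial when its intermediate
# layer outputs deviate strongly from statistics collected on clean data.
# The concrete statistic (per-layer z-score of mean activation norms) is an
# assumption for illustration, not the method proposed in the paper.
import torch
import torch.nn as nn


class LayerOutputDetector:
    def __init__(self, model: nn.Module, layers: list[nn.Module]):
        self.model = model.eval()
        self.activations: list[float] = []
        for layer in layers:
            # Collect one scalar summary per monitored layer on every forward pass.
            layer.register_forward_hook(self._hook)
        self.mean = self.std = self.threshold = None

    def _hook(self, module, inputs, output):
        # Mean L2 norm of the layer's activations for the current batch.
        self.activations.append(output.flatten(1).norm(dim=1).mean().item())

    def _layer_stats(self, x: torch.Tensor) -> torch.Tensor:
        self.activations = []
        with torch.no_grad():
            self.model(x)
        return torch.tensor(self.activations)

    def fit(self, clean_batches, quantile: float = 0.95) -> None:
        # Estimate per-layer statistics and a detection threshold from clean batches.
        stats = torch.stack([self._layer_stats(x) for x in clean_batches])
        self.mean, self.std = stats.mean(dim=0), stats.std(dim=0) + 1e-8
        scores = ((stats - self.mean) / self.std).abs().sum(dim=1)
        self.threshold = scores.quantile(quantile).item()

    def is_adversarial(self, x: torch.Tensor) -> bool:
        # Flag inputs whose layer outputs deviate strongly from clean statistics.
        score = ((self._layer_stats(x) - self.mean) / self.std).abs().sum().item()
        return score > self.threshold
```
Under the same assumptions, one would wrap an off-the-shelf classifier (e.g., a torchvision ResNet), pass a few of its stages as `layers`, call `fit` on several clean batches, and then query `is_adversarial` per input; the paper itself reports a universal, lightweight detector applicable to image, video, and audio models.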
Related papers
- Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM [5.592360872268223]
Defense strategies usually train deep neural networks (DNNs) against a specific adversarial attack method and can achieve good robustness against that type of attack.
However, empirical evidence shows that their robustness deteriorates sharply when they are evaluated against unfamiliar attack modalities.
Most defense methods also sacrifice accuracy on clean examples in order to improve the adversarial robustness of DNNs.
arXiv Detail & Related papers (2024-03-18T03:54:01Z)
- AFLOW: Developing Adversarial Examples under Extremely Noise-limited Settings [7.828994881163805]
Deep neural networks (DNNs) are vulnerable to adversarial attacks.
We propose a novel normalizing-flow-based end-to-end attack framework, called AFLOW, to synthesize imperceptible adversarial examples.
Compared with existing methods, AFLOW exhibits superior imperceptibility, image quality, and attack capability.
arXiv Detail & Related papers (2023-10-15T10:54:07Z)
- Adversarial Examples Detection with Enhanced Image Difference Features based on Local Histogram Equalization [20.132066800052712]
We propose an adversarial example detection framework based on a high-frequency information enhancement strategy.
This framework can effectively extract and amplify the feature differences between adversarial examples and normal examples.
arXiv Detail & Related papers (2023-05-08T03:14:01Z)
- Adversarial Example Defense via Perturbation Grading Strategy [17.36107815256163]
Deep Neural Networks (DNNs) have been widely used in many fields.
However, adversarial examples with tiny perturbations can greatly mislead the predictions of DNNs.
Researchers have proposed various defense methods to protect DNNs.
This paper assigns different defense strategies to adversarial perturbations of different strengths by grading the perturbations on the input examples.
arXiv Detail & Related papers (2022-12-16T08:35:21Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, is proven to be the most effective defense strategy.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- ADC: Adversarial attacks against object Detection that evade Context consistency checks [55.8459119462263]
We show that even context consistency checks can be brittle to properly crafted adversarial examples.
We propose an adaptive framework to generate examples that subvert such defenses.
Our results suggest that robustly modeling context and checking its consistency is still an open problem.
arXiv Detail & Related papers (2021-10-24T00:25:09Z)
- TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z)
- Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
- Towards Defending against Adversarial Examples via Attack-Invariant Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples during training.
However, models trained on seen types of adversarial examples generally cannot generalize well to unseen types of adversarial examples.
arXiv Detail & Related papers (2021-06-09T12:49:54Z)
- Adversarial Examples Detection beyond Image Space [88.7651422751216]
We find that there exists a correspondence between perturbations and prediction confidence, which guides us to detect few-perturbation attacks from the perspective of prediction confidence.
We propose a method beyond image space by a two-stream architecture, in which the image stream focuses on the pixel artifacts and the gradient stream copes with the confidence artifacts.
arXiv Detail & Related papers (2021-02-23T09:55:03Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.