Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM
- URL: http://arxiv.org/abs/2403.11448v1
- Date: Mon, 18 Mar 2024 03:54:01 GMT
- Title: Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM
- Authors: Linyu Tang, Lei Zhang
- Abstract summary: Defense strategies usually train deep neural networks (DNNs) against a specific adversarial attack method and can achieve good robustness against that type of attack.
However, empirical evidence reveals a pronounced deterioration in the robustness of DNNs when they are evaluated against unfamiliar attack modalities.
Most defense methods often sacrifice the accuracy of clean examples in order to improve the adversarial robustness of DNNs.
- Score: 5.592360872268223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Numerous studies have demonstrated the susceptibility of deep neural networks (DNNs) to subtle adversarial perturbations, prompting the development of many advanced adversarial defense methods aimed at mitigating adversarial attacks. Current defense strategies usually train DNNs for a specific adversarial attack method and can achieve good robustness in defense against this type of adversarial attack. Nevertheless, empirical evidence reveals a pronounced deterioration in the robustness of DNNs when they are evaluated against unfamiliar attack modalities. Meanwhile, there is a trade-off between the classification accuracy of clean examples and adversarial examples. Most defense methods often sacrifice the accuracy of clean examples in order to improve the adversarial robustness of DNNs. To alleviate these problems and enhance the overall robust generalization of DNNs, we propose the Test-Time Pixel-Level Adversarial Purification (TPAP) method. This approach is based on the robust overfitting characteristic of DNNs to the fast gradient sign method (FGSM) on training and test datasets. It uses FGSM for adversarial purification, processing images at test time to purify unknown adversarial perturbations at the pixel level in a "counter changes with changelessness" manner, thereby enhancing the defense capability of DNNs against various unknown adversarial attacks. Extensive experimental results show that our method can effectively improve the overall robust generalization of DNNs, notably outperforming previous methods.
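To make the purification idea above concrete, the following is a minimal sketch of an FGSM-based test-time purification step in the spirit of TPAP, not the authors' released implementation: the PyTorch interface, the value of epsilon, and the use of the model's own prediction as a pseudo-label are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def fgsm_purify(model, x, epsilon=8 / 255):
    """Sketch of FGSM-based pixel-level purification at test time.

    A single FGSM perturbation, computed against the model's own predicted
    label, is added to the incoming image so that any unknown adversarial
    noise is overwritten by an FGSM pattern the robustly-overfitted model
    already handles well. Loss choice, label choice, and epsilon are
    illustrative assumptions, not the paper's official settings.
    """
    model.eval()
    x = x.clone().detach().requires_grad_(True)

    logits = model(x)
    pseudo_labels = logits.argmax(dim=1)           # no ground-truth labels at test time
    loss = F.cross_entropy(logits, pseudo_labels)
    grad = torch.autograd.grad(loss, x)[0]

    # "Counter changes with changelessness": replace unknown perturbations
    # with a known FGSM perturbation, then clamp to the valid image range.
    x_purified = torch.clamp(x + epsilon * grad.sign(), 0.0, 1.0)
    return x_purified.detach()

# Usage: classify the purified image instead of the raw input, e.g.
#   preds = model(fgsm_purify(model, images)).argmax(dim=1)
```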
Related papers
- Detecting Adversarial Examples [24.585379549997743]
We propose a novel method to detect adversarial examples by analyzing the layer outputs of Deep Neural Networks.
Our method is highly effective, compatible with any DNN architecture, and applicable across different domains, such as image, video, and audio.
arXiv Detail & Related papers (2024-10-22T21:42:59Z) - Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks (a minimal sketch of this selection step appears after this list).
arXiv Detail & Related papers (2023-10-08T18:57:36Z) - Not So Robust After All: Evaluating the Robustness of Deep Neural Networks to Unseen Adversarial Attacks [5.024667090792856]
Deep neural networks (DNNs) have gained prominence in various applications, such as classification, recognition, and prediction.
A fundamental attribute of traditional DNNs is their vulnerability to modifications in input data, which has resulted in the investigation of adversarial attacks.
This study aims to challenge the efficacy and generalization of contemporary defense mechanisms against adversarial attacks.
arXiv Detail & Related papers (2023-08-12T05:21:34Z) - Denoising Autoencoder-based Defensive Distillation as an Adversarial Robustness Algorithm [0.0]
Adversarial attacks significantly threaten the robustness of deep neural networks (DNNs).
This work proposes a novel method that combines the defensive distillation mechanism with a denoising autoencoder (DAE).
arXiv Detail & Related papers (2023-03-28T11:34:54Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training is proven to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - A Mask-Based Adversarial Defense Scheme [3.759725391906588]
Adversarial attacks hamper the functionality and accuracy of Deep Neural Networks (DNNs).
We propose a new Mask-based Adversarial Defense scheme (MAD) for DNNs to mitigate the negative effect from adversarial attacks.
arXiv Detail & Related papers (2022-04-21T12:55:27Z) - Model-Agnostic Meta-Attack: Towards Reliable Evaluation of Adversarial Robustness [53.094682754683255]
We propose a Model-Agnostic Meta-Attack (MAMA) approach to discover stronger attack algorithms automatically.
Our method learns the optimizer in adversarial attacks, parameterized by a recurrent neural network.
We develop a model-agnostic training algorithm to improve the generalization ability of the learned optimizer when attacking unseen defenses.
arXiv Detail & Related papers (2021-10-13T13:54:24Z) - Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z) - Towards Adversarial Patch Analysis and Certified Defense against Crowd Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
arXiv Detail & Related papers (2021-04-22T05:10:55Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN)-based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
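Regarding the Confidence-driven Sampling for Backdoor Attacks entry above, the selection rule it describes (choosing low-confidence training samples as poisoning candidates) could be sketched roughly as follows; the PyTorch interface, the top-k rule, and the assumption of a non-shuffled data loader are illustrative choices, not taken from that paper.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def lowest_confidence_indices(model, loader, k, device="cpu"):
    """Rank training samples by softmax confidence and return the indices of
    the k least-confident ones (candidate samples to poison). Assumes a
    non-shuffled DataLoader so batch positions map back to dataset indices;
    the ranking rule and k are illustrative assumptions."""
    model.eval()
    confs, idxs = [], []
    for batch_idx, (images, _) in enumerate(loader):
        probs = F.softmax(model(images.to(device)), dim=1)
        conf, _ = probs.max(dim=1)                 # confidence of the predicted class
        confs.append(conf.cpu())
        start = batch_idx * loader.batch_size
        idxs.append(torch.arange(start, start + images.size(0)))
    confs, idxs = torch.cat(confs), torch.cat(idxs)
    order = torch.argsort(confs)                   # ascending: least confident first
    return idxs[order[:k]].tolist()
```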
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.