Related papers: Are classical deep neural networks weakly adversarially robust?

URL: http://arxiv.org/abs/2506.02016v1
Date: Wed, 28 May 2025 06:58:05 GMT
Title: Are classical deep neural networks weakly adversarially robust?
Authors: Nuolin Sun, Linyuan Wang, Dongyang Li, Bin Yan, Lei Li
Abstract summary: Adversarial attacks have received increasing attention and it has been widely recognized that classical DNNs have weak adversarial robustness. We propose a method for adversarial example detection and image recognition that uses layer-wise features to construct feature paths. Compared to the adversarial training method with 77.64% clean accuracy and 52.94% adversarial accuracy, our method exhibits a trade-off without relying on computationally expensive defense strategies.
Score: 14.11659285300135
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adversarial attacks have received increasing attention and it has been widely recognized that classical DNNs have weak adversarial robustness. The most commonly used adversarial defense method, adversarial training, improves the adversarial accuracy of DNNs by generating adversarial examples and retraining the model. However, adversarial training incurs significant computational overhead. In this paper, inspired by existing studies focusing on the clustering properties of DNN output features at each layer and the Progressive Feedforward Collapse phenomenon, we propose a method for adversarial example detection and image recognition that uses layer-wise features to construct feature paths and computes the correlation between the example's feature paths and the class-centered feature paths. Experimental results show that the recognition method achieves 82.77% clean accuracy and 44.17% adversarial accuracy on the ResNet-20 with PFC. Compared to the adversarial training method with 77.64% clean accuracy and 52.94% adversarial accuracy, our method exhibits a trade-off without relying on computationally expensive defense strategies. Furthermore, on the standard ResNet-18, our method maintains this advantage with respective metrics of 80.01% and 46.1%. This result reveals inherent adversarial robustness in DNNs, challenging the conventional understanding of the weak adversarial robustness in DNNs.
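
Illustrative sketch (not the authors' implementation): the abstract describes building a feature path from layer-wise features and correlating it with class-centered feature paths. The toy example below shows one way that idea could look. It assumes a feature path is the concatenation of per-layer feature vectors, that a class center is the element-wise mean of its members' paths, that the correlation is Pearson correlation, and that low correlation can be used as a detection signal. All function names, the toy data, and the `detect_threshold` value are hypothetical.

```python
from math import sqrt
from collections import defaultdict


def feature_path(layer_features):
    """Flatten per-layer feature vectors into one path vector (assumed layout)."""
    return [v for layer in layer_features for v in layer]


def class_centered_paths(paths, labels):
    """Average the feature paths of each class, element by element."""
    grouped = defaultdict(list)
    for path, label in zip(paths, labels):
        grouped[label].append(path)
    return {
        label: [sum(col) / len(col) for col in zip(*members)]
        for label, members in grouped.items()
    }


def pearson(a, b):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def classify(path, centers, detect_threshold=0.5):
    """Return (best label, best correlation, flagged as suspicious).

    The threshold-based flag is an assumption made for illustration only.
    """
    scores = {label: pearson(path, center) for label, center in centers.items()}
    best = max(scores, key=scores.get)
    return best, scores[best], scores[best] < detect_threshold


# Toy data: two "layers" of hand-written features per example, two classes.
train = [
    ([[1.0, 0.0], [0.9, 0.1, 0.0]], "cat"),
    ([[0.8, 0.2], [1.0, 0.0, 0.1]], "cat"),
    ([[0.0, 1.0], [0.1, 0.0, 0.9]], "dog"),
    ([[0.2, 0.9], [0.0, 0.2, 1.0]], "dog"),
]
paths = [feature_path(features) for features, _ in train]
labels = [label for _, label in train]
centers = class_centered_paths(paths, labels)

# A query whose layer-wise features resemble the "cat" examples.
query = feature_path([[0.9, 0.1], [0.8, 0.1, 0.1]])
label, score, suspicious = classify(query, centers)
print(label, round(score, 3), suspicious)
```

In a real setting the per-layer vectors would come from a trained network's intermediate activations, and the threshold would need to be calibrated on held-out data; neither step is shown here.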

Related papers

Nearest Neighbor Projection Removal Adversarial Training [5.146355145217634]
We introduce a novel adversarial training framework that actively mitigates inter-class proximity by projecting out inter-class dependencies from adversarial and clean samples. Our approach first identifies the nearest inter-class neighbors for each adversarial sample and subsequently removes projections onto these neighbors to enforce stronger feature separability.
arXiv Detail & Related papers (2025-09-09T12:38:41Z)
Towards Adversarial Robustness via Debiased High-Confidence Logit Alignment [24.577363665112706]
Under inverse adversarial attacks, high-confidence outputs are influenced by biased feature activations. This spurious correlation bias leads to overfitting irrelevant background features during adversarial training. We propose Debiased High-Confidence Adversarial Training (DHAT), a novel approach that aligns adversarial logits with debiased high-confidence logits.
arXiv Detail & Related papers (2024-08-12T11:56:06Z)
Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM [5.592360872268223]
Defense strategies usually train deep neural networks (DNNs) against a specific adversarial attack method and can achieve good robustness against that type of attack. However, when evaluated against unfamiliar attack modalities, empirical evidence reveals a pronounced deterioration in the robustness of DNNs. Most defense methods sacrifice accuracy on clean examples in order to improve the adversarial robustness of DNNs.
arXiv Detail & Related papers (2024-03-18T03:54:01Z)
Boosting Adversarial Robustness From The Perspective of Effective Margin Regularization [58.641705224371876]
The adversarial vulnerability of deep neural networks (DNNs) has been actively investigated in the past several years. This paper investigates the scale-variant property of cross-entropy loss, which is the most commonly used loss function in classification tasks. We show that the proposed effective margin regularization (EMR) learns large effective margins and boosts the adversarial robustness in both standard and adversarial training.
arXiv Detail & Related papers (2022-10-11T03:16:56Z)
Decorrelative Network Architecture for Robust Electrocardiogram Classification [4.808817930937323]
It is not possible to train networks that are accurate in all scenarios. Deep learning methods sample the model parameter space to estimate uncertainty. These parameters are often subject to the same vulnerabilities, which can be exploited by adversarial attacks. We propose a novel ensemble approach based on feature decorrelation and Fourier partitioning for teaching networks diverse complementary features.
arXiv Detail & Related papers (2022-07-19T02:36:36Z)
Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, has proven to be the most effective strategy. We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
Robust Sensible Adversarial Learning of Deep Neural Networks for Image Classification [6.594522185216161]
We introduce sensible adversarial learning and demonstrate the synergistic effect between pursuits of standard natural accuracy and robustness. Specifically, we define a sensible adversary which is useful for learning a robust model while keeping high natural accuracy. We propose a novel and efficient algorithm that trains a robust model using implicit loss truncation.
arXiv Detail & Related papers (2022-05-20T22:57:44Z)
Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce the new concept of an adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance. Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z)
Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks. This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy. Under a minimal computational overhead, a dilation architecture is expected to be friendly with the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z)
To be Robust or to be Fair: Towards Fairness in Adversarial Training [83.42241071662897]
We find that adversarial training algorithms tend to introduce severe disparity of accuracy and robustness between different groups of data. We propose a Fair-Robust-Learning (FRL) framework to mitigate this unfairness problem when doing adversarial defenses.
arXiv Detail & Related papers (2020-10-13T02:21:54Z)
Improving adversarial robustness of deep neural networks by using semantic information [17.887586209038968]
Adversarial training is the main method for improving adversarial robustness and the first line of defense against adversarial attacks. This paper provides a new perspective on the issue of adversarial robustness, one that shifts the focus from the network as a whole to the critical part of the region close to the decision boundary corresponding to a given class. Experimental results on the MNIST and CIFAR-10 datasets show that this approach greatly improves adversarial robustness even using a very small dataset from the training data.
arXiv Detail & Related papers (2020-08-18T10:23:57Z)
Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness [97.67477497115163]
We use mode connectivity to study the adversarial robustness of deep neural networks. Our experiments cover various types of adversarial attacks applied to different network architectures and datasets. Our results suggest that mode connectivity offers a holistic tool and practical means for evaluating and improving adversarial robustness.
arXiv Detail & Related papers (2020-04-30T19:12:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.