Does simple trump complex? Comparing strategies for adversarial robustness in DNNs
- URL: http://arxiv.org/abs/2508.18019v1
- Date: Mon, 25 Aug 2025 13:33:38 GMT
- Title: Does simple trump complex? Comparing strategies for adversarial robustness in DNNs
- Authors: William Brooks, Marelie H. Davel, Coenraad Mouton
- Abstract summary: Deep Neural Networks (DNNs) have shown substantial success in various applications but remain vulnerable to adversarial attacks. This study aims to identify and isolate the components of two different adversarial training techniques that contribute most to increased adversarial robustness.
- Score: 3.6130723421895947
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNNs) have shown substantial success in various applications but remain vulnerable to adversarial attacks. This study aims to identify and isolate the components of two different adversarial training techniques that contribute most to increased adversarial robustness, particularly through the lens of margins in the input space -- the minimal distance between data points and decision boundaries. Specifically, we compare two methods that maximize margins: a simple approach which modifies the loss function to increase an approximation of the margin, and a more complex state-of-the-art method (Dynamics-Aware Robust Training) which builds upon this approach. Using a VGG-16 model as our base, we systematically isolate and evaluate individual components from these methods to determine their relative impact on adversarial robustness. We assess the effect of each component on the model's performance under various adversarial attacks, including AutoAttack and Projected Gradient Descent (PGD). Our analysis on the CIFAR-10 dataset reveals which elements most effectively enhance adversarial robustness, providing insights for designing more robust DNNs.
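The "simple" method the abstract contrasts with DyART modifies the training loss to grow an approximated input-space margin. As a concrete illustration only, here is a minimal PyTorch sketch of that idea, using the common first-order approximation margin ≈ (logit gap) / ||∇x(logit gap)||; the function names, hinge form, and hyperparameters are assumptions, not the authors' implementation.
```python
import torch
import torch.nn.functional as F

def approx_input_margin(model, x, y):
    """First-order margin estimate per sample:
    (true-class logit gap) / (l_2 norm of the gap's input gradient)."""
    x = x.detach().clone().requires_grad_(True)
    logits = model(x)
    true_logit = logits.gather(1, y[:, None]).squeeze(1)
    best_other = logits.scatter(1, y[:, None], float("-inf")).amax(dim=1)
    gap = true_logit - best_other            # > 0 iff correctly classified
    # create_graph keeps the estimate differentiable w.r.t. model weights.
    g = torch.autograd.grad(gap.sum(), x, create_graph=True)[0]
    return gap / g.flatten(1).norm(dim=1).clamp_min(1e-12)

def margin_aware_loss(model, x, y, d_max=0.5, lam=1.0):
    """Cross-entropy plus a hinge that rewards margins up to d_max."""
    ce = F.cross_entropy(model(x), y)
    margin = approx_input_margin(model, x, y)
    return ce + lam * F.relu(d_max - margin).mean()
```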
Related papers
- GradID: Adversarial Detection via Intrinsic Dimensionality of Gradients [0.1019561860229868]
In this paper, we investigate the geometric properties of a model's input loss landscape. We reveal a distinct and consistent difference in the intrinsic dimensionality (ID) of gradients for natural and adversarial data, which forms the basis of our proposed detection method. Our detector significantly surpasses existing methods against a wide array of attacks, including CW and AutoAttack, achieving detection rates consistently above 92% on CIFAR-10.
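For context, detectors in this spirit typically build on the maximum-likelihood estimator of local intrinsic dimensionality (Levina-Bickel / Amsaleg et al.) applied to gradient vectors. A minimal PyTorch sketch of that estimator, assumed rather than taken from the paper:
```python
import torch

def lid_mle(queries, reference, k=20, eps=1e-12):
    """MLE estimate of local intrinsic dimensionality per query.
    `reference` should not contain the queries themselves (r_1 would be 0)."""
    d = torch.cdist(queries, reference)                 # pairwise l_2 distances
    r = d.topk(k, largest=False).values.clamp_min(eps)  # k nearest, ascending
    # LID(x) = -( (1/k) * sum_i log(r_i / r_k) )^(-1)
    return -1.0 / torch.log(r / r[:, -1:]).mean(dim=1).clamp_max(-eps)
```
A detector in this spirit would compute `lid_mle` over per-sample loss gradients and threshold the result.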
arXiv Detail & Related papers (2025-12-14T20:16:03Z)
- Adversarial Training in Low-Label Regimes with Margin-Based Interpolation [8.585017175426023]
Adversarial training has emerged as an effective approach to train robust neural network models that are resistant to adversarial attacks. In this paper, we introduce a novel semi-supervised adversarial training approach that enhances both robustness and natural accuracy.
arXiv Detail & Related papers (2024-11-27T00:35:13Z)
- Ensemble Adversarial Defense via Integration of Multiple Dispersed Low Curvature Models [7.8245455684263545]
In this work, we aim to enhance ensemble diversity by reducing attack transferability.
We identify second-order gradients, which depict the loss curvature, as a key factor in adversarial robustness.
We introduce a novel regularizer that trains multiple, more diverse low-curvature network models.
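One common way to realize such a second-order regularizer (cf. CURE) is a finite-difference curvature proxy: penalize how much the input gradient changes along a small probe step. A hedged PyTorch sketch, not the paper's method:
```python
import torch
import torch.nn.functional as F

def curvature_penalty(model, x, y, h=1e-2):
    """Finite-difference proxy for input-space loss curvature."""
    x1 = x.detach().clone().requires_grad_(True)
    g1 = torch.autograd.grad(F.cross_entropy(model(x1), y), x1,
                             create_graph=True)[0]
    z = g1.detach().sign()                                # probe direction
    x2 = (x.detach() + h * z).requires_grad_(True)
    g2 = torch.autograd.grad(F.cross_entropy(model(x2), y), x2,
                             create_graph=True)[0]
    # A small gradient change along z means locally flat, low-curvature loss.
    return (g2 - g1).flatten(1).norm(dim=1).pow(2).mean()

# Usage (illustrative):
# total = F.cross_entropy(model(x), y) + gamma * curvature_penalty(model, x, y)
```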
arXiv Detail & Related papers (2024-03-25T03:44:36Z)
- Preference Poisoning Attacks on Reward Model Learning [47.00395978031771]
We investigate the nature and extent of a vulnerability in learning reward models from pairwise comparisons.
We propose two classes of algorithmic approaches for these attacks: a gradient-based framework, and several variants of rank-by-distance methods.
We find that the best attacks are often highly successful, achieving, in the most extreme case, a 100% success rate with only 0.3% of the data poisoned.
arXiv Detail & Related papers (2024-02-02T21:45:24Z)
- PAIF: Perception-Aware Infrared-Visible Image Fusion for Attack-Tolerant Semantic Segmentation [50.556961575275345]
We propose a perception-aware fusion framework to promote segmentation robustness in adversarial scenes.
We show that our scheme substantially enhances robustness, with gains of 15.3% mIoU over advanced competitors.
arXiv Detail & Related papers (2023-08-08T01:55:44Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
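For intuition, a KL-divergence-regularized reweighting over the probability simplex has a well-known closed form: maximizing Σᵢ wᵢℓᵢ − τ·KL(w ∥ uniform) yields wᵢ ∝ exp(ℓᵢ/τ), i.e. a softmax over per-example losses. A minimal sketch of that assumed form, not the paper's exact objective:
```python
import torch

def kl_regularized_weights(per_example_losses, tau=1.0):
    """w_i ∝ exp(l_i / tau): upweights hard examples; tau -> inf is uniform."""
    return torch.softmax(per_example_losses.detach() / tau, dim=0)

# Usage (illustrative):
# per_ex = F.cross_entropy(model(x_adv), y, reduction="none")
# loss = (kl_regularized_weights(per_ex, tau=2.0) * per_ex).sum()
```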
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- A Comprehensive Study on Robustness of Image Classification Models: Benchmarking and Rethinking [54.89987482509155]
The robustness of deep neural networks is usually lacking under adversarial examples, common corruptions, and distribution shifts.
We establish a comprehensive robustness benchmark called ARES-Bench on the image classification task.
By designing the training settings accordingly, we achieve new state-of-the-art adversarial robustness.
arXiv Detail & Related papers (2023-02-28T04:26:20Z)
- Differentiable Search of Accurate and Robust Architectures [22.435774101990752]
Deep neural networks (DNNs) are found to be vulnerable to adversarial attacks.
Adversarial training has been drawing increasing attention because of its simplicity and effectiveness.
We propose DSARA to automatically search for the neural architectures that are accurate and robust after adversarial training.
arXiv Detail & Related papers (2022-12-28T08:36:36Z)
- Alternating Objectives Generates Stronger PGD-Based Adversarial Attacks [78.2700757742992]
Projected Gradient Descent (PGD) is one of the most effective and conceptually simple algorithms to generate such adversaries.
We experimentally verify this assertion on a synthetic-data example and by evaluating our proposed method across 25 different $\ell_\infty$-robust models and 3 datasets.
Our strongest adversarial attack outperforms all of the white-box components of the AutoAttack ensemble.
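The core idea, alternating the attack objective across PGD iterations, can be sketched as follows, e.g. switching between cross-entropy and a CW-style margin loss; the schedule and losses here are illustrative assumptions, not the paper's exact recipe:
```python
import torch
import torch.nn.functional as F

def cw_margin(logits, y):
    """CW-style margin: positive once the model is fooled."""
    true_logit = logits.gather(1, y[:, None]).squeeze(1)
    best_other = logits.scatter(1, y[:, None], float("-inf")).amax(dim=1)
    return (best_other - true_logit).mean()

def alternating_pgd(model, x, y, eps=8/255, alpha=2/255, steps=20):
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for t in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        logits = model(x_adv)
        # Alternate the maximized objective every iteration.
        loss = F.cross_entropy(logits, y) if t % 2 == 0 else cw_margin(logits, y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()     # ascend the objective
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv.detach()
```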
arXiv Detail & Related papers (2022-12-15T17:44:31Z)
- Causal Information Bottleneck Boosts Adversarial Robustness of Deep Neural Network [3.819052032134146]
The information bottleneck (IB) method is a feasible defense solution against adversarial attacks in deep learning.
We incorporate causal inference into the IB framework to alleviate this problem.
Our method exhibits considerable robustness against multiple adversarial attacks.
arXiv Detail & Related papers (2022-10-25T12:49:36Z)
- Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experiments on standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z)
- Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce a new concept, the adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
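A hedged sketch of what a feature-separability term could look like, pulling features toward their class means while pushing the means apart; this is an illustration only, not the paper's ATG/ATFS formulation:
```python
import torch
import torch.nn.functional as F

def separability_loss(feats, y, num_classes, margin=1.0):
    """Pull features toward their class mean; push class means apart.
    Assumes every class appears at least once in the batch."""
    feats = F.normalize(feats, dim=1)
    means = torch.stack([feats[y == c].mean(0) for c in range(num_classes)])
    intra = (feats - means[y]).pow(2).sum(1).mean()      # intra-class similarity
    off_diag = torch.cdist(means, means)[
        ~torch.eye(num_classes, dtype=torch.bool)]
    inter = F.relu(margin - off_diag).mean()             # inter-class spread
    return intra + inter
```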
arXiv Detail & Related papers (2022-05-02T04:04:23Z)
- Improving robustness of jet tagging algorithms with adversarial training [56.79800815519762]
We investigate the vulnerability of flavor tagging algorithms via application of adversarial attacks.
We present an adversarial training strategy that mitigates the impact of such simulated attacks.
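The generic adversarial-training recipe such a strategy builds on can be sketched as a single FGSM-perturbed training step; hyperparameters and names are illustrative assumptions:
```python
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, eps=1e-2):
    """One training step on FGSM-perturbed inputs."""
    x = x.detach().requires_grad_(True)
    grad = torch.autograd.grad(F.cross_entropy(model(x), y), x)[0]
    x_adv = (x + eps * grad.sign()).detach()             # FGSM perturbation
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(x_adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```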
arXiv Detail & Related papers (2022-03-25T19:57:19Z)