Improving Adversarial Training using Vulnerability-Aware Perturbation Budget
- URL: http://arxiv.org/abs/2403.04070v1
- Date: Wed, 6 Mar 2024 21:50:52 GMT
- Title: Improving Adversarial Training using Vulnerability-Aware Perturbation Budget
- Authors: Olukorede Fakorede, Modeste Atsague, Jin Tian
- Abstract summary: Adversarial Training (AT) effectively improves the robustness of Deep Neural Networks (DNNs) to adversarial attacks.
We propose two simple, computationally cheap vulnerability-aware reweighting functions for assigning perturbation bounds to adversarial examples used for AT.
Experimental results show that the proposed methods yield genuine improvements in the robustness of AT algorithms against various adversarial attacks.
- Score: 7.430861908931903
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adversarial Training (AT) effectively improves the robustness of Deep Neural
Networks (DNNs) to adversarial attacks. Generally, AT involves training DNN
models with adversarial examples obtained within a pre-defined, fixed
perturbation bound. Notably, individual natural examples from which these
adversarial examples are crafted exhibit varying degrees of intrinsic
vulnerabilities, and as such, crafting adversarial examples with a fixed
perturbation radius for all instances may not sufficiently unleash the potency
of AT. Motivated by this observation, we propose two simple, computationally
cheap vulnerability-aware reweighting functions for assigning perturbation
bounds to adversarial examples used for AT, named Margin-Weighted Perturbation
Budget (MWPB) and Standard-Deviation-Weighted Perturbation Budget (SDWPB). The
proposed methods assign perturbation radii to individual adversarial samples
based on the vulnerability of their corresponding natural examples.
Experimental results show that the proposed methods yield genuine improvements
in the robustness of AT algorithms against various adversarial attacks.
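A minimal sketch of how such vulnerability-aware budgets could be assigned is given below. The margin and standard-deviation statistics follow the method names, but the sigmoid weighting, base radius, and scaling factor are illustrative assumptions, not the paper's exact reweighting functions.

```python
import torch

def mwpb_radii(model, x, y, base_eps=8/255, beta=1.0):
    """Margin-weighted budgets: examples with smaller margins (more
    vulnerable) receive larger radii. The sigmoid weighting is an
    assumed form, not the paper's exact function."""
    with torch.no_grad():
        logits = model(x)
        true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
        other_max = logits.scatter(1, y.unsqueeze(1), -float("inf")).amax(dim=1)
        margin = true_logit - other_max          # negative if misclassified
    return base_eps * (1 + beta * torch.sigmoid(-margin))

def sdwpb_radii(model, x, base_eps=8/255, beta=1.0):
    """Standard-deviation-weighted budgets: a lower spread of the logits
    is read here as higher vulnerability (an assumed interpretation)."""
    with torch.no_grad():
        std = model(x).std(dim=1)
    return base_eps * (1 + beta * torch.sigmoid(-std))
```

Each per-example radius would then replace the fixed epsilon of the inner maximization (e.g., the PGD projection step) for its corresponding example.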
Related papers
- Reversible Jump Attack to Textual Classifiers with Modification Reduction [8.247761405798874]
Reversible Jump Attack (RJA) and Metropolis-Hastings Modification Reduction (MMR) are proposed.
RJA-MMR outperforms current state-of-the-art methods in attack performance, imperceptibility, fluency, and grammatical correctness.
arXiv Detail & Related papers (2024-03-21T04:54:31Z)
- Latent Feature Relation Consistency for Adversarial Robustness [80.24334635105829]
Misclassification occurs when deep neural networks predict adversarial examples, which add human-imperceptible adversarial noise to natural examples.
We propose Latent Feature Relation Consistency (LFRC).
LFRC constrains the relation of adversarial examples in latent space to be consistent with the natural examples.
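A minimal sketch of what such a relation-consistency penalty might look like; the pairwise-cosine-similarity formulation and the MSE discrepancy below are assumptions rather than the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def relation_matrix(z: torch.Tensor) -> torch.Tensor:
    """Pairwise cosine similarities between latent features in a batch."""
    z = F.normalize(z, dim=1)
    return z @ z.t()

def lfrc_penalty(z_nat: torch.Tensor, z_adv: torch.Tensor) -> torch.Tensor:
    """Penalize mismatch between the adversarial and natural relation
    graphs (an assumed instantiation of relation consistency)."""
    return F.mse_loss(relation_matrix(z_adv), relation_matrix(z_nat).detach())
```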
arXiv Detail & Related papers (2023-03-29T13:50:01Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard approach to adversarial robustness assumes a framework that defends against samples crafted by minimally perturbing natural examples.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves both invariant and sensitivity defense.
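As a rough illustration of casting a regularizer as an optimal transport problem, the sketch below computes an entropic (Sinkhorn) OT cost between clean and perturbed feature batches; this particular formulation is an assumption for illustration, not the paper's construction.

```python
import torch

def sinkhorn_cost(x, y, eps=0.1, iters=50):
    """Entropic OT cost between two uniformly weighted feature batches."""
    cost = torch.cdist(x, y, p=2) ** 2                  # pairwise sq. distances
    kernel = torch.exp(-cost / eps)                     # Gibbs kernel
    a = torch.full((x.shape[0],), 1.0 / x.shape[0], device=x.device)
    b = torch.full((y.shape[0],), 1.0 / y.shape[0], device=y.device)
    u = torch.ones_like(a)
    for _ in range(iters):                              # Sinkhorn iterations
        v = b / (kernel.t() @ u)
        u = a / (kernel @ v)
    plan = u.unsqueeze(1) * kernel * v.unsqueeze(0)     # transport plan
    return (plan * cost).sum()
```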
arXiv Detail & Related papers (2022-11-04T13:54:02Z) - Effective Targeted Attacks for Adversarial Self-Supervised Learning [58.14233572578723]
Unsupervised adversarial training (AT) has been highlighted as a means of achieving robustness in models without any label information.
We propose a novel positive mining for targeted adversarial attack to generate effective adversaries for adversarial SSL frameworks.
Our method demonstrates significant enhancements in robustness when applied to non-contrastive SSL frameworks, and smaller but consistent robustness improvements with contrastive SSL frameworks.
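A hedged sketch of how a targeted attack on an SSL encoder might use a mined positive as its target; the mining rule, the cosine objective, and the PGD parameters below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def mine_targets(encoder, x):
    """For each anchor, take the embedding of the most similar *other*
    batch sample as the attack target (an assumed mining rule)."""
    with torch.no_grad():
        z = F.normalize(encoder(x), dim=1)
        sim = z @ z.t()
        sim.fill_diagonal_(-float("inf"))        # exclude the anchor itself
        return z[sim.argmax(dim=1)]

def targeted_pgd(encoder, x, eps=8/255, alpha=2/255, steps=10):
    """Perturb x so its embedding moves toward the mined target."""
    targets = mine_targets(encoder, x)
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        z = F.normalize(encoder(x_adv), dim=1)
        loss = -(z * targets).sum(dim=1).mean()  # maximize cosine similarity
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv - alpha * grad.sign()).detach()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv
```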
arXiv Detail & Related papers (2022-10-19T11:43:39Z)
- Improved and Interpretable Defense to Transferred Adversarial Examples by Jacobian Norm with Selective Input Gradient Regularization [31.516568778193157]
Adversarial training (AT) is often adopted to improve the robustness of deep neural networks (DNNs).
In this work, we propose an approach based on Jacobian norm and Selective Input Gradient Regularization (J-SIGR).
Experiments demonstrate that the proposed J-SIGR confers improved robustness against transferred adversarial attacks, and we also show that the predictions from the neural network are easy to interpret.
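One plausible reading of combining a Jacobian-norm penalty with selective input-gradient regularization is sketched below; using the loss gradient as a Jacobian proxy and the top-k selection rule are assumptions, not necessarily the paper's formulation.

```python
import torch
import torch.nn.functional as F

def jsigr_loss(model, x, y, lam=0.01, k_frac=0.1):
    """Cross-entropy plus a penalty on the largest input-gradient entries.
    The input gradient of the loss is a cheap proxy for the Jacobian norm;
    keeping only the top-k entries is an assumed 'selective' rule."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(ce, x, create_graph=True)[0]
    flat = grad.flatten(1).abs()
    k = max(1, int(k_frac * flat.shape[1]))
    topk = flat.topk(k, dim=1).values            # most sensitive input dims
    return ce + lam * (topk ** 2).sum(dim=1).mean()
```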
arXiv Detail & Related papers (2022-07-09T01:06:41Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training has proven to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER).
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- A Unified Wasserstein Distributional Robustness Framework for Adversarial Training [24.411703133156394]
This paper presents a unified framework that connects Wasserstein distributional robustness with current state-of-the-art AT methods.
We introduce a new Wasserstein cost function and a new series of risk functions, with which we show that standard AT methods are special cases of their counterparts in our framework.
This connection leads to an intuitive relaxation and generalization of existing AT methods and facilitates the development of a new family of distributional robustness AT-based algorithms.
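For context, the generic Wasserstein distributionally robust objective that such frameworks build on is standard; the paper's specific cost and risk functions are not reproduced here.

```latex
\min_{\theta} \; \sup_{\mathbb{Q} \,:\, \mathcal{W}_c(\mathbb{Q}, \mathbb{P}) \le \epsilon} \;
\mathbb{E}_{(x, y) \sim \mathbb{Q}} \big[ \ell(f_\theta(x), y) \big]
```

Here \(\mathbb{P}\) is the empirical data distribution, \(\mathcal{W}_c\) is the Wasserstein distance under ground cost \(c\), and standard AT methods correspond to particular choices of \(c\) and the risk \(\ell\).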
arXiv Detail & Related papers (2022-02-27T19:40:29Z)
- Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise.
Pre-processing methods may suffer from the robustness degradation effect.
A potential cause of this negative effect is that the adversarial training examples are static and independent of the pre-processing model.
We propose a method called Joint Adversarial Training based Pre-processing (JATP) defense.
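A rough sketch of jointly adversarially training a pre-processor in front of a frozen classifier; the composition, the PGD attack, and the update scheme below are assumptions for illustration, not the paper's exact JATP procedure.

```python
import torch
import torch.nn.functional as F

def jatp_step(preproc, classifier, optimizer, x, y,
              eps=8/255, alpha=2/255, steps=5):
    """One step: craft adversarial examples against the composed pipeline
    classifier(preproc(x)), then update only the pre-processor (the
    optimizer is assumed to hold preproc's parameters)."""
    x_adv = x.clone().detach()
    for _ in range(steps):                       # PGD on the composed model
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(classifier(preproc(x_adv)), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    optimizer.zero_grad()
    F.cross_entropy(classifier(preproc(x_adv)), y).backward()
    optimizer.step()
```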
arXiv Detail & Related papers (2021-06-10T01:45:32Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, which can lead to unreliable robustness against other, unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
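As a hedged illustration of training against a perturbation distribution rather than a single perturbation, one can parameterize per-example noise stochastically and minimize an expected loss; the Gaussian reparameterization and tanh bounding below are assumptions, not ADT's exact parameterization.

```python
import torch
import torch.nn.functional as F

def adt_loss(model, x, y, mu, log_sigma, eps=8/255, samples=4):
    """Monte Carlo estimate of the expected loss under a learned
    per-example perturbation distribution. mu and log_sigma (same shape
    as x) would be produced by an inner maximization; this Gaussian form
    is an assumption."""
    total = 0.0
    for _ in range(samples):
        noise = mu + log_sigma.exp() * torch.randn_like(x)
        delta = eps * torch.tanh(noise)          # bound perturbation to [-eps, eps]
        total = total + F.cross_entropy(model((x + delta).clamp(0, 1)), y)
    return total / samples
```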
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the accuracy of this information and is not responsible for any consequences arising from its use.