Adaptive Smoothness-weighted Adversarial Training for Multiple
Perturbations with Its Stability Analysis
- URL: http://arxiv.org/abs/2210.00557v1
- Date: Sun, 2 Oct 2022 15:42:34 GMT
- Title: Adaptive Smoothness-weighted Adversarial Training for Multiple
Perturbations with Its Stability Analysis
- Authors: Jiancong Xiao, Zeyu Qin, Yanbo Fan, Baoyuan Wu, Jue Wang, Zhi-Quan Luo
- Abstract summary: Adversarial Training (AT) has been demonstrated as one of the most effective methods against adversarial examples.
Adversarial training for multiple perturbations (ATMP) is proposed to generalize the adversarial robustness over different perturbation types.
We develop the stability-based excess risk bounds and propose adaptive-weighted adversarial training for multiple perturbations.
- Score: 39.90487314421744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial Training (AT) has been demonstrated as one of the most effective
methods against adversarial examples. While most existing works focus on AT
with a single type of perturbation (e.g., the $\ell_\infty$ attack), DNNs are
facing threats from different types of adversarial examples. Therefore,
adversarial training for multiple perturbations (ATMP) is proposed to
generalize the adversarial robustness over different perturbation types (e.g.,
$\ell_1$, $\ell_2$, and $\ell_\infty$ norm-bounded perturbations). However, the
resulting model exhibits a trade-off between different attacks. Meanwhile, there
is no theoretical analysis of ATMP, limiting its further development. In this
paper, we first provide the smoothness analysis of ATMP and show that $\ell_1$,
$\ell_2$, and $\ell_\infty$ adversaries give different contributions to the
smoothness of the loss function of ATMP. Based on this, we develop the
stability-based excess risk bounds and propose adaptive smoothness-weighted
adversarial training for multiple perturbations. Theoretically, our algorithm
yields better bounds. Empirically, our experiments on CIFAR10 and CIFAR100
achieve state-of-the-art performance against mixtures of multiple
perturbation attacks.
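To make the training recipe concrete, below is a minimal PyTorch sketch of one adaptive-weighted ATMP step. Everything in it is an illustrative assumption rather than the paper's exact algorithm: the `pgd_attack` helper, the perturbation radii, and especially the softmax-over-losses weighting rule, which stands in for the smoothness-based weights derived in the paper.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, norm, eps, steps=10):
    """Simplified PGD inside an l1/l2/linf ball of radius eps.
    Assumes NCHW inputs in [0, 1]. For l1/l2 the perturbation is
    rescaled into the ball, a shortcut rather than an exact projection."""
    delta = torch.zeros_like(x, requires_grad=True)
    alpha = 2.5 * eps / steps
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        with torch.no_grad():
            if norm == "linf":
                delta += alpha * grad.sign()
                delta.clamp_(-eps, eps)
            else:
                p = 1 if norm == "l1" else 2
                g = grad.flatten(1).norm(p=p, dim=1).clamp_min(1e-12)
                delta += alpha * grad / g.view(-1, 1, 1, 1)
                d = delta.flatten(1).norm(p=p, dim=1).clamp_min(1e-12)
                delta *= (eps / d).clamp(max=1.0).view(-1, 1, 1, 1)
            delta.clamp_(-x, 1 - x)  # keep x + delta inside [0, 1]
    return (x + delta).detach()

def atmp_step(model, optimizer, x, y, temperature=1.0,
              radii=(("linf", 8 / 255), ("l2", 0.5), ("l1", 12.0))):
    """One adaptive-weighted training step over three perturbation types."""
    model.train()
    losses = torch.stack([
        F.cross_entropy(model(pgd_attack(model, x, y, norm, eps)), y)
        for norm, eps in radii
    ])
    # Upweight the currently harder perturbation type; the paper instead
    # derives the weights from the smoothness of each adversarial loss.
    weights = torch.softmax(losses.detach() / temperature, dim=0)
    loss = (weights * losses).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

With uniform weights this reduces to the plain average-case ATMP baseline; the adaptive weights are what counteract the trade-off between attacks noted in the abstract.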
Related papers
- Deep Adversarial Defense Against Multilevel-Lp Attacks [5.604868766260297]
This paper introduces a computationally efficient multilevel $\ell_p$ defense, called the Efficient Robust Mode Connectivity (EMRC) method.
Similar to analytical continuation approaches used in continuous optimization, the method blends two $p$-specific adversarially optimal models.
We present experiments demonstrating that our approach performs better on various attacks compared to AT-$\ell_\infty$, E-AT, and MSD.
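As a rough picture of what blending two $p$-specific models can mean in weight space, here is a hedged PyTorch sketch. EMRC itself learns a low-loss connecting curve between the models; the plain linear interpolation below is a simplifying assumption.

```python
import copy
import torch

def blend_models(model_a, model_b, t):
    """Return a model with parameters (1 - t) * theta_a + t * theta_b.
    Note: BatchNorm running statistics would also need blending or
    recalibration in practice; only parameters are handled here."""
    blended = copy.deepcopy(model_a)
    with torch.no_grad():
        for p_out, p_a, p_b in zip(blended.parameters(),
                                   model_a.parameters(),
                                   model_b.parameters()):
            p_out.copy_((1 - t) * p_a + t * p_b)
    return blended
```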
arXiv Detail & Related papers (2024-07-12T13:30:00Z)
- $σ$-zero: Gradient-based Optimization of $\ell_0$-norm Adversarial Examples [14.17412770504598]
We show that $\ell_0$-norm constraints can be used to craft input perturbations.
We propose a novel $\ell_0$-norm attack called $\sigma$-zero.
It outperforms all competing adversarial attacks in terms of success rate, perturbation size, and efficiency.
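A gradient-based sparse attack in this spirit can be sketched as optimizing a dense perturbation under a smooth surrogate of the $\ell_0$ norm. The surrogate, optimizer, and hyperparameters below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def smooth_l0(delta, sigma=1e-3):
    # Differentiable surrogate that approaches the true l0 count
    # (number of nonzero entries) as sigma goes to 0.
    d2 = delta.pow(2)
    return (d2 / (d2 + sigma)).flatten(1).sum(dim=1)

def sparse_attack(model, x, y, steps=100, lr=0.1, lam=0.05):
    """Maximize misclassification while penalizing perturbation density."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        loss = (-F.cross_entropy(model(x + delta), y)
                + lam * smooth_l0(delta).mean())
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-x, 1 - x)  # keep x + delta a valid image
    return (x + delta).detach()
```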
arXiv Detail & Related papers (2024-02-02T20:08:11Z)
- Parameter-Saving Adversarial Training: Reinforcing Multi-Perturbation Robustness via Hypernetworks [47.21491911505409]
Adversarial training serves as one of the most popular and effective methods to defend against adversarial perturbations.
We propose a novel multi-perturbation adversarial training framework, parameter-saving adversarial training (PSAT), to reinforce multi-perturbation robustness.
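The parameter-saving idea can be illustrated with a toy hypernetwork that emits the weights of a shared classifier head from a perturbation-type embedding, so adding a perturbation type costs one embedding vector instead of a full model copy. Dimensions and architecture here are assumptions for illustration, not PSAT's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperHead(nn.Module):
    """A small hypernetwork emitting a linear head per perturbation type."""
    def __init__(self, n_types=3, feat_dim=512, n_classes=10, emb_dim=32):
        super().__init__()
        self.emb = nn.Embedding(n_types, emb_dim)
        self.hyper = nn.Linear(emb_dim, feat_dim * n_classes + n_classes)
        self.feat_dim, self.n_classes = feat_dim, n_classes

    def forward(self, features, type_id):
        theta = self.hyper(self.emb(type_id))  # flat weights + bias
        W = theta[: self.feat_dim * self.n_classes]
        b = theta[self.feat_dim * self.n_classes :]
        return F.linear(features, W.view(self.n_classes, self.feat_dim), b)

# Usage: one shared hypernetwork serves all perturbation types.
head = HyperHead()
logits = head(torch.randn(4, 512), torch.tensor(1))  # head for type 1
```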
arXiv Detail & Related papers (2023-09-28T07:16:02Z)
- A Practical Upper Bound for the Worst-Case Attribution Deviations [21.341303776931532]
Model attribution is a critical component of deep neural networks (DNNs), as it provides interpretability for complex models.
Recent studies draw attention to the security of attribution methods, as they are vulnerable to attribution attacks that generate similar images with dramatically different attributions.
Existing works have empirically investigated improving the robustness of DNNs against such attacks; however, none of them explicitly quantifies the actual deviation of attributions.
In this work, for the first time, a constrained optimization problem is formulated to derive an upper bound that measures the largest dissimilarity of attributions after the samples are perturbed by any noise within a certain region.
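Schematically, the quantity being bounded can be written as the following constrained problem; the attribution map $g$, the dissimilarity $D$, and the norm ball are generic placeholders, since the summary does not fix them:
$$\max_{\|\delta\| \le \epsilon} D\big(g(x),\, g(x + \delta)\big) \;\le\; B(x, \epsilon),$$
where $B(x, \epsilon)$ denotes the derived upper bound on the worst-case attribution deviation.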
arXiv Detail & Related papers (2023-03-01T09:07:27Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features under arbitrary attacking strengths.
Our method is trained to automatically align features across arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Towards Defending Multiple $\ell_p$-norm Bounded Adversarial Perturbations via Gated Batch Normalization [120.99395850108422]
Existing adversarial defenses typically improve model robustness against individual specific perturbations.
Some recent methods improve model robustness against adversarial attacks in multiple $\ell_p$ balls, but their performance against each perturbation type is still far from satisfactory.
We propose Gated Batch Normalization (GBN) to adversarially train a perturbation-invariant predictor for defending against multiple $\ell_p$-bounded adversarial perturbations.
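The gating idea can be sketched as one BatchNorm branch per perturbation domain, with batches routed to the branch matching their perturbation type. The hard, explicitly indexed gate below is a simplification of GBN, which must also infer the domain at test time.

```python
import torch
import torch.nn as nn

class GatedBatchNorm2d(nn.Module):
    """One BN branch per perturbation domain (e.g., clean/l1/l2/linf)."""
    def __init__(self, num_features, n_domains=4):
        super().__init__()
        self.bns = nn.ModuleList(
            nn.BatchNorm2d(num_features) for _ in range(n_domains))

    def forward(self, x, domain):
        # Route the batch through the BN matching its domain, so each
        # branch tracks the statistics of one perturbation type.
        return self.bns[domain](x)
```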
arXiv Detail & Related papers (2020-12-03T02:26:01Z)
- Learning to Generate Noise for Multi-Attack Robustness [126.23656251512762]
Adversarial learning has emerged as one of the successful techniques to circumvent the susceptibility of existing methods against adversarial perturbations.
In safety-critical applications, defending against a single attack type is insufficient, as the attacker can adopt diverse adversaries to deceive the system.
We propose a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks.
arXiv Detail & Related papers (2020-06-22T10:44:05Z)
- Toward Adversarial Robustness via Semi-supervised Robust Training [93.36310070269643]
Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs).
We propose a novel defense method, robust training (RT), which jointly minimizes two separated risks ($R_{stand}$ and $R_{rob}$).
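One way to read the joint objective is as a weighted sum of a standard risk on clean inputs and a robustness risk on their neighborhoods. The sketch below uses a KL consistency term for $R_{rob}$ as an illustrative choice; the paper's exact risk definitions may differ.

```python
import torch.nn.functional as F

def rt_loss(model, x, x_adv, y, beta=1.0):
    logits = model(x)
    r_stand = F.cross_entropy(logits, y)  # risk on the benign example
    # Robustness risk: keep predictions in the neighborhood close to
    # the clean prediction (an illustrative stand-in for R_rob).
    r_rob = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                     F.softmax(logits, dim=1).detach(),
                     reduction="batchmean")
    return r_stand + beta * r_rob
```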
arXiv Detail & Related papers (2020-03-16T02:14:08Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
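The distributional idea can be sketched with a per-input Gaussian over perturbations, optimized with the reparameterization trick plus an entropy-style bonus so it does not collapse to a single point. This parameterization and the hyperparameters are illustrative assumptions, not ADT's exact construction.

```python
import torch
import torch.nn.functional as F

def adt_inner(model, x, y, eps=8 / 255, steps=7, lr=0.1, lam=0.01):
    """Fit per-input distribution parameters (mu, sigma) over perturbations."""
    mu = torch.zeros_like(x, requires_grad=True)
    log_sigma = torch.full_like(x, -3.0).requires_grad_()
    opt = torch.optim.Adam([mu, log_sigma], lr=lr)
    for _ in range(steps):
        noise = torch.randn_like(x)  # reparameterization trick
        delta = eps * torch.tanh(mu + noise * log_sigma.exp())
        loss = F.cross_entropy(model(x + delta), y)
        # Ascend the expected adversarial loss; the log-sigma bonus acts
        # as an entropy term keeping the distribution spread out.
        (-loss - lam * log_sigma.mean()).backward()
        opt.step()
        opt.zero_grad()
    with torch.no_grad():
        return (x + eps * torch.tanh(mu)).detach()  # mode of the distribution
```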
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.