Parameter-Saving Adversarial Training: Reinforcing Multi-Perturbation
Robustness via Hypernetworks
- URL: http://arxiv.org/abs/2309.16207v1
- Date: Thu, 28 Sep 2023 07:16:02 GMT
- Title: Parameter-Saving Adversarial Training: Reinforcing Multi-Perturbation
Robustness via Hypernetworks
- Authors: Huihui Gong, Minjing Dong, Siqi Ma, Seyit Camtepe, Surya Nepal, Chang
Xu
- Abstract summary: Adversarial training serves as one of the most popular and effective methods to defend against adversarial perturbations.
We propose a novel multi-perturbation adversarial training framework, parameter-saving adversarial training (PSAT), to reinforce multi-perturbation robustness.
- Score: 47.21491911505409
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training serves as one of the most popular and effective methods
to defend against adversarial perturbations. However, most defense mechanisms
only consider a single type of perturbation while various attack methods might
be adopted to perform stronger adversarial attacks against the deployed model
in real-world scenarios, e.g., $\ell_2$ or $\ell_\infty$. Defending against
various attacks can be a challenging problem since multi-perturbation
adversarial training and its variants only achieve suboptimal robustness
trade-offs, due to the theoretical limit to multi-perturbation robustness for a
single model. Besides, it is impractical to deploy large models in some
storage-efficient scenarios. To address these drawbacks, in this paper we
propose a novel multi-perturbation adversarial training framework,
parameter-saving adversarial training (PSAT), to reinforce multi-perturbation
robustness with an advantageous side effect of saving parameters, which
leverages hypernetworks to train specialized models against a single
perturbation and aggregate these specialized models to defend against multiple
perturbations. Eventually, we extensively evaluate and compare our proposed
method with state-of-the-art single/multi-perturbation robust methods against
various latest attack methods on different datasets, showing the robustness
superiority and parameter efficiency of our proposed method, e.g., for the
CIFAR-10 dataset with ResNet-50 as the backbone, PSAT saves approximately 80\%
of parameters while achieving state-of-the-art robustness trade-off accuracy.
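The abstract describes PSAT only at a high level; as a rough illustration, the PyTorch sketch below shows one plausible reading: a shared hypernetwork emits the parameters of a per-perturbation classification head, and inference aggregates the specialists. The class name, layer sizes, and mean-logit aggregation are our assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HyperHeads(nn.Module):
    """Sketch: a shared hypernetwork generates the weights of a linear
    head specialized to one perturbation type (e.g. l2 vs. l_inf)."""

    def __init__(self, feat_dim=512, n_classes=10, n_perturbs=2, z_dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_perturbs, z_dim)   # one code per attack type
        self.hyper = nn.Sequential(                    # code -> flattened (W, b)
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim * n_classes + n_classes),
        )
        self.feat_dim, self.n_classes = feat_dim, n_classes

    def head(self, perturb_id):
        # Generate the specialized head parameters on the fly;
        # perturb_id is a scalar LongTensor indexing the attack type.
        params = self.hyper(self.embed(perturb_id))
        split = self.feat_dim * self.n_classes
        W = params[:split].view(self.n_classes, self.feat_dim)
        return W, params[split:]

    def forward(self, feats, perturb_id=None):
        if perturb_id is not None:                     # training: one specialist
            return F.linear(feats, *self.head(perturb_id))
        ids = [torch.tensor(i, device=feats.device)    # inference: all specialists
               for i in range(self.embed.num_embeddings)]
        return torch.stack([F.linear(feats, *self.head(i)) for i in ids]).mean(0)
```

The parameter saving would then come from sharing one hypernetwork trunk across perturbation types rather than storing several full robust models; during training, each minibatch would be attacked with one perturbation type and routed through the corresponding generated head.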
Related papers
- Hyper Adversarial Tuning for Boosting Adversarial Robustness of Pretrained Large Vision Models [9.762046320216005]
Large vision models have been found vulnerable to adversarial examples, emphasizing the need for enhancing their adversarial robustness.
Recent approaches propose robust fine-tuning methods, such as adversarial tuning of low-rank adaptation (LoRA) in large vision models, but they still struggle to match the accuracy of full parameter adversarial fine-tuning.
We propose hyper adversarial tuning (HyperAT), which leverages shared defensive knowledge among different methods to improve model robustness both efficiently and effectively.
arXiv Detail & Related papers (2024-10-08T12:05:01Z) - Enhancing Targeted Attack Transferability via Diversified Weight Pruning [0.3222802562733786]
Malicious attackers can generate targeted adversarial examples by imposing human-imperceptible noise on images.
With cross-model transferable adversarial examples, the vulnerability of neural networks remains even if the model information is kept secret from the attacker.
Recent studies have shown the effectiveness of ensemble-based methods in generating transferable adversarial examples.
arXiv Detail & Related papers (2022-08-18T07:25:48Z) - Self-Ensemble Adversarial Training for Improved Robustness [14.244311026737666]
Among all kinds of defense methods, adversarial training is the strongest strategy against various adversarial attacks.
Recent works mainly focus on developing new loss functions or regularizers, attempting to find the unique optimal point in the weight space.
We devise a simple but powerful Self-Ensemble Adversarial Training (SEAT) method that yields a robust classifier by averaging the weights of history models.
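SEAT's precise averaging schedule is in the paper; as a generic sketch of averaging the weights of history models, an exponential moving average over checkpoints (assuming PyTorch, with an illustrative decay value) looks like this:

```python
import copy
import torch

@torch.no_grad()
def update_ema(ema_model, model, decay=0.999):
    """ema <- decay * ema + (1 - decay) * current weights."""
    for p_ema, p in zip(ema_model.parameters(), model.parameters()):
        p_ema.mul_(decay).add_(p, alpha=1.0 - decay)
    for b_ema, b in zip(ema_model.buffers(), model.buffers()):
        b_ema.copy_(b)  # keep e.g. BatchNorm statistics current

# Usage: ema_model = copy.deepcopy(model); call update_ema after each
# optimizer step during adversarial training, then deploy ema_model.
```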
arXiv Detail & Related papers (2022-03-18T01:12:18Z) - Multi-stage Optimization based Adversarial Training [16.295921205749934]
We propose a Multi-stage Optimization based Adversarial Training (MOAT) method that periodically trains the model on a mix of benign and single-/multi-step adversarial examples, stage by stage.
Under a similar training overhead, the proposed MOAT exhibits better robustness than either single-step or multi-step adversarial training methods.
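As a toy picture of such stage-wise mixing (the stage boundaries and the attack_fgsm/attack_pgd helpers below are hypothetical, not MOAT's actual schedule):

```python
def craft_batch(model, x, y, stage, attack_fgsm, attack_pgd):
    """Stage-wise example selection: benign early, single-step in the
    middle, multi-step late (an illustrative schedule)."""
    if stage == 0:
        return x                         # benign examples only
    if stage == 1:
        return attack_fgsm(model, x, y)  # cheap single-step adversaries
    return attack_pgd(model, x, y)       # stronger multi-step adversaries
```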
arXiv Detail & Related papers (2021-06-26T07:59:52Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA) to generate features of arbitrary attack strengths.
Our method is trained to automatically align features across arbitrary attack strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - "What's in the box?!": Deflecting Adversarial Attacks by Randomly
Deploying Adversarially-Disjoint Models [71.91835408379602]
Adversarial examples have long been considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
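As a concrete picture of the deployment idea, the sketch below answers each query with a randomly chosen member of a pool of mutually adversarially-disjoint models, so an example crafted against one member is unlikely to transfer to the one actually served; the class name and uniform selection rule are illustrative assumptions, not the paper's exact scheme.

```python
import random

class RandomizedDeployment:
    """Serve each query with a random model from a disjoint pool."""

    def __init__(self, models):
        self.models = models  # models trained to be adversarially-disjoint

    def predict(self, x):
        return random.choice(self.models)(x)
```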
arXiv Detail & Related papers (2021-02-09T20:07:13Z) - Self-Progressing Robust Training [146.8337017922058]
Current robust training methods such as adversarial training explicitly use an "attack" to generate adversarial examples.
We propose a new framework called SPROUT, self-progressing robust training.
Our results shed new light on scalable, effective and attack-independent robust training methods.
arXiv Detail & Related papers (2020-12-22T00:45:24Z) - Learning to Generate Noise for Multi-Attack Robustness [126.23656251512762]
Adversarial learning has emerged as one of the most successful techniques for reducing the susceptibility of models to adversarial perturbations.
However, such defenses typically target a single type of attack; in safety-critical applications this leaves them ineffective, as the attacker can adopt diverse adversaries to deceive the system.
We propose a novel meta-learning framework that explicitly learns to generate noise to improve the model's robustness against multiple types of attacks.
arXiv Detail & Related papers (2020-06-22T10:44:05Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
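Read from the abstract, ADT replaces the usual pointwise worst-case inner problem with an optimization over a distribution of perturbations; a sketch of such an objective (our reconstruction, with an entropy regularizer $H$ that we believe the paper uses to keep the learned distribution diverse) is

$$\min_{\theta}\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\Big[\max_{p(\delta)}\; \mathbb{E}_{\delta\sim p(\delta)}\big[\mathcal{L}\big(f_\theta(x+\delta),\,y\big)\big] + \lambda\, H\big(p(\delta)\big)\Big],$$

where $p(\delta)$ ranges over admissible perturbation distributions and $\lambda$ trades off worst-case loss against diversity; the exact constraint set and regularizer should be checked against the paper.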
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.