Finding Dynamics Preserving Adversarial Winning Tickets
- URL: http://arxiv.org/abs/2202.06488v2
- Date: Wed, 16 Feb 2022 16:35:21 GMT
- Title: Finding Dynamics Preserving Adversarial Winning Tickets
- Authors: Xupeng Shi, Pengfei Zheng, A. Adam Ding, Yuan Gao, Weizhong Zhang
- Abstract summary: Pruning methods have been considered in the adversarial context to reduce model capacity and improve adversarial robustness simultaneously during training.
Existing adversarial pruning methods generally mimic the classical pruning methods for natural training, which follow the three-stage 'training-pruning-fine-tuning' pipeline.
We show empirical evidence that AWT preserves the dynamics of adversarial training and achieves performance equal to that of dense adversarial training.
- Score: 11.05616199881368
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern deep neural networks (DNNs) are vulnerable to adversarial attacks, and
adversarial training has been shown to be a promising method for improving the
adversarial robustness of DNNs. Pruning methods have been considered in the
adversarial context to reduce model capacity and improve adversarial robustness
simultaneously during training. Existing adversarial pruning methods generally
mimic the classical pruning methods for natural training, which follow the
three-stage 'training-pruning-fine-tuning' pipeline. We observe that such
pruning methods do not necessarily preserve the dynamics of dense networks,
making the pruned networks potentially hard to fine-tune to compensate for the
accuracy degradation caused by pruning. Based on recent work on the
\textit{Neural Tangent Kernel} (NTK), we systematically study the dynamics of
adversarial training and prove the existence of a trainable sparse sub-network
at initialization which can be trained to be adversarially robust from scratch.
This theoretically verifies the \textit{lottery ticket hypothesis} in the
adversarial context, and we refer to such a sub-network structure as an
\textit{Adversarial Winning Ticket} (AWT). We also show empirical evidence that
AWT preserves the dynamics of adversarial training and achieves performance
equal to that of dense adversarial training.
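For reference, the dense baseline the abstract compares against is standard PGD-based adversarial training (Madry et al.). Below is a minimal PyTorch sketch of that baseline, not of the AWT construction itself; the `eps`/`alpha`/`steps` defaults are common CIFAR-style choices, and the clamps assume inputs scaled to [0, 1].

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD attack with random start (Madry et al. style)."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Ascend the loss, then project back into the eps-ball around x.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv

def adversarial_training_step(model, optimizer, x, y):
    """One step of standard adversarial training: attack, then descend."""
    model.eval()                      # freeze BN statistics during the attack
    x_adv = pgd_attack(model, x, y)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```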
Related papers
- Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data [38.44734564565478]
We provide a theoretical understanding of adversarial examples and adversarial training algorithms from the perspective of feature learning theory.
We show that the adversarial training method can provably strengthen the robust feature learning and suppress the non-robust feature learning.
arXiv Detail & Related papers (2024-10-11T03:59:49Z) - Addressing Mistake Severity in Neural Networks with Semantic Knowledge [0.0]
Most robust training techniques aim to improve model accuracy on perturbed inputs.
As an alternate form of robustness, we aim to reduce the severity of mistakes made by neural networks in challenging conditions.
We leverage current adversarial training methods to generate targeted adversarial attacks during the training process.
Results demonstrate that our approach performs better with respect to mistake severity compared to standard and adversarially trained models.
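A minimal sketch of the targeted attack such training would plug in is below; how the targets are chosen (e.g., by semantic similarity) is the paper's contribution and is not reproduced here, and `target` simply holds the attacker-chosen labels.

```python
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=8/255, alpha=2/255, steps=10):
    """Targeted L-infinity PGD: minimize the loss toward a chosen target
    label, pushing inputs across the boundary into the target class."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() - alpha * grad.sign()  # descend toward target
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
    return x_adv
```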
arXiv Detail & Related papers (2022-11-21T22:01:36Z) - Improved and Interpretable Defense to Transferred Adversarial Examples
by Jacobian Norm with Selective Input Gradient Regularization [31.516568778193157]
Adversarial training (AT) is often adopted to improve the robustness of deep neural networks (DNNs)
In this work, we propose an approach based on Jacobian norm and Selective Input Gradient Regularization (J-SIGR)
Experiments demonstrate that the proposed J-SIGR confers improved robustness against transferred adversarial attacks, and we also show that the predictions from the neural network are easy to interpret.
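A hedged sketch of plain input gradient regularization is below; the selective masking that distinguishes J-SIGR is omitted for brevity, and `lam` is an illustrative hyperparameter.

```python
import torch
import torch.nn.functional as F

def input_grad_reg_loss(model, x, y, lam=0.1):
    """Cross-entropy plus a squared-norm penalty on the input gradient.
    Plain input gradient regularization; J-SIGR's selective masking of
    gradient components is omitted in this simplification."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    # create_graph=True so the penalty itself is differentiable.
    grad, = torch.autograd.grad(ce, x, create_graph=True)
    penalty = grad.flatten(1).pow(2).sum(dim=1).mean()
    return ce + lam * penalty
```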
arXiv Detail & Related papers (2022-07-09T01:06:41Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been widely adopted.
We propose a large-batch adversarial training framework implemented over multiple machines.
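A minimal sketch of the distributed setup, assuming PyTorch DDP and the usual environment-variable rendezvous; the paper's large-batch-specific techniques are not reproduced, and linear learning-rate scaling is shown only as a common heuristic.

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_distributed_at(model, rank, world_size, base_lr=0.1):
    """Wrap the model in DDP so per-GPU gradients computed on locally
    generated adversarial examples are all-reduced automatically."""
    # Assumes MASTER_ADDR/MASTER_PORT are set for the default env:// rendezvous.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    model = DDP(model.to(rank), device_ids=[rank])
    # Linear LR scaling with the global batch size is a common heuristic.
    optimizer = torch.optim.SGD(model.parameters(),
                                lr=base_lr * world_size, momentum=0.9)
    return model, optimizer
```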
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, has proven to be the most effective defense strategy.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce a new concept of the adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
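An illustrative separability term is sketched below: it pulls normalized features toward their class centroid and pushes centroids apart. This is a generic stand-in for the idea, not the exact ATG/ATFS construction.

```python
import torch
import torch.nn.functional as F

def separability_loss(feats, labels):
    """Pull normalized features toward their class mean (intra-class
    similarity) and push class means apart (inter-class variance).
    Assumes the batch contains at least two classes."""
    feats = F.normalize(feats, dim=1)
    classes = labels.unique()
    means = torch.stack([feats[labels == c].mean(0) for c in classes])
    intra = torch.stack([(feats[labels == c] - means[i]).pow(2).sum(1).mean()
                         for i, c in enumerate(classes)]).mean()
    inter = torch.pdist(means).mean()  # mean pairwise distance of centroids
    return intra - inter               # minimize intra, maximize inter
```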
arXiv Detail & Related papers (2022-05-02T04:04:23Z) - Defensive Tensorization [113.96183766922393]
We propose defensive tensorization, an adversarial defence technique that leverages a latent high-order factorization of the network.
We empirically demonstrate the effectiveness of our approach on standard image classification benchmarks.
We validate the versatility of our approach across domains and low-precision architectures by considering an audio task and binary networks.
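A loose sketch of the idea on a single layer, assuming a rank-r matrix factorization with dropout in the latent space; the paper uses higher-order tensor factorizations rather than this simplified form.

```python
import torch
import torch.nn as nn

class FactorizedLinear(nn.Module):
    """Linear layer with weight W ~ U @ V and dropout in the latent space.
    A simplified rank-r stand-in for the paper's tensor factorizations."""
    def __init__(self, d_in, d_out, rank=32, p_drop=0.2):
        super().__init__()
        self.U = nn.Parameter(torch.randn(d_out, rank) * 0.02)
        self.V = nn.Parameter(torch.randn(rank, d_in) * 0.02)
        self.drop = nn.Dropout(p_drop)

    def forward(self, x):
        h = self.drop(x @ self.V.t())  # project to latent factors, randomize
        return h @ self.U.t()
```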
arXiv Detail & Related papers (2021-10-26T17:00:16Z) - Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
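A minimal sketch of the smoothing mechanism for a discrete-action policy; deriving the actual reward certificate requires concentration arguments not shown here, and `policy` is assumed to return action logits.

```python
import torch

def smoothed_action(policy, obs, sigma=0.1, n_samples=64):
    """Monte-Carlo policy smoothing: act on Gaussian-perturbed copies of
    the observation and take the majority vote over discrete actions."""
    noisy = obs.unsqueeze(0) + sigma * torch.randn(n_samples, *obs.shape)
    with torch.no_grad():
        votes = policy(noisy).argmax(dim=-1)  # one action per noisy copy
    return votes.mode().values                # most frequent action
```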
arXiv Detail & Related papers (2021-06-21T21:42:08Z) - Improving adversarial robustness of deep neural networks by using
semantic information [17.887586209038968]
Adversarial training is the main method for improving adversarial robustness and the first line of defense against adversarial attacks.
This paper provides a new perspective on the issue of adversarial robustness, one that shifts the focus from the network as a whole to the critical region close to the decision boundary of a given class.
Experimental results on the MNIST and CIFAR-10 datasets show that this approach greatly improves adversarial robustness even using a very small dataset from the training data.
arXiv Detail & Related papers (2020-08-18T10:23:57Z) - Optimizing Information Loss Towards Robust Neural Networks [0.0]
Neural Networks (NNs) are vulnerable to adversarial examples.
We present a new training approach we call \textit{entropic retraining}.
Based on an information-theoretic-inspired analysis, entropic retraining mimics the effects of adversarial training without the need of the laborious generation of adversarial examples.
arXiv Detail & Related papers (2020-08-07T10:12:31Z) - HYDRA: Pruning Adversarially Robust Neural Networks [58.061681100058316]
Deep learning faces two key challenges: lack of robustness against adversarial attacks and large neural network size.
We propose to make pruning techniques aware of the robust training objective and let the training objective guide the search for which connections to prune.
We demonstrate that our approach, titled HYDRA, achieves compressed networks with state-of-the-art benign and robust accuracy, simultaneously.
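A simplified sketch of score-based pruning in the spirit of HYDRA: importance scores are learned under the (robust) training loss via a straight-through estimator, and only the top-k scored weights are kept.

```python
import torch
import torch.nn as nn

class ScoredLinear(nn.Module):
    """Linear layer pruned via learnable importance scores: the forward
    pass keeps only the top-k scored weights, and a straight-through
    estimator lets the robust-training loss update the scores."""
    def __init__(self, d_in, d_out, sparsity=0.9):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) * 0.02)
        self.scores = nn.Parameter(torch.rand(d_out, d_in))
        self.k = max(1, int((1 - sparsity) * d_in * d_out))

    def forward(self, x):
        threshold = self.scores.flatten().topk(self.k).values.min()
        mask = (self.scores >= threshold).float()
        # Value equals mask; gradients flow to `scores` (straight-through).
        ste_mask = mask + self.scores - self.scores.detach()
        return x @ (self.weight * ste_mask).t()
```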
arXiv Detail & Related papers (2020-02-24T19:54:53Z)