Latent Boundary-guided Adversarial Training
- URL: http://arxiv.org/abs/2206.03717v1
- Date: Wed, 8 Jun 2022 07:40:55 GMT
- Title: Latent Boundary-guided Adversarial Training
- Authors: Xiaowei Zhou and Ivor W. Tsang and Jie Yin
- Abstract summary: Adversarial training has proven to be the most effective strategy: it injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining (LADDER).
- Score: 61.43040235982727
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Neural Networks (DNNs) have recently achieved great success in many
classification tasks. Unfortunately, they are vulnerable to adversarial attacks
that generate adversarial examples with a small perturbation to fool DNN
models, especially in model-sharing scenarios. Adversarial training has proven
to be the most effective strategy: it injects adversarial examples into model
training to improve the robustness of DNN models to adversarial attacks.
However, adversarial training based on the existing adversarial examples fails
to generalize well to standard, unperturbed test data. To achieve a better
trade-off between standard accuracy and adversarial robustness, we propose a
novel adversarial training framework called LAtent bounDary-guided aDvErsarial
tRaining (LADDER) that adversarially trains DNN models on latent
boundary-guided adversarial examples. As opposed to most of the existing
methods that generate adversarial examples in the input space, LADDER generates
a myriad of high-quality adversarial examples through adding perturbations to
latent features. The perturbations are made along the normal of the decision
boundary constructed by an SVM with an attention mechanism. We analyze the
merits of our generated boundary-guided adversarial examples from a boundary
field perspective and visualization view. Extensive experiments and detailed
analysis on MNIST, SVHN, CelebA, and CIFAR-10 validate the effectiveness of
LADDER in achieving a better trade-off between standard accuracy and
adversarial robustness as compared with vanilla DNNs and competitive baselines.
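As a rough illustration of the boundary-guided perturbation described above, the sketch below fits a plain linear SVM on latent features and shifts them along the normal of its decision boundary. This is a minimal sketch under assumed names (`encoder`, `epsilon`), restricted to the binary case; it omits the paper's attention mechanism and the step that turns perturbed latent features into new training examples, and it is not the authors' implementation.

```python
# Illustrative sketch of boundary-guided latent perturbations.
# Assumptions: a pretrained encoder producing latent features and a linear
# SVM as the latent decision boundary; this is not LADDER's reference code.
import numpy as np
import torch
from sklearn.svm import LinearSVC

def boundary_guided_latent_perturbations(encoder, x, y, epsilon=0.5):
    """Perturb latent features along the normal of an SVM decision boundary.

    encoder : torch.nn.Module mapping inputs to latent features
    x, y    : a batch of inputs and binary labels (0/1)
    epsilon : step size of the perturbation along the boundary normal
    """
    encoder.eval()
    with torch.no_grad():
        z = encoder(x).flatten(1).cpu().numpy()   # latent features

    # Fit a linear SVM in latent space; its weight vector is the normal of
    # the decision boundary (the paper's attention over features is omitted).
    svm = LinearSVC(C=1.0, max_iter=10000).fit(z, y.cpu().numpy())
    normal = svm.coef_[0] / (np.linalg.norm(svm.coef_[0]) + 1e-12)

    # Move each latent point against its own side of the boundary, yielding
    # boundary-guided adversarial latent features.
    signed_side = np.sign(svm.decision_function(z))[:, None]
    z_adv = z - epsilon * signed_side * normal[None, :]
    return torch.from_numpy(z_adv).float()
```

In LADDER itself, such boundary-normal perturbations are used to generate a large pool of adversarial examples that are then mixed into model training; for multi-class datasets a per-class (one-vs-rest) boundary would be needed, which the sketch above does not cover.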
Related papers
- Enhancing Adversarial Robustness via Uncertainty-Aware Distributional Adversarial Training [43.766504246864045]
We propose a novel uncertainty-aware distributional adversarial training method.
Our approach achieves state-of-the-art adversarial robustness and maintains natural performance.
arXiv Detail & Related papers (2024-11-05T07:26:24Z) - Perturbation-Invariant Adversarial Training for Neural Ranking Models:
Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
Adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z) - Effective Targeted Attacks for Adversarial Self-Supervised Learning [58.14233572578723]
Unsupervised adversarial training (AT) has been highlighted as a means of achieving robustness in models without any label information.
We propose a novel positive-mining technique for targeted adversarial attacks to generate effective adversaries for adversarial SSL frameworks.
Our method demonstrates significant robustness gains when applied to non-contrastive SSL frameworks, and smaller but consistent improvements with contrastive SSL frameworks.
arXiv Detail & Related papers (2022-10-19T11:43:39Z) - Improved and Interpretable Defense to Transferred Adversarial Examples
by Jacobian Norm with Selective Input Gradient Regularization [31.516568778193157]
Adversarial training (AT) is often adopted to improve the robustness of deep neural networks (DNNs).
In this work, we propose an approach based on the Jacobian norm and Selective Input Gradient Regularization (J-SIGR).
Experiments demonstrate that the proposed J-SIGR confers improved robustness against transferred adversarial attacks, and we also show that the predictions from the neural network are easy to interpret.
arXiv Detail & Related papers (2022-07-09T01:06:41Z) - Self-Ensemble Adversarial Training for Improved Robustness [14.244311026737666]
Among all defense methods, adversarial training is the strongest strategy against various adversarial attacks.
Recent works mainly focus on developing new loss functions or regularizers, attempting to find the unique optimal point in the weight space.
We devise a simple but powerful Self-Ensemble Adversarial Training (SEAT) method that yields a robust classifier by averaging the weights of historical models.
arXiv Detail & Related papers (2022-03-18T01:12:18Z) - On the Convergence and Robustness of Adversarial Training [134.25999006326916]
Adversarial training with Projected Gradient Descent (PGD) is among the most effective (a minimal PGD sketch appears after this list).
We propose a dynamic training strategy to increase the convergence quality of the generated adversarial examples.
Our theoretical and empirical results show the effectiveness of the proposed method.
arXiv Detail & Related papers (2021-12-15T17:54:08Z) - Knowledge Enhanced Machine Learning Pipeline against Diverse Adversarial
Attacks [10.913817907524454]
We propose a Knowledge Enhanced Machine Learning Pipeline (KEMLP) to integrate domain knowledge into a graphical model.
In particular, we develop KEMLP by integrating a diverse set of weak auxiliary models based on their logical relationships to the main DNN model.
We show that compared with adversarial training and other baselines, KEMLP achieves higher robustness against physical attacks, $\mathcal{L}_p$-bounded attacks, unforeseen attacks, and natural corruptions.
arXiv Detail & Related papers (2021-06-11T08:37:53Z) - A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack
and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples.
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analysis on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z) - Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other, unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models.
arXiv Detail & Related papers (2020-02-14T12:36:59Z)