Semantics-Preserving Adversarial Training
- URL: http://arxiv.org/abs/2009.10978v1
- Date: Wed, 23 Sep 2020 07:42:14 GMT
- Title: Semantics-Preserving Adversarial Training
- Authors: Wonseok Lee, Hanbit Lee, Sang-goo Lee
- Abstract summary: Adversarial training is a technique that improves adversarial robustness of a deep neural network (DNN) by including adversarial examples in the training data.
We propose semantics-preserving adversarial training (SPAT) which encourages perturbation on the pixels that are shared among all classes.
Experiment results show that SPAT improves adversarial robustness and achieves state-of-the-art results in CIFAR-10 and CIFAR-100.
- Score: 12.242659601882147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training is a defense technique that improves adversarial
robustness of a deep neural network (DNN) by including adversarial examples in
the training data. In this paper, we identify an overlooked problem of
adversarial training in that these adversarial examples often have different
semantics than the original data, introducing unintended biases into the model.
We hypothesize that such non-semantics-preserving (and resultingly ambiguous)
adversarial data harm the robustness of the target models. To mitigate such
unintended semantic changes of adversarial examples, we propose
semantics-preserving adversarial training (SPAT) which encourages perturbation
on the pixels that are shared among all classes when generating adversarial
examples in the training stage. Experiment results show that SPAT improves
adversarial robustness and achieves state-of-the-art results in CIFAR-10 and
CIFAR-100.
Related papers
- Protecting Feed-Forward Networks from Adversarial Attacks Using Predictive Coding [0.20718016474717196]
An adversarial example is a modified input image designed to cause a Machine Learning (ML) model to make a mistake.
This study presents a practical and effective solution -- using predictive coding networks (PCnets) as an auxiliary step for adversarial defence.
arXiv Detail & Related papers (2024-10-31T21:38:05Z) - Improved Adversarial Training Through Adaptive Instance-wise Loss
Smoothing [5.1024659285813785]
Adversarial training has been the most successful defense against such adversarial attacks.
We propose a new adversarial training method: Instance-adaptive Smoothness Enhanced Adversarial Training.
Our method achieves state-of-the-art robustness against $ell_infty$-norm constrained attacks.
arXiv Detail & Related papers (2023-03-24T15:41:40Z) - On the Effect of Adversarial Training Against Invariance-based
Adversarial Examples [0.23624125155742057]
This work addresses the impact of adversarial training with invariance-based adversarial examples on a convolutional neural network (CNN)
We show that when adversarial training with invariance-based and perturbation-based adversarial examples is applied, it should be conducted simultaneously and not consecutively.
arXiv Detail & Related papers (2023-02-16T12:35:37Z) - The Enemy of My Enemy is My Friend: Exploring Inverse Adversaries for
Improving Adversarial Training [72.39526433794707]
Adversarial training and its variants have been shown to be the most effective approaches to defend against adversarial examples.
We propose a novel adversarial training scheme that encourages the model to produce similar outputs for an adversarial example and its inverse adversarial'' counterpart.
Our training method achieves state-of-the-art robustness as well as natural accuracy.
arXiv Detail & Related papers (2022-11-01T15:24:26Z) - Adversarial Pretraining of Self-Supervised Deep Networks: Past, Present
and Future [132.34745793391303]
We review adversarial pretraining of self-supervised deep networks including both convolutional neural networks and vision transformers.
To incorporate adversaries into pretraining models on either input or feature level, we find that existing approaches are largely categorized into two groups.
arXiv Detail & Related papers (2022-10-23T13:14:06Z) - Latent Boundary-guided Adversarial Training [61.43040235982727]
Adrial training is proved to be the most effective strategy that injects adversarial examples into model training.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z) - Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce a new concept of adversarial training graph (ATG) with which the proposed adversarial training with feature separability (ATFS) enables to boost the intra-class feature similarity and increase inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z) - Robustness through Cognitive Dissociation Mitigation in Contrastive
Adversarial Training [2.538209532048867]
We introduce a novel neural network training framework that increases model's adversarial robustness to adversarial attacks.
We propose to improve model robustness to adversarial attacks by learning feature representations consistent under both data augmentations and adversarial perturbations.
We validate our method on the CIFAR-10 dataset on which it outperforms both robust accuracy and clean accuracy over alternative supervised and self-supervised adversarial learning methods.
arXiv Detail & Related papers (2022-03-16T21:41:27Z) - Improving adversarial robustness of deep neural networks by using
semantic information [17.887586209038968]
Adrial training is the main method for improving adversarial robustness and the first line of defense against adversarial attacks.
This paper provides a new perspective on the issue of adversarial robustness, one that shifts the focus from the network as a whole to the critical part of the region close to the decision boundary corresponding to a given class.
Experimental results on the MNIST and CIFAR-10 datasets show that this approach greatly improves adversarial robustness even using a very small dataset from the training data.
arXiv Detail & Related papers (2020-08-18T10:23:57Z) - Stylized Adversarial Defense [105.88250594033053]
adversarial training creates perturbation patterns and includes them in the training set to robustify the model.
We propose to exploit additional information from the feature space to craft stronger adversaries.
Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
arXiv Detail & Related papers (2020-07-29T08:38:10Z) - Dynamic Divide-and-Conquer Adversarial Training for Robust Semantic
Segmentation [79.42338812621874]
Adversarial training is promising for improving robustness of deep neural networks towards adversarial perturbations.
We formulate a general adversarial training procedure that can perform decently on both adversarial and clean samples.
We propose a dynamic divide-and-conquer adversarial training (DDC-AT) strategy to enhance the defense effect.
arXiv Detail & Related papers (2020-03-14T05:06:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.