Push Stricter to Decide Better: A Class-Conditional Feature Adaptive
Framework for Improving Adversarial Robustness
- URL: http://arxiv.org/abs/2112.00323v1
- Date: Wed, 1 Dec 2021 07:37:56 GMT
- Title: Push Stricter to Decide Better: A Class-Conditional Feature Adaptive
Framework for Improving Adversarial Robustness
- Authors: Jia-Li Yin, Lehui Xie, Wanqing Zhu, Ximeng Liu, Bo-Hao Chen
- Abstract summary: We propose Feature Adaptive Adversarial Training (FAAT), which optimizes class-conditional feature adaptation across natural data and adversarial examples.
FAAT produces more discriminative features and performs favorably against state-of-the-art methods.
- Score: 18.98147977363969
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In response to the threat of adversarial examples, adversarial training
provides an attractive option for enhancing model robustness by training
models on online-augmented adversarial examples. However, most existing
adversarial training methods focus on improving robust accuracy by
strengthening the adversarial examples, while neglecting the increasing shift
between natural data and adversarial examples, which leads to a dramatic
decrease in natural accuracy. To maintain the trade-off between natural and
robust accuracy, we alleviate the shift from the perspective of feature
adaptation and propose Feature Adaptive Adversarial Training (FAAT), which
optimizes class-conditional feature adaptation across natural data and
adversarial examples. Specifically, we incorporate a class-conditional
discriminator that encourages the features to become (1) class-discriminative
and (2) invariant to adversarial attacks. The FAAT framework maintains the
trade-off between natural and robust accuracy by generating features with
similar distributions across natural and adversarial data, and achieves higher
overall robustness by benefiting from the class-discriminative feature
characteristics. Experiments on various datasets demonstrate that FAAT
produces more discriminative features and performs favorably against
state-of-the-art methods. Code is available at
https://github.com/VisionFlow/FAAT.
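To make the mechanism concrete, here is a minimal PyTorch sketch of one plausible reading of FAAT: PGD crafts the online adversarial examples, and a class-conditional discriminator with 2K outputs (one natural/adversarial pair per class) is trained against the feature extractor, so features of the two domains become indistinguishable within each class. All module definitions, the 2K-way label scheme, and the loss weight lam are illustrative assumptions rather than the authors' verified implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 10  # number of classes (e.g., CIFAR-10)

# Toy stand-ins for the real networks; these definitions are illustrative
# assumptions, not the paper's architecture.
feature_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU())
classifier = nn.Linear(256, K)
# Class-conditional discriminator: 2K logits, one (natural, adversarial)
# pair per class, so it must tell the domains apart *within* each class.
discriminator = nn.Linear(256, 2 * K)

def pgd_attack(x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Standard PGD -- the 'online-augmented adversarial examples' of AT."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(classifier(feature_net(x_adv)), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

opt_main = torch.optim.SGD(
    list(feature_net.parameters()) + list(classifier.parameters()), lr=0.1)
opt_disc = torch.optim.SGD(discriminator.parameters(), lr=0.1)

def faat_step(x_nat, y, lam=1.0):
    x_adv = pgd_attack(x_nat, y)
    feat_nat, feat_adv = feature_net(x_nat), feature_net(x_adv)
    dom_nat, dom_adv = y, y + K  # class-conditional domain labels

    # (1) Discriminator learns to separate natural vs. adversarial
    # features within each class.
    opt_disc.zero_grad()
    d_loss = (F.cross_entropy(discriminator(feat_nat.detach()), dom_nat)
              + F.cross_entropy(discriminator(feat_adv.detach()), dom_adv))
    d_loss.backward()
    opt_disc.step()

    # (2) Feature extractor + classifier: standard classification losses
    # plus a fooling term that pushes adversarial features toward the
    # natural side of their true class -- attack-invariant yet
    # class-discriminative features.
    opt_main.zero_grad()
    cls_loss = (F.cross_entropy(classifier(feat_nat), y)
                + F.cross_entropy(classifier(feat_adv), y))
    fool_loss = F.cross_entropy(discriminator(feat_adv), dom_nat)
    (cls_loss + lam * fool_loss).backward()
    opt_main.step()

# One toy step on random data:
x = torch.rand(8, 3, 32, 32)
y = torch.randint(0, K, (8,))
faat_step(x, y)
```

The design choice this sketch highlights is conditioning the domain discriminator on the class label: alignment is enforced per class, rather than collapsing all features into a single attack-invariant but class-agnostic distribution.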
Related papers
- Enhancing Robust Representation in Adversarial Training: Alignment and
Exclusion Criteria [61.048842737581865]
We show that Adversarial Training (AT) fails to learn robust features, resulting in poor adversarial robustness.
We propose a generic AT framework for gaining robust representations, based on asymmetric negative contrast and reverse attention.
Empirical evaluations on three benchmark datasets show our methods greatly advance the robustness of AT and achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-10-05T07:29:29Z) - Improving Adversarial Robustness to Sensitivity and Invariance Attacks
with Deep Metric Learning [80.21709045433096]
Standard adversarial robustness methods defend against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z) - Improving Robust Fairness via Balance Adversarial Training [51.67643171193376]
Adversarial training (AT) methods are effective against adversarial attacks, yet they introduce a severe disparity in accuracy and robustness between different classes.
We propose Balance Adversarial Training (BAT) to address this robust fairness problem.
arXiv Detail & Related papers (2022-09-15T14:44:48Z) - Vanilla Feature Distillation for Improving the Accuracy-Robustness
Trade-Off in Adversarial Training [37.5115141623558]
We propose Vanilla Feature Distillation Adversarial Training (VFD-Adv) to guide adversarial training toward higher accuracy.
A key advantage of our method is that it can be universally combined with existing works to boost them.
arXiv Detail & Related papers (2022-06-05T11:57:10Z) - Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce the concept of an adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z) - Robustness through Cognitive Dissociation Mitigation in Contrastive
Adversarial Training [2.538209532048867]
We introduce a novel neural network training framework that increases a model's robustness to adversarial attacks.
We propose to improve model robustness by learning feature representations that are consistent under both data augmentations and adversarial perturbations.
We validate our method on CIFAR-10, where it outperforms alternative supervised and self-supervised adversarial learning methods in both robust and clean accuracy.
arXiv Detail & Related papers (2022-03-16T21:41:27Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose adaptive feature alignment (AFA), which is trained to automatically align features across arbitrary attacking strengths.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z) - Class-Aware Domain Adaptation for Improving Adversarial Robustness [27.24720754239852]
Adversarial training has been proposed to defend networks by injecting adversarial examples into the training data.
We propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense without directly applying adversarial training.
arXiv Detail & Related papers (2020-05-10T03:45:19Z)
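Several of the entries above (e.g., CADA and AFA) share the underlying idea of aligning natural and adversarial feature distributions class by class rather than globally. The following is a generic, minimal sketch of such a class-wise alignment penalty; the function name and the mean-matching form are illustrative assumptions and do not reproduce any single paper's exact loss.

```python
import torch
import torch.nn.functional as F

def classwise_alignment_loss(feat_nat, feat_adv, y, num_classes):
    """Pull each class's adversarial feature mean toward its natural mean."""
    loss = feat_nat.new_zeros(())
    for k in range(num_classes):
        mask = y == k
        if not mask.any():
            continue  # class absent from this batch
        loss = loss + F.mse_loss(feat_adv[mask].mean(0), feat_nat[mask].mean(0))
    return loss / num_classes

# Toy usage: in practice this term would be added, with some weight, to the
# standard adversarial training objective.
feat_nat = torch.randn(64, 256)                    # natural-image features
feat_adv = feat_nat + 0.1 * torch.randn(64, 256)   # perturbed features
y = torch.randint(0, 10, (64,))
print(classwise_alignment_loss(feat_nat, feat_adv, y, num_classes=10))
```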