Interpolated Joint Space Adversarial Training for Robust and
Generalizable Defenses
- URL: http://arxiv.org/abs/2112.06323v1
- Date: Sun, 12 Dec 2021 21:08:14 GMT
- Title: Interpolated Joint Space Adversarial Training for Robust and
Generalizable Defenses
- Authors: Chun Pong Lau, Jiang Liu, Hossein Souri, Wei-An Lin, Soheil Feizi,
Rama Chellappa
- Abstract summary: Adversarial training (AT) is considered to be one of the most reliable defenses against adversarial attacks.
Recent works show generalization improvement with adversarial samples under novel threat models.
We propose a novel threat model called Joint Space Threat Model (JSTM)
Under JSTM, we develop novel adversarial attacks and defenses.
- Score: 82.3052187788609
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial training (AT) is considered to be one of the most reliable
defenses against adversarial attacks. However, models trained with AT sacrifice
standard accuracy and do not generalize well to novel attacks. Recent works
show generalization improvement with adversarial samples under novel threat
models such as on-manifold threat model or neural perceptual threat model.
However, the former requires exact manifold information while the latter
requires algorithm relaxation. Motivated by these considerations, we exploit
the underlying manifold information with Normalizing Flow, ensuring that exact
manifold assumption holds. Moreover, we propose a novel threat model called
Joint Space Threat Model (JSTM), which can serve as a special case of the
neural perceptual threat model that does not require additional relaxation to
craft the corresponding adversarial attacks. Under JSTM, we develop novel
adversarial attacks and defenses. The mixup strategy improves the standard
accuracy of neural networks but sacrifices robustness when combined with AT. To
tackle this issue, we propose the Robust Mixup strategy in which we maximize
the adversity of the interpolated images and gain robustness and prevent
overfitting. Our experiments show that Interpolated Joint Space Adversarial
Training (IJSAT) achieves good performance in standard accuracy, robustness,
and generalization in CIFAR-10/100, OM-ImageNet, and CIFAR-10-C datasets. IJSAT
is also flexible and can be used as a data augmentation method to improve
standard accuracy and combine with many existing AT approaches to improve
robustness.
Related papers
- A Hybrid Defense Strategy for Boosting Adversarial Robustness in Vision-Language Models [9.304845676825584]
We propose a novel adversarial training framework that integrates multiple attack strategies and advanced machine learning techniques.
Experiments conducted on real-world datasets, including CIFAR-10 and CIFAR-100, demonstrate that the proposed method significantly enhances model robustness.
arXiv Detail & Related papers (2024-10-18T23:47:46Z) - Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adrial robustness has been conventionally believed as a challenging property to encode for neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z) - The Effectiveness of Random Forgetting for Robust Generalization [21.163070161951868]
We introduce a novel learning paradigm called "Forget to Mitigate Overfitting" (FOMO)
FOMO alternates between the forgetting phase, which randomly forgets a subset of weights, and the relearning phase, which emphasizes learning generalizable features.
Our experiments show that FOMO alleviates robust overfitting by significantly reducing the gap between the best and last robust test accuracy.
arXiv Detail & Related papers (2024-02-18T23:14:40Z) - Distributed Adversarial Training to Robustify Deep Neural Networks at
Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach, known as adversarial training (AT), has been shown to mitigate robust training.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z) - Alleviating Robust Overfitting of Adversarial Training With Consistency
Regularization [9.686724616328874]
Adversarial training (AT) has proven to be one of the most effective ways to defend Deep Neural Networks (DNNs) against adversarial attacks.
robustness will drop sharply at a certain stage, always exists during AT.
consistency regularization, a popular technique in semi-supervised learning, has a similar goal as AT and can be used to alleviate robust overfitting.
arXiv Detail & Related papers (2022-05-24T03:18:43Z) - Self-Ensemble Adversarial Training for Improved Robustness [14.244311026737666]
Adversarial training is the strongest strategy against various adversarial attacks among all sorts of defense methods.
Recent works mainly focus on developing new loss functions or regularizers, attempting to find the unique optimal point in the weight space.
We devise a simple but powerful emphSelf-Ensemble Adversarial Training (SEAT) method for yielding a robust classifier by averaging weights of history models.
arXiv Detail & Related papers (2022-03-18T01:12:18Z) - Towards Compositional Adversarial Robustness: Generalizing Adversarial
Training to Composite Semantic Perturbations [70.05004034081377]
We first propose a novel method for generating composite adversarial examples.
Our method can find the optimal attack composition by utilizing component-wise projected gradient descent.
We then propose generalized adversarial training (GAT) to extend model robustness from $ell_p$-ball to composite semantic perturbations.
arXiv Detail & Related papers (2022-02-09T02:41:56Z) - Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z) - Boosting Adversarial Training with Hypersphere Embedding [53.75693100495097]
Adversarial training is one of the most effective defenses against adversarial attacks for deep learning models.
In this work, we advocate incorporating the hypersphere embedding mechanism into the AT procedure.
We validate our methods under a wide range of adversarial attacks on the CIFAR-10 and ImageNet datasets.
arXiv Detail & Related papers (2020-02-20T08:42:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.