Investigating Catastrophic Overfitting in Fast Adversarial Training: A
Self-fitting Perspective
- URL: http://arxiv.org/abs/2302.11963v2
- Date: Fri, 24 Mar 2023 13:40:27 GMT
- Title: Investigating Catastrophic Overfitting in Fast Adversarial Training: A
Self-fitting Perspective
- Authors: Zhengbao He, Tao Li, Sizhe Chen and Xiaolin Huang
- Abstract summary: We decouple single-step adversarial examples into data-information and self-information, which reveals an interesting phenomenon called "self-fitting".
When self-fitting occurs, the network exhibits an obvious "channel differentiation" phenomenon: some convolution channels responsible for recognizing self-information become dominant, while those for data-information are suppressed.
Our findings reveal a self-learning mechanism in adversarial training and open up new perspectives for suppressing different kinds of information to mitigate CO.
- Score: 17.59014650714359
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although fast adversarial training provides an efficient approach for
building robust networks, it may suffer from a serious problem known as
catastrophic overfitting (CO), where multi-step robust accuracy suddenly
collapses to zero. In this paper, we decouple, for the first time, single-step
adversarial examples into data-information and self-information, which reveals
an interesting phenomenon called "self-fitting". Self-fitting, in which the
network learns the self-information embedded in single-step perturbations,
naturally leads to the occurrence of CO. When self-fitting occurs, the network
exhibits an obvious "channel differentiation" phenomenon: some convolution
channels responsible for recognizing self-information become dominant, while
those for data-information are suppressed. In this way, the
network can only recognize images with sufficient self-information and loses
generalization ability to other types of data. Based on self-fitting, we
provide new insights into the existing methods to mitigate CO and extend CO to
multi-step adversarial training. Our findings reveal a self-learning mechanism
in adversarial training and open up new perspectives for suppressing different
kinds of information to mitigate CO.
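To make the setting concrete, below is a minimal PyTorch sketch (not the authors' code) of fast adversarial training with single-step FGSM perturbations, together with the multi-step (PGD) robust-accuracy evaluation whose sudden collapse to zero is the symptom of CO described above. `model`, `loader`, and the epsilon/step values are illustrative placeholders, and images are assumed to lie in [0, 1].

```python
# Hedged sketch of fast (single-step) adversarial training and a multi-step
# robustness check; placeholder model/loader, images assumed in [0, 1].
import torch
import torch.nn.functional as F

def fgsm_example(model, x, y, eps):
    """Single-step L-infinity adversarial example used in fast AT."""
    delta = torch.zeros_like(x, requires_grad=True)
    F.cross_entropy(model(x + delta), y).backward()
    return (x + eps * delta.grad.sign()).clamp(0, 1).detach()

def pgd_example(model, x, y, eps, alpha, steps):
    """Multi-step attack, used here only to evaluate robustness."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1).detach()
    return x_adv

def fast_at_epoch(model, loader, optimizer, eps=8 / 255, device="cuda"):
    """One epoch of single-step (FGSM) adversarial training."""
    model.train()
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = fgsm_example(model, x, y, eps)
        optimizer.zero_grad()  # discard gradients left over from example crafting
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()

def pgd_robust_accuracy(model, loader, eps=8 / 255, alpha=2 / 255, steps=10, device="cuda"):
    """Multi-step robust accuracy; a sudden drop toward zero signals CO."""
    model.eval()
    correct = total = 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_example(model, x, y, eps, alpha, steps)
        with torch.no_grad():
            correct += (model(x_adv).argmax(dim=1) == y).sum().item()
        total += y.numel()
    return correct / total
```

Monitoring `pgd_robust_accuracy` after each call to `fast_at_epoch` is enough to observe CO: the single-step training accuracy stays high while the multi-step robust accuracy drops abruptly.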
Related papers
- Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data [38.44734564565478]
We provide a theoretical understanding of adversarial examples and adversarial training algorithms from the perspective of feature learning theory.
We show that the adversarial training method can provably strengthen the robust feature learning and suppress the non-robust feature learning.
arXiv Detail & Related papers (2024-10-11T03:59:49Z) - Enhancing Multiple Reliability Measures via Nuisance-extended
Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data may stem from biases in data acquisition rather than the underlying task.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z) - Critical Learning Periods for Multisensory Integration in Deep Networks [112.40005682521638]
We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.
We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations.
arXiv Detail & Related papers (2022-10-06T23:50:38Z) - RelaxLoss: Defending Membership Inference Attacks without Losing Utility [68.48117818874155]
We propose a novel training framework based on a relaxed loss with a more achievable learning target.
RelaxLoss is applicable to any classification model with added benefits of easy implementation and negligible overhead.
Our approach consistently outperforms state-of-the-art defense mechanisms in terms of resilience against MIAs.
arXiv Detail & Related papers (2022-07-12T19:34:47Z) - Exploring Adversarial Examples and Adversarial Robustness of
Convolutional Neural Networks by Mutual Information [44.841339443764696]
This work investigates similarities and differences between two types of convolutional neural networks (CNNs) in information extraction.
The reason why adversarial examples mislead CNNs may be that they contain more texture-based information about other categories.
Normally trained CNNs tend to extract texture-based information from the inputs, while adversarially trained models prefer shape-based information.
arXiv Detail & Related papers (2022-07-12T13:25:42Z) - Monitoring Shortcut Learning using Mutual Information [16.17600110257266]
Shortcut learning is evaluated on real-world data that does not contain spurious correlations.
Experiments demonstrate that mutual information (MI) can be used as a metric for monitoring shortcut learning in a network.
arXiv Detail & Related papers (2022-06-27T03:55:23Z) - Catastrophic overfitting can be induced with discriminative non-robust
features [95.07189577345059]
We study the onset of CO in single-step AT methods through controlled modifications of typical datasets of natural images.
We show that CO can be induced at much smaller $\epsilon$ values than previously observed, simply by injecting images with seemingly innocuous features.
arXiv Detail & Related papers (2022-06-16T15:22:39Z) - Certified Robustness in Federated Learning [54.03574895808258]
We study the interplay between federated training, personalization, and certified robustness.
We find that the simple federated averaging technique is effective in building not only more accurate, but also more certifiably-robust models.
arXiv Detail & Related papers (2022-06-06T12:10:53Z) - Reducing Catastrophic Forgetting in Self Organizing Maps with
Internally-Induced Generative Replay [67.50637511633212]
A lifelong learning agent is able to continually learn from potentially infinite streams of pattern sensory data.
One major historic difficulty in building agents that adapt is that neural systems struggle to retain previously-acquired knowledge when learning from new samples.
This problem is known as catastrophic forgetting (interference) and remains an unsolved problem in the domain of machine learning to this day.
arXiv Detail & Related papers (2021-12-09T07:11:14Z) - Self-Adaptive Training: Bridging the Supervised and Self-Supervised
Learning [16.765461276790944]
Self-adaptive training is a unified training algorithm that dynamically calibrates and enhances the training process using model predictions, without incurring extra computational cost (a minimal sketch of this idea follows this entry).
We analyze the training dynamics of deep networks on training data corrupted by, e.g., random noise and adversarial examples.
Our analysis shows that model predictions are able to magnify useful underlying information in data, and this phenomenon occurs broadly even in the absence of any label information.
arXiv Detail & Related papers (2021-01-21T17:17:30Z)
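As a rough illustration of "calibrating training by model predictions", the sketch below maintains per-sample soft targets as an exponential moving average (EMA) of the model's softmax outputs, starting from the one-hot labels. This is one common way to realize the idea; it is a hedged sketch with placeholder names (`EMATargets`, `calibrated_loss`), not necessarily the paper's exact formulation.

```python
# Hypothetical sketch: soft targets blended from labels and model predictions.
import torch
import torch.nn.functional as F

class EMATargets:
    """Per-sample soft targets updated from model predictions."""

    def __init__(self, labels: torch.Tensor, num_classes: int, momentum: float = 0.9):
        self.targets = F.one_hot(labels, num_classes).float()  # start from labels
        self.momentum = momentum

    def update(self, indices: torch.Tensor, probs: torch.Tensor) -> torch.Tensor:
        # Blend the stored target with the current prediction for these samples.
        self.targets[indices] = (
            self.momentum * self.targets[indices] + (1 - self.momentum) * probs.detach()
        )
        return self.targets[indices]

def calibrated_loss(logits: torch.Tensor, soft_targets: torch.Tensor) -> torch.Tensor:
    """Cross-entropy against the calibrated (soft) targets."""
    return -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

In a training loop one would compute `probs = logits.softmax(dim=1)`, call `update` with the batch indices, and minimize `calibrated_loss(logits, targets)`.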
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.