Understanding Robust Overfitting of Adversarial Training and Beyond
- URL: http://arxiv.org/abs/2206.08675v1
- Date: Fri, 17 Jun 2022 10:25:17 GMT
- Title: Understanding Robust Overfitting of Adversarial Training and Beyond
- Authors: Chaojian Yu, Bo Han, Li Shen, Jun Yu, Chen Gong, Mingming Gong,
Tongliang Liu
- Abstract summary: We show that robust overfitting widely exists in adversarial training of deep networks.
We propose minimum loss constrained adversarial training (MLCAT).
In a minibatch, we learn large-loss data as usual, and adopt additional measures to increase the loss of the small-loss data.
- Score: 103.37117541210348
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust overfitting widely exists in adversarial training of deep networks.
The exact underlying reasons for this are still not completely understood.
Here, we explore the causes of robust overfitting by comparing the data
distribution of \emph{non-overfit} (weak adversary) and \emph{overfitted}
(strong adversary) adversarial training, and observe that the adversarial data
generated by a weak adversary mainly consist of small-loss data, whereas the
adversarial data generated by a strong adversary are distributed more diversely
across both large-loss and small-loss data. Given these observations, we
further design data ablation adversarial training and identify that some
small-loss data, which do not warrant the strength of the adversary, cause
robust overfitting in the strong-adversary mode. To relieve this issue, we
propose \emph{minimum loss constrained adversarial training} (MLCAT): in a
minibatch, we learn large-loss data as usual, and adopt additional measures to
increase the loss of the small-loss data. Technically, MLCAT hinders the
fitting of data once they become easy to learn, so as to prevent robust
overfitting; philosophically, MLCAT reflects the spirit of turning waste into
treasure and making the best use of each adversarial example; algorithmically,
we design two realizations of MLCAT, and extensive experiments demonstrate that
MLCAT can eliminate robust overfitting and further boost adversarial
robustness.
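As a reading aid for the abstract above, the following is a minimal sketch of the minimum-loss constraint inside one training step, written in PyTorch-style Python. The threshold `tau`, the `pgd_attack` helper, and the choice of extra attack steps as the "additional measure" for small-loss data are illustrative assumptions; they are not the paper's two actual realizations of MLCAT.

```python
# Illustrative sketch of minimum loss constrained adversarial training (MLCAT).
# Assumptions (not from the paper): `pgd_attack` crafts adversarial examples,
# `tau` is a minimum-loss threshold, and the "additional measure" for
# small-loss data is simply a few extra attack steps on those examples.
import torch
import torch.nn.functional as F

def mlcat_step(model, optimizer, x, y, pgd_attack, tau=0.5, extra_steps=5):
    # Generate adversarial examples for the whole minibatch as usual.
    x_adv = pgd_attack(model, x, y)

    with torch.no_grad():
        per_example_loss = F.cross_entropy(model(x_adv), y, reduction="none")

    # Identify small-loss data whose adversarial loss fell below the threshold.
    small = per_example_loss < tau
    if small.any():
        # Additional measure: push the loss of the small-loss data back up,
        # here via extra adversarial steps (one possible choice, assumed).
        x_adv[small] = pgd_attack(model, x_adv[small], y[small],
                                  steps=extra_steps)

    # Learn large-loss data as usual; the small-loss data now carry
    # increased loss, so they keep contributing to robust learning.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

The essential behaviour is only that data whose adversarial loss drops below a minimum value receive an extra treatment that raises their loss, so every adversarial example keeps contributing to robust learning.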
Related papers
- Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation [1.3124513975412255]
We propose a new data pruning strategy based on extrapolating data importance scores from a small set of data to a larger set.
In an empirical evaluation, we demonstrate that extrapolation-based pruning can efficiently reduce dataset size while maintaining robustness.
arXiv Detail & Related papers (2024-06-19T07:23:51Z) - Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing a KL-divergence regularized loss function; a minimal sketch of this style of loss-based reweighting appears after the related-papers list below.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
arXiv Detail & Related papers (2023-08-01T06:16:18Z) - Enhancing Multiple Reliability Measures via Nuisance-extended
Information Bottleneck [77.37409441129995]
In practical scenarios where training data is limited, many predictive signals in the data can instead come from biases in data acquisition.
We consider an adversarial threat model under a mutual information constraint to cover a wider class of perturbations in training.
We propose an autoencoder-based training to implement the objective, as well as practical encoder designs to facilitate the proposed hybrid discriminative-generative training.
arXiv Detail & Related papers (2023-03-24T16:03:21Z) - Boundary Adversarial Examples Against Adversarial Overfitting [4.391102490444538]
Adversarial training approaches suffer from robust overfitting, where robust accuracy decreases when models are adversarially trained for too long.
Several mitigation approaches, including early stopping, temporal ensembling and weight memorizations, have been proposed to reduce the effect of robust overfitting.
In this paper, we investigate if these mitigation approaches are complementary to each other in improving adversarial training performance.
arXiv Detail & Related papers (2022-11-25T13:16:53Z) - Data Profiling for Adversarial Training: On the Ruin of Problematic Data [27.11328449349065]
Problems in adversarial training include robustness-accuracy trade-off, robust overfitting, and gradient masking.
We show that these problems share one common cause -- low quality samples in the dataset.
We find that when problematic data is removed, robust overfitting and gradient masking can be largely alleviated.
arXiv Detail & Related papers (2021-02-15T10:17:24Z) - Curse or Redemption? How Data Heterogeneity Affects the Robustness of
Federated Learning [51.15273664903583]
Data heterogeneity has been identified as one of the key features of federated learning, but it is often overlooked through the lens of robustness to adversarial attacks.
This paper focuses on characterizing and understanding its impact on backdooring attacks in federated learning through comprehensive experiments using synthetic datasets and the LEAF benchmarks.
arXiv Detail & Related papers (2021-02-01T06:06:21Z) - Auto-weighted Robust Federated Learning with Corrupted Data Sources [7.475348174281237]
Federated learning provides a communication-efficient and privacy-preserving training process.
Standard federated learning techniques that naively minimize an average loss function are vulnerable to data corruptions.
We propose Auto-weighted Robust Federated Learning (arfl) to provide robustness against corrupted data sources.
arXiv Detail & Related papers (2021-01-14T21:54:55Z) - Stylized Adversarial Defense [105.88250594033053]
Adversarial training creates perturbation patterns and includes them in the training set to robustify the model.
We propose to exploit additional information from the feature space to craft stronger adversaries.
Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
arXiv Detail & Related papers (2020-07-29T08:38:10Z) - Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z)
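To make the KL-divergence regularized reweighting mentioned in the Doubly Robust Instance-Reweighted entry above more concrete, here is a brief, hedged sketch: when per-example weights are chosen to maximize the weighted loss minus a KL penalty to the uniform distribution over the minibatch, the optimal weights are a softmax of the per-example losses scaled by a temperature. This is a standard form of KL-regularized reweighting, assumed here for illustration; it is not claimed to be the exact objective or algorithm of that paper.

```python
# Generic sketch (not that paper's exact method): KL-regularized instance
# reweighting. Maximizing <w, loss> - tau * KL(w || uniform) over the
# probability simplex has the closed-form solution w_i ∝ exp(loss_i / tau),
# i.e. a softmax over per-example losses.
import torch
import torch.nn.functional as F

def kl_regularized_weights(per_example_loss: torch.Tensor, tau: float = 1.0):
    """Return normalized instance weights that upweight high-loss examples."""
    return torch.softmax(per_example_loss / tau, dim=0)

def reweighted_loss(model, x_adv, y, tau: float = 1.0):
    # Per-example adversarial losses for the current minibatch.
    losses = F.cross_entropy(model(x_adv), y, reduction="none")
    # Weights are computed from detached losses so the weighting itself
    # is not differentiated through.
    weights = kl_regularized_weights(losses.detach(), tau)
    return (weights * losses).sum()
```

Higher-loss (more vulnerable) examples receive larger weights; the temperature `tau` interpolates between uniform weighting (large `tau`) and concentrating on the hardest example (small `tau`).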