Towards the Memorization Effect of Neural Networks in Adversarial
Training
- URL: http://arxiv.org/abs/2106.04794v1
- Date: Wed, 9 Jun 2021 03:47:32 GMT
- Title: Towards the Memorization Effect of Neural Networks in Adversarial
Training
- Authors: Han Xu, Xiaorui Liu, Wentao Wang, Wenbiao Ding, Zhongqin Wu, Zitao
Liu, Anil Jain, Jiliang Tang
- Abstract summary: We propose Benign Adversarial Training (BAT), which facilitates adversarial training to avoid fitting ``harmful'' atypical samples.
BAT achieves a better clean accuracy vs. robustness trade-off than baseline methods on benchmark datasets such as CIFAR100 and Tiny ImageNet.
- Score: 36.35802928128805
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies suggest that ``memorization'' is one important factor for
overparameterized deep neural networks (DNNs) to achieve optimal performance.
Specifically, perfectly fitted DNNs can memorize the labels of many atypical
samples, generalize this memorization to correctly classify atypical test
samples, and enjoy better test performance. DNNs optimized via adversarial
training algorithms can also achieve perfect training performance by memorizing
the labels of atypical samples, as well as the adversarially perturbed atypical
samples. However, adversarially trained models always suffer from poor
generalization, with both relatively low clean accuracy and low robustness on
the test set. In this work, we study the effect of memorization in
adversarially trained DNNs and disclose two important findings: (a) memorizing
atypical samples only improves the DNN's accuracy on clean atypical samples,
but hardly improves its adversarial robustness, and (b) memorizing certain
atypical samples even hurts the DNN's performance on typical samples. Based on
these two findings, we propose Benign Adversarial Training (BAT), which
facilitates adversarial training to avoid fitting ``harmful'' atypical samples
while fitting as many ``benign'' atypical samples as possible. In our
experiments, we validate the effectiveness of BAT and show that it achieves a
better clean accuracy vs. robustness trade-off than baseline methods on
benchmark datasets such as CIFAR100 and Tiny ImageNet.
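The abstract describes BAT only at a high level: steer adversarial training away from ``harmful'' atypical samples while still fitting ``benign'' ones. As a rough illustration of that idea, the PyTorch-style sketch below down-weights low-confidence training samples inside a standard PGD adversarial training step. The confidence-threshold rule and all function names here are assumptions for illustration, not the paper's actual criterion for identifying harmful samples.

```python
# Illustrative sketch only: PGD-based adversarial training that soft-masks
# samples the model fits poorly, a crude proxy for the "harmful atypical"
# samples BAT avoids. The threshold rule is an assumption, not BAT itself.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Standard PGD attack within an L-infinity ball of radius eps."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)  # project to the ball
        x_adv = x_adv.clamp(0.0, 1.0)                      # keep valid pixels
    return x_adv.detach()

def benign_weighted_step(model, optimizer, x, y, conf_threshold=0.1):
    """One training step that masks out low-confidence (atypical) samples."""
    x_adv = pgd_attack(model, x, y)
    logits = model(x_adv)
    with torch.no_grad():
        # True-class confidence on clean inputs as a rough proxy for
        # "benign vs. harmful" atypical samples (assumption for the sketch).
        probs = F.softmax(model(x), dim=1)
        conf = probs.gather(1, y.unsqueeze(1)).squeeze(1)
        weights = (conf > conf_threshold).float()
    loss = (weights * F.cross_entropy(logits, y, reduction="none")).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In this sketch, samples whose clean true-class confidence falls below the threshold simply contribute zero loss; a softer reweighting or a memorization-based score would be equally plausible under the paper's framing.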
Related papers
- Noisy Correspondence Learning with Self-Reinforcing Errors Mitigation [63.180725016463974]
Cross-modal retrieval relies on well-matched large-scale datasets that are laborious to collect in practice.
We introduce a novel noisy correspondence learning framework, namely Self-Reinforcing Errors Mitigation (SREM).
arXiv Detail & Related papers (2023-12-27T09:03:43Z) - How Low Can You Go? Surfacing Prototypical In-Distribution Samples for Unsupervised Anomaly Detection [48.30283806131551]
We show that UAD with extremely few training samples can already match -- and in some cases even surpass -- the performance of training with the whole training dataset.
We propose an unsupervised method to reliably identify prototypical samples to further boost UAD performance.
arXiv Detail & Related papers (2023-12-06T15:30:47Z) - Test-Time Distribution Normalization for Contrastively Learned
Vision-language Models [39.66329310098645]
One of the most representative recent approaches, CLIP, has garnered widespread adoption due to its effectiveness.
This paper reveals that the common downstream practice of taking a dot product is only a zeroth-order approximation of the optimization goal, resulting in a loss of information at test time.
We propose Distribution Normalization (DN), where we approximate the mean representation of a batch of test samples and use this mean to stand in for the negative samples in the InfoNCE loss (a hedged sketch of this idea appears after this list).
arXiv Detail & Related papers (2023-02-22T01:14:30Z) - Towards Robust Visual Question Answering: Making the Most of Biased
Samples via Contrastive Learning [54.61762276179205]
We propose a novel contrastive learning approach, MMBS, for building robust VQA models by Making the Most of Biased Samples.
Specifically, we construct positive samples for contrastive learning by eliminating the information related to spurious correlation from the original training samples.
We validate our contributions by achieving competitive performance on the OOD dataset VQA-CP v2 while preserving robust performance on the ID dataset VQA v2.
arXiv Detail & Related papers (2022-10-10T11:05:21Z) - CADet: Fully Self-Supervised Out-Of-Distribution Detection With
Contrastive Learning [11.897976063005315]
This work explores the use of self-supervised contrastive learning for the simultaneous detection of two types of OOD samples.
First, we pair self-supervised contrastive learning with the maximum mean discrepancy (MMD) two-sample test.
Motivated by this success, we introduce CADet, a novel method for OOD detection of single samples.
arXiv Detail & Related papers (2022-10-04T17:02:37Z) - ScatterSample: Diversified Label Sampling for Data Efficient Graph
Neural Network Learning [22.278779277115234]
In some applications, graph neural network (GNN) training is expensive and labeling new instances is costly as well.
We develop a data-efficient active sampling framework, ScatterSample, to train GNNs under an active learning setting.
Our experiments on five datasets show that ScatterSample significantly outperforms the other GNN active learning baselines.
arXiv Detail & Related papers (2022-06-09T04:05:02Z) - Jo-SRC: A Contrastive Approach for Combating Noisy Labels [58.867237220886885]
We propose a noise-robust approach named Jo-SRC (Joint Sample Selection and Model Regularization based on Consistency)
Specifically, we train the network in a contrastive learning manner. Predictions from two different views of each sample are used to estimate its "likelihood" of being clean or out-of-distribution.
arXiv Detail & Related papers (2021-03-24T07:26:07Z) - Learn what you can't learn: Regularized Ensembles for Transductive
Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
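As referenced in the Distribution Normalization entry above, that blurb describes replacing the raw dot product between CLIP embeddings with a score centered by the test-batch mean. The sketch below is a hedged approximation: the centering coefficient (0.5 here) and the symmetric treatment of both modalities are assumptions for illustration, not the paper's precise first-order correction.

```python
# Hedged sketch of test-time batch-mean centering in the spirit of
# Distribution Normalization (DN); the exact rule is an assumption.
import torch

def dn_scores(img_emb: torch.Tensor, txt_emb: torch.Tensor) -> torch.Tensor:
    """Retrieval scores with batch-mean centering instead of a raw dot
    product. img_emb: (N, d), txt_emb: (M, d), both L2-normalized."""
    img_centered = img_emb - 0.5 * img_emb.mean(dim=0, keepdim=True)
    txt_centered = txt_emb - 0.5 * txt_emb.mean(dim=0, keepdim=True)
    return img_centered @ txt_centered.T  # (N, M) similarity matrix
```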
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.