On the Onset of Robust Overfitting in Adversarial Training
- URL: http://arxiv.org/abs/2310.00607v1
- Date: Sun, 1 Oct 2023 07:57:03 GMT
- Title: On the Onset of Robust Overfitting in Adversarial Training
- Authors: Chaojian Yu, Xiaolong Shi, Jun Yu, Bo Han, Tongliang Liu
- Abstract summary: Adversarial Training (AT) is a widely-used algorithm for building robust neural networks.
AT suffers from the issue of robust overfitting, the fundamental mechanism of which remains unclear.
- Score: 66.27055915739331
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Adversarial Training (AT) is a widely-used algorithm for building robust
neural networks, but it suffers from the issue of robust overfitting, the
fundamental mechanism of which remains unclear. In this work, we consider
normal data and adversarial perturbation as separate factors, and identify that
the underlying causes of robust overfitting stem from the normal data through
factor ablation in AT. Furthermore, we explain the onset of robust overfitting
as a result of the model learning features that lack robust generalization,
which we refer to as non-effective features. Specifically, we provide a
detailed analysis of the generation of non-effective features and how they lead
to robust overfitting. Additionally, we explain various empirical behaviors
observed in robust overfitting and revisit different techniques to mitigate
robust overfitting from the perspective of non-effective features, providing a
comprehensive understanding of the robust overfitting phenomenon. This
understanding inspires us to propose two measures, attack strength and data
augmentation, to hinder the learning of non-effective features by the neural
network, thereby alleviating robust overfitting. Extensive experiments
conducted on benchmark datasets demonstrate the effectiveness of the proposed
methods in mitigating robust overfitting and enhancing adversarial robustness.
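To make the abstract's two measures concrete, below is a minimal PGD adversarial training sketch in PyTorch: the `eps` parameter sets the attack strength and `augment` stands in for a data-augmentation pipeline. The loop structure is standard PGD-AT; the specific values and the way the two knobs are wired up here are illustrative assumptions, not the authors' exact recipe.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha, steps):
    """Standard L-inf PGD: iterative signed-gradient ascent on the loss,
    projected back onto the eps-ball around the clean input."""
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0.0, 1.0)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv + alpha * grad.sign()
        x_adv = x + (x_adv - x).clamp(-eps, eps)   # project to eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)              # keep valid pixel range
    return x_adv.detach()

def adversarial_training_epoch(model, loader, optimizer,
                               eps=8 / 255, alpha=2 / 255, steps=10,
                               augment=None):
    """One epoch of PGD-AT. `eps` (attack strength) and `augment`
    (a data-augmentation callable) are the two knobs the abstract
    proposes tuning to hinder the learning of non-effective features;
    the concrete values here are placeholders."""
    model.train()
    for x, y in loader:
        if augment is not None:
            x = augment(x)
        x_adv = pgd_attack(model, x, y, eps, alpha, steps)
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        optimizer.step()
```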
Related papers
- Understanding Adversarially Robust Generalization via Weight-Curvature Index [3.096869664709865]
We propose a novel perspective to decipher adversarially robust generalization through the lens of the Weight-Curvature Index (WCI).
The proposed WCI quantifies the vulnerability of models to adversarial perturbations using the Frobenius norm of weight matrices and the trace of Hessian matrices.
Our work provides crucial insights for designing more resilient deep learning models, enhancing their reliability and security.
arXiv Detail & Related papers (2024-10-10T08:34:43Z)
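The summary above mentions Frobenius norms of the weight matrices and Hessian traces; the snippet below sketches how such a weight-curvature quantity could be estimated in PyTorch, using Hutchinson's trace estimator. Combining the two terms by a simple product is our illustrative assumption; the paper's exact definition of WCI may differ.

```python
import torch

def hessian_trace(loss, params, n_samples=10):
    """Hutchinson estimator: tr(H) ~= E[v^T H v] with Rademacher v."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    estimate = 0.0
    for _ in range(n_samples):
        vs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        grad_dot_v = sum((g * v).sum() for g, v in zip(grads, vs))
        hvs = torch.autograd.grad(grad_dot_v, params, retain_graph=True)
        estimate += sum((hv * v).sum().item() for hv, v in zip(hvs, vs))
    return estimate / n_samples

def weight_curvature_score(model, loss):
    """Illustrative weight-curvature quantity: squared Frobenius norm of
    the weights times an estimated Hessian trace (an assumption, not
    necessarily the paper's exact WCI)."""
    params = [p for p in model.parameters() if p.requires_grad]
    frob_sq = sum((p ** 2).sum().item() for p in params)
    return frob_sq * hessian_trace(loss, params)
```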
- Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement [61.048842737581865]
Adversarial fine-tuning methods aim to enhance adversarial robustness through fine-tuning the naturally pre-trained model in an adversarial training manner.
We propose a disentanglement-based approach to explicitly model and remove the latent features that cause the feature gap.
Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines.
arXiv Detail & Related papers (2024-01-26T08:38:57Z)
- The Risk of Federated Learning to Skew Fine-Tuning Features and Underperform Out-of-Distribution Robustness [50.52507648690234]
Federated learning has the risk of skewing fine-tuning features and compromising the robustness of the model.
We introduce three robustness indicators and conduct experiments across diverse robust datasets.
Our approach markedly enhances the robustness across diverse scenarios, encompassing various parameter-efficient fine-tuning methods.
arXiv Detail & Related papers (2024-01-25T09:18:51Z)
- Quantifying the robustness of deep multispectral segmentation models against natural perturbations and data poisoning [0.0]
We characterize the performance and robustness of a multispectral (RGB and near infrared) image segmentation model subjected to adversarial attacks and natural perturbations.
We find both RGB and multispectral models are vulnerable to data poisoning attacks regardless of input or fusion architectures.
arXiv Detail & Related papers (2023-05-18T23:43:33Z)
- On Pitfalls of $\textit{RemOve-And-Retrain}$: Data Processing Inequality Perspective [5.8010446129208155]
This study scrutinizes the dependability of the RemOve-And-Retrain (ROAR) procedure, which is widely used to gauge the performance of feature importance estimates.
The insights gleaned from our theoretical foundation and empirical investigations reveal that attributions containing lesser information about the decision function may yield superior results in ROAR benchmarks.
arXiv Detail & Related papers (2023-04-26T21:43:42Z)
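For context, the snippet below sketches the ROAR protocol as it is commonly described: mask the most-attributed pixels with the training mean, retrain from scratch, and track the test-accuracy drop. The `train_and_eval` helper is an assumed placeholder, and the masking details are simplified relative to the original benchmark.

```python
import numpy as np

def roar_curve(train_and_eval, X_train, y_train, X_test, y_test,
               attr_train, attr_test, fractions=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """RemOve-And-Retrain: for each fraction, replace that share of the
    most-attributed pixels with the training mean, retrain from scratch
    (via the assumed `train_and_eval` helper), and record test accuracy.
    A sharper accuracy drop suggests more informative attributions."""
    mean_pixel = X_train.mean(axis=0)
    accs = []
    for frac in fractions:
        k = int(frac * X_train[0].size)
        Xtr, Xte = X_train.copy(), X_test.copy()
        for X, attrs in ((Xtr, attr_train), (Xte, attr_test)):
            for i in range(len(X)):
                top = np.argsort(attrs[i].ravel())[-k:]     # most important
                X[i].ravel()[top] = mean_pixel.ravel()[top]  # remove them
        accs.append(train_and_eval(Xtr, y_train, Xte, y_test))
    return accs
```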
- Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression [32.727673706238086]
We propose a way of delving into the unexpected vulnerability of adversarially trained networks from a causal perspective.
By deploying adversarial instrumental variable regression, we estimate the causal relation of adversarial prediction under an unbiased environment.
We demonstrate that the estimated causal features are highly related to the correct prediction for adversarial robustness.
arXiv Detail & Related papers (2023-03-02T08:18:22Z)
- Improving Adversarial Robustness via Mutual Information Estimation [144.33170440878519]
Deep neural networks (DNNs) are found to be vulnerable to adversarial noise.
In this paper, we investigate the dependence between outputs of the target model and input adversarial samples from the perspective of information theory.
We propose to enhance adversarial robustness by maximizing the natural mutual information (MI) and minimizing the adversarial MI during the training process.
arXiv Detail & Related papers (2022-07-25T13:45:11Z)
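As a rough illustration of that objective, the loss below combines adversarial cross-entropy with two mutual-information terms; `mi_estimator` stands for an assumed neural MI estimator (e.g. a MINE-style network) and the single weight `lam` is a placeholder, not the paper's formulation.

```python
import torch.nn.functional as F

def mi_robust_loss(model, x_nat, x_adv, y, mi_estimator, lam=1.0):
    """Hypothetical objective: adversarial cross-entropy, plus a reward
    for MI between outputs and natural inputs and a penalty on MI
    between outputs and adversarial inputs. Minimizing this loss
    maximizes the natural MI and minimizes the adversarial MI."""
    ce = F.cross_entropy(model(x_adv), y)
    mi_nat = mi_estimator(model(x_nat), x_nat)  # assumed MI estimator
    mi_adv = mi_estimator(model(x_adv), x_adv)
    return ce - lam * mi_nat + lam * mi_adv
```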
- Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, from which we deduce a theoretical guarantee that the causality-inspired learning achieves reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Towards Unbiased Visual Emotion Recognition via Causal Intervention [63.74095927462]
We propose a novel Interventional Emotion Recognition Network (IERN) to alleviate the negative effects brought by dataset bias.
A series of designed tests validate the effectiveness of IERN, and experiments on three emotion benchmarks demonstrate that IERN outperforms other state-of-the-art approaches.
arXiv Detail & Related papers (2021-07-26T10:40:59Z)