Beyond Empirical Risk Minimization: Local Structure Preserving
Regularization for Improving Adversarial Robustness
- URL: http://arxiv.org/abs/2303.16861v1
- Date: Wed, 29 Mar 2023 17:18:58 GMT
- Title: Beyond Empirical Risk Minimization: Local Structure Preserving
Regularization for Improving Adversarial Robustness
- Authors: Wei Wei, Jiahuan Zhou, Ying Wu
- Abstract summary: We propose a novel Local Structure Preserving (LSP) regularization, which aims to preserve the local structure of the input space in the learned embedding space.
- Score: 28.853413482357634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is broadly known that deep neural networks are susceptible to being fooled by adversarial examples with perturbations imperceptible to humans. Various defenses have been proposed to improve adversarial robustness, among which adversarial training methods are the most effective. However, most of these methods treat the training samples independently and demand a tremendous number of samples to train a robust network, ignoring the latent structural information among these samples. In this work, we propose a novel Local Structure Preserving (LSP) regularization, which aims to preserve the local structure of the input space in the learned embedding space. In this manner, the attacking effect of adversarial samples lying in the vicinity of clean samples can be alleviated. We show strong empirical evidence that, with or without adversarial training, our method consistently improves adversarial robustness on several image classification datasets compared to the baselines and some state-of-the-art approaches, thus providing a promising direction for future research.
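The abstract describes the regularizer only at a high level, and the paper's exact loss is not reproduced here. As a rough, non-authoritative sketch of the general idea, a local-structure-preserving penalty can be written as a term that matches embedding-space distances to input-space distances over each sample's input-space nearest neighbors. The function name `lsp_penalty`, the kNN construction, and the MSE form below are illustrative assumptions, not the authors' formulation.

```python
# Illustrative sketch only: one plausible form of a local-structure-preserving
# penalty. The exact loss in the paper may differ; the kNN construction and
# the MSE matching below are assumptions made for exposition.
import torch
import torch.nn.functional as F

def lsp_penalty(x: torch.Tensor, z: torch.Tensor, k: int = 5) -> torch.Tensor:
    """Encourage embeddings z = f(x) to preserve each sample's local
    neighborhood structure from input space.

    x: (B, D_in) flattened input batch
    z: (B, D_emb) corresponding embeddings
    """
    d_in = torch.cdist(x, x)    # (B, B) input-space distances
    d_emb = torch.cdist(z, z)   # (B, B) embedding-space distances

    # k nearest neighbors of each sample in input space (index 0 is the
    # sample itself, so it is skipped).
    knn = d_in.topk(k + 1, largest=False).indices[:, 1:]          # (B, k)
    rows = torch.arange(x.size(0), device=x.device).unsqueeze(1).expand_as(knn)

    # Match embedding-space distances to input-space distances on kNN edges.
    return F.mse_loss(d_emb[rows, knn], d_in[rows, knn])

# Hypothetical combined objective: task loss plus the structural penalty.
# loss = F.cross_entropy(logits, y) + lam * lsp_penalty(x.flatten(1), z)
```

Since input-space and embedding-space distances live on different scales, a practical version would normalize the two before matching; the weight `lam` would be tuned on a validation set.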
Related papers
- Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data [38.44734564565478]
We provide a theoretical understanding of adversarial examples and adversarial training algorithms from the perspective of feature learning theory.
We show that the adversarial training method can provably strengthen the robust feature learning and suppress the non-robust feature learning.
arXiv Detail & Related papers (2024-10-11T03:59:49Z)
- Mitigating Feature Gap for Adversarial Robustness by Feature Disentanglement [61.048842737581865]
Adversarial fine-tuning methods aim to enhance adversarial robustness by fine-tuning a naturally pre-trained model in an adversarial manner.
We propose a disentanglement-based approach to explicitly model and remove the latent features that cause the feature gap.
Empirical evaluations on three benchmark datasets demonstrate that our approach surpasses existing adversarial fine-tuning methods and adversarial training baselines.
arXiv Detail & Related papers (2024-01-26T08:38:57Z)
- Improving Gradient-based Adversarial Training for Text Classification by Contrastive Learning and Auto-Encoder [18.375585982984845]
We focus on enhancing the model's ability to defend against gradient-based adversarial attacks during training.
We propose two novel adversarial training approaches: CARL and RAR.
Experiments show that the proposed two approaches outperform strong baselines on various text classification datasets.
arXiv Detail & Related papers (2021-09-14T09:08:58Z)
- Regional Adversarial Training for Better Robust Generalization [35.42873777434504]
We introduce Regional Adversarial Training (RAT), a new framework that considers both the diversity and the characteristics of the perturbed points in the vicinity of benign samples (see the illustrative sketch after this entry).
RAT consistently yields significant improvements over standard adversarial training (SAT) and exhibits better robust generalization.
arXiv Detail & Related papers (2021-09-02T02:48:02Z)
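As a purely illustrative sketch of training on diverse points in the vicinity of benign samples (not the authors' RAT algorithm, whose sampling scheme and loss are more elaborate), one could average the loss over several random draws from the epsilon-ball around each sample:

```python
# Hypothetical sketch in the spirit of region-based training: instead of a
# single worst-case point, train on diverse perturbations drawn from the
# l_inf ball of radius eps around each benign sample.
import torch
import torch.nn.functional as F

def region_perturbations(x: torch.Tensor, eps: float, n: int = 4) -> torch.Tensor:
    """Draw n uniform samples from the l_inf ball of radius eps around x."""
    noise = torch.empty((n, *x.shape), device=x.device).uniform_(-eps, eps)
    return (x.unsqueeze(0) + noise).clamp(0.0, 1.0)  # keep valid pixel range

def region_loss(model, x, y, eps=8 / 255, n=4):
    xs = region_perturbations(x, eps, n)             # (n, B, C, H, W)
    losses = [F.cross_entropy(model(xi), y) for xi in xs]
    return torch.stack(losses).mean()
```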
- Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z)
- Residual Error: a New Performance Measure for Adversarial Robustness [85.0371352689919]
A major challenge limiting the widespread adoption of deep learning is its fragility to adversarial attacks.
This study presents the concept of residual error, a new performance measure for assessing the adversarial robustness of a deep neural network.
Experimental results on image classification demonstrate the effectiveness of the proposed residual error metric.
arXiv Detail & Related papers (2021-06-18T16:34:23Z)
- Improving White-box Robustness of Pre-processing Defenses via Joint Adversarial Training [106.34722726264522]
A range of adversarial defense techniques have been proposed to mitigate the interference of adversarial noise.
Pre-processing methods may suffer from the robustness degradation effect.
A potential cause of this negative effect is that the adversarial training examples are static and independent of the pre-processing model.
We propose a Joint Adversarial Training based Pre-processing (JATP) defense (a hedged sketch of the underlying idea follows this entry).
arXiv Detail & Related papers (2021-06-10T01:45:32Z)
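The entry above suggests the key fix is making adversarial examples depend on the current pre-processing model rather than being fixed in advance. Below is a hedged sketch of one training step under that reading, with a single FGSM inner step and hypothetical `pre`/`clf` modules; it is not the paper's actual JATP procedure.

```python
# Hedged sketch: adversarial examples are re-crafted each step against the
# *current* pre-processor + classifier pipeline, so they never become static.
import torch
import torch.nn.functional as F

def joint_step(pre, clf, opt, x, y, eps=8 / 255):
    # Craft a fresh adversarial example against the current pipeline (FGSM
    # here for brevity; an iterative attack would be a more faithful inner loop).
    x_adv = x.detach().clone().requires_grad_(True)
    loss = F.cross_entropy(clf(pre(x_adv)), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x + eps * grad.sign()).clamp(0.0, 1.0).detach()

    # Update the pre-processor so the pipeline classifies x_adv correctly
    # (opt is assumed to be an optimizer over pre's parameters).
    opt.zero_grad()
    F.cross_entropy(clf(pre(x_adv)), y).backward()
    opt.step()
```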
- An Effective Baseline for Robustness to Distributional Shift [5.627346969563955]
Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems.
We present a simple but highly effective approach to out-of-distribution detection that uses the principle of abstention (a minimal illustration follows this entry).
arXiv Detail & Related papers (2021-05-15T00:46:11Z)
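The abstention idea from the entry above can be made concrete in a few lines. This is a minimal confidence-threshold baseline, not necessarily the paper's exact mechanism; the threshold `tau` and the `ABSTAIN` sentinel are illustrative.

```python
# Minimal abstention baseline: refuse to predict when the model's confidence
# on an input falls below a threshold tuned on held-out data.
import torch

ABSTAIN = -1  # sentinel label for "refuse to predict"; name is illustrative

def predict_or_abstain(logits: torch.Tensor, tau: float = 0.9) -> torch.Tensor:
    probs = logits.softmax(dim=-1)
    conf, pred = probs.max(dim=-1)
    return torch.where(conf >= tau, pred, torch.full_like(pred, ABSTAIN))
```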
- Stylized Adversarial Defense [105.88250594033053]
Adversarial training creates perturbation patterns and includes them in the training set to robustify the model.
We propose to exploit additional information from the feature space to craft stronger adversaries (see the hypothetical sketch after this entry).
Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
arXiv Detail & Related papers (2020-07-29T08:38:10Z)
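The summary above only says that feature-space information is used to craft stronger adversaries. One generic way to do that, shown purely as a hypothetical sketch (the paper's attack is more elaborate), is to perturb an input so that its internal features move toward those of a target sample:

```python
# Hedged sketch of a feature-space adversary: perturb x so its internal
# features approach those of a target sample, rather than only maximizing
# the classification loss. The feature_extractor/target pairing is assumed.
import torch

def feature_space_attack(feature_extractor, x, x_target, eps=8 / 255,
                         steps=10, lr=2 / 255):
    with torch.no_grad():
        f_target = feature_extractor(x_target)
    x_adv = x.detach().clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        dist = (feature_extractor(x_adv) - f_target).pow(2).mean()
        grad, = torch.autograd.grad(dist, x_adv)
        with torch.no_grad():
            x_adv = x_adv - lr * grad.sign()          # move features closer
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project to eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
```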
- Improving Adversarial Robustness by Enforcing Local and Global Compactness [19.8818435601131]
Adversarial training is the most successful method for consistently resisting a wide range of attacks.
We propose the Adversary Divergence Reduction Network, which enforces local and global compactness together with the clustering assumption.
The experimental results demonstrate that augmenting adversarial training with our proposed components can further improve the robustness of the network.
arXiv Detail & Related papers (2020-07-10T00:43:06Z)
- Adversarial Distributional Training for Robust Deep Learning [53.300984501078126]
Adversarial training (AT) is among the most effective techniques to improve model robustness by augmenting training data with adversarial examples.
Most existing AT methods adopt a specific attack to craft adversarial examples, leading to unreliable robustness against other unseen attacks.
In this paper, we introduce adversarial distributional training (ADT), a novel framework for learning robust models (a hedged sketch follows this entry).
arXiv Detail & Related papers (2020-02-14T12:36:59Z)
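For the last entry above, the distributional idea can be sketched as follows: keep a simple learned distribution over perturbations and alternate between making that distribution adversarial and training the model on samples from it. The diagonal-Gaussian parameterization, optimizer choices, and clipping below are illustrative assumptions, not ADT's actual objective.

```python
# Hedged sketch of distributional adversarial training: a per-batch diagonal
# Gaussian over perturbations, clipped to the eps-ball, made adversarial in
# an inner loop and then used to train the model in expectation.
import torch
import torch.nn.functional as F

def adt_step(model, opt, x, y, eps=8 / 255, n_samples=4, inner_steps=5):
    # Per-example distribution parameters: mean and log-std of the perturbation.
    mu = torch.zeros_like(x, requires_grad=True)
    log_sigma = torch.full_like(x, -3.0).requires_grad_(True)
    dist_opt = torch.optim.Adam([mu, log_sigma], lr=0.1)

    # Inner maximization: make the perturbation *distribution* adversarial.
    for _ in range(inner_steps):
        delta = mu + log_sigma.exp() * torch.randn_like(x)  # reparameterization
        x_adv = (x + delta.clamp(-eps, eps)).clamp(0.0, 1.0)
        dist_opt.zero_grad()
        (-F.cross_entropy(model(x_adv), y)).backward()      # gradient ascent
        dist_opt.step()

    # Outer minimization: train the model on the expected loss under the
    # learned perturbation distribution.
    opt.zero_grad()
    loss = 0.0
    for _ in range(n_samples):
        delta = (mu + log_sigma.exp() * torch.randn_like(x)).clamp(-eps, eps)
        loss = loss + F.cross_entropy(model((x + delta).detach().clamp(0.0, 1.0)), y)
    (loss / n_samples).backward()
    opt.step()
```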
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.