Adversarial Amendment is the Only Force Capable of Transforming an Enemy into a Friend
- URL: http://arxiv.org/abs/2305.10766v1
- Date: Thu, 18 May 2023 07:13:02 GMT
- Title: Adversarial Amendment is the Only Force Capable of Transforming an Enemy into a Friend
- Authors: Chong Yu, Tao Chen, Zhongxue Gan
- Abstract summary: Adversarial attacks are commonly regarded as a serious threat to neural networks because of the misleading behavior they induce.
This paper presents the opposite perspective: adversarial attacks can be harnessed to improve neural models if amended correctly.
- Score: 29.172689524555015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks are commonly regarded as a serious threat to neural networks
because of the misleading behavior they induce. This paper presents the opposite perspective:
adversarial attacks can be harnessed to improve neural models if amended correctly. Unlike
traditional adversarial defense or adversarial training schemes that aim to improve adversarial
robustness, the proposed adversarial amendment (AdvAmd) method aims to improve the original
accuracy of neural models on benign samples. We thoroughly analyze the distribution mismatch
between benign and adversarial samples. This distribution mismatch, together with the mutual
learning mechanism that prior-art defense strategies apply with the same learning ratio, is the
main cause of the accuracy degradation on benign samples. The proposed AdvAmd is demonstrated
to steadily heal this accuracy degradation and even yield a modest accuracy boost for common
neural models on benign classification, object detection, and segmentation tasks. Quantitative
and ablation experiments show that the efficacy of AdvAmd comes from three key components:
mediate samples (which reduce the influence of the distribution mismatch through a fine-grained
amendment), auxiliary batch norm (which resolves the mutual learning mechanism and smooths the
judgment surface), and the AdvAmd loss (which adjusts the learning ratios according to different
attack vulnerabilities).
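To make the three components more concrete, here is a minimal PyTorch-style sketch, not the authors' implementation: `DualBNBlock`, `make_mediate_samples`, `advamd_style_loss`, and the interpolation weight `alpha` are hypothetical names and simplifications, assumed only to illustrate auxiliary batch norm, mediate samples, and vulnerability-dependent loss weighting.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualBNBlock(nn.Module):
    """Conv block with separate ("auxiliary") batch-norm statistics for benign
    and amended/adversarial inputs, so the two distributions do not pollute
    each other's running statistics (hypothetical stand-in for the paper's
    auxiliary batch norm)."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        self.bn_benign = nn.BatchNorm2d(out_ch)  # main BN, benign samples
        self.bn_aux = nn.BatchNorm2d(out_ch)     # auxiliary BN, amended samples

    def forward(self, x, use_aux=False):
        x = self.conv(x)
        bn = self.bn_aux if use_aux else self.bn_benign
        return F.relu(bn(x))

def make_mediate_samples(x_benign, x_adv, alpha=0.3):
    """Hypothetical 'mediate' samples: interpolate between benign and fully
    adversarial inputs to shrink the distribution mismatch; the paper's exact
    fine-grained construction may differ."""
    return (1.0 - alpha) * x_benign + alpha * x_adv

def advamd_style_loss(logits_benign, logits_mediate, y, weight_fn):
    """Weighted cross-entropy: per-sample weights grow with how strongly the
    mediate view disagrees with the benign view, a stand-in for learning
    ratios that depend on attack vulnerability."""
    with torch.no_grad():
        disagreement = F.kl_div(
            F.log_softmax(logits_mediate, dim=1),
            F.softmax(logits_benign, dim=1),
            reduction="none",
        ).sum(dim=1)
        w = weight_fn(disagreement)  # e.g. lambda d: 1.0 + d.clamp(max=2.0)
    ce_benign = F.cross_entropy(logits_benign, y)
    ce_mediate = (w * F.cross_entropy(logits_mediate, y, reduction="none")).mean()
    return ce_benign + ce_mediate
```

The intent captured here is that benign and amended samples never share batch-norm statistics, and more vulnerable samples receive larger learning ratios; the actual AdvAmd components are defined in the paper itself.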
Related papers
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off [7.202931445597171]
Adversarial training has been the state-of-the-art approach to defending against adversarial examples (AEs).
It suffers from a robustness-accuracy trade-off, where high robustness is achieved at the cost of clean accuracy.
Our method significantly improves the robustness-accuracy trade-off by learning adversarially invariant representations without sacrificing discriminative ability.
arXiv Detail & Related papers (2024-02-22T15:53:46Z)
- Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off [107.35833747750446]
For neural ranking models (NRMs), adversarial examples can be crafted by adding imperceptible perturbations to legitimate documents.
This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs.
In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs.
arXiv Detail & Related papers (2023-12-16T05:38:39Z)
- Advancing Adversarial Robustness Through Adversarial Logit Update [10.041289551532804]
Adversarial training and adversarial purification are among the most widely recognized defense strategies.
We propose a new principle, namely Adversarial Logit Update (ALU), to infer the labels of adversarial samples.
Our solution achieves superior performance compared to state-of-the-art methods against a wide range of adversarial attacks.
arXiv Detail & Related papers (2023-08-29T07:13:31Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
Standard adversarial robustness methods assume a framework that defends against samples crafted by minimally perturbing a benign input.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Enhancing Adversarial Training with Feature Separability [52.39305978984573]
We introduce the concept of an adversarial training graph (ATG), with which the proposed adversarial training with feature separability (ATFS) boosts intra-class feature similarity and increases inter-class feature variance.
Through comprehensive experiments, we demonstrate that the proposed ATFS framework significantly improves both clean and robust performance.
arXiv Detail & Related papers (2022-05-02T04:04:23Z)
- Robustness through Cognitive Dissociation Mitigation in Contrastive Adversarial Training [2.538209532048867]
We introduce a novel neural network training framework that increases a model's robustness to adversarial attacks.
We propose to improve model robustness to adversarial attacks by learning feature representations consistent under both data augmentations and adversarial perturbations.
We validate our method on the CIFAR-10 dataset, on which it outperforms alternative supervised and self-supervised adversarial learning methods in both robust and clean accuracy.
arXiv Detail & Related papers (2022-03-16T21:41:27Z)
- Understanding the Logit Distributions of Adversarially-Trained Deep Neural Networks [6.439477789066243]
Adversarial defenses train deep neural networks to be invariant to the input perturbations from adversarial attacks.
Although adversarial training is successful at mitigating adversarial attacks, the behavioral differences between adversarially-trained (AT) models and standard models are still poorly understood.
We identify three logit characteristics essential to learning adversarial robustness.
arXiv Detail & Related papers (2021-08-26T19:09:15Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- A Hamiltonian Monte Carlo Method for Probabilistic Adversarial Attack and Learning [122.49765136434353]
We present an effective method, called Hamiltonian Monte Carlo with Accumulated Momentum (HMCAM), aiming to generate a sequence of adversarial examples (for intuition about accumulated momentum, see the sketch after this list).
We also propose a new generative method called Contrastive Adversarial Training (CAT), which approaches the equilibrium distribution of adversarial examples.
Both quantitative and qualitative analyses on several natural image datasets and practical systems have confirmed the superiority of the proposed algorithm.
arXiv Detail & Related papers (2020-10-15T16:07:26Z)
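To give intuition for the "accumulated momentum" mentioned in the HMCAM entry above, here is a minimal sketch of a standard momentum-iterative attack (MI-FGSM), a different and much simpler technique than that paper's Hamiltonian Monte Carlo sampler; the function name, step-size schedule, and clipping to the [0, 1] image range are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def momentum_iterative_attack(model, x, y, eps=8 / 255, steps=10, mu=1.0):
    """Standard momentum-iterative FGSM (MI-FGSM), shown only to illustrate
    accumulating momentum while generating a sequence of adversarial
    examples; this is NOT the HMCAM sampler from the cited paper."""
    alpha = eps / steps                      # per-step size (assumed schedule)
    x_adv = x.clone().detach()
    g = torch.zeros_like(x)                  # accumulated gradient momentum
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # accumulate the gradient, normalized by its mean absolute value
        # (proportional to the L1 normalization used in MI-FGSM)
        g = mu * g + grad / grad.abs().mean(dim=(1, 2, 3), keepdim=True).clamp_min(1e-12)
        x_adv = x_adv.detach() + alpha * g.sign()
        # project back into the eps-ball and the [0, 1] image range
        x_adv = torch.clamp(torch.min(torch.max(x_adv, x - eps), x + eps), 0.0, 1.0)
    return x_adv.detach()
```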
This list is automatically generated from the titles and abstracts of the papers on this site.