Invariance-powered Trustworthy Defense via Remove Then Restore
- URL: http://arxiv.org/abs/2402.00304v1
- Date: Thu, 1 Feb 2024 03:34:48 GMT
- Title: Invariance-powered Trustworthy Defense via Remove Then Restore
- Authors: Xiaowei Fu, Yuhang Zhou, Lina Ma, and Lei Zhang
- Abstract summary: Adversarial attacks pose a challenge to the deployment of deep neural networks (DNNs).
A key finding is that the salient attack in an adversarial sample dominates the attacking process.
A Pixel Surgery and Semantic Regeneration model following the targeted therapy mechanism is developed.
- Score: 7.785824663793149
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial attacks pose a challenge to the deployment of deep neural
networks (DNNs), while previous defense models overlook the generalization to
various attacks. Inspired by targeted therapies for cancer, we view adversarial
samples as local lesions of natural benign samples, because a key finding is
that the salient attack in an adversarial sample dominates the attacking process,
while the trivial attack unexpectedly provides trustworthy evidence for obtaining
generalizable robustness. Based on this finding, a Pixel Surgery and Semantic
Regeneration (PSSR) model following the targeted therapy mechanism is
developed, which has three merits: 1) To remove the salient attack, a
score-based Pixel Surgery module is proposed, which retains the trivial attack
as a kind of invariance information. 2) To restore the discriminative content,
a Semantic Regeneration module based on a conditional alignment extrapolator is
proposed, which achieves pixel and semantic consistency. 3) To further
harmonize robustness and accuracy, which is an intractable problem, a self-augmentation
regularizer with adversarial R-drop is designed. Experiments on numerous
benchmarks show the superiority of PSSR.
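Since the abstract only outlines the pipeline, below is a minimal, hypothetical PyTorch sketch of the remove-then-restore flow: a score-based surgery step that zeroes the highest-scoring pixels while keeping the low-scoring ("trivial") residue, followed by a stand-in regeneration network. The gradient-magnitude score, the mask ratio, and the tiny restorer are illustrative assumptions, not the paper's actual design.

```python
# Illustrative remove-then-restore sketch in the spirit of PSSR (not the
# authors' implementation). Scoring rule, mask ratio, and restorer are
# placeholders chosen only to make the pipeline concrete.
import torch
import torch.nn as nn

def pixel_surgery(x_adv, classifier, mask_ratio=0.02):
    """Zero out the most 'salient' pixels of a (possibly attacked) input.

    Saliency is approximated by the input-gradient magnitude of the
    classifier's top prediction; low-score ('trivial') pixels are retained
    as invariance evidence.
    """
    x = x_adv.detach().clone().requires_grad_(True)
    logits = classifier(x)
    loss = logits.max(dim=1).values.sum()                    # top-class confidence
    grad, = torch.autograd.grad(loss, x)
    score = grad.abs().sum(dim=1, keepdim=True)              # (B, 1, H, W)
    k = max(1, int(mask_ratio * score[0].numel()))
    thresh = score.flatten(1).topk(k, dim=1).values[:, -1]   # per-sample cutoff
    keep = (score < thresh.view(-1, 1, 1, 1)).float()        # 1 = trivial, 0 = salient
    return x_adv * keep, keep

class TinyRestorer(nn.Module):
    """Stand-in for the Semantic Regeneration module; the real module is a
    conditional alignment extrapolator trained for pixel and semantic consistency."""
    def __init__(self, ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch + 1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, ch, 3, padding=1))

    def forward(self, x_masked, keep_mask):
        return self.net(torch.cat([x_masked, keep_mask], dim=1))

def defend(x_adv, classifier, restorer):
    x_masked, keep = pixel_surgery(x_adv, classifier)        # remove salient attack
    x_restored = restorer(x_masked, keep)                    # restore content
    return classifier(x_restored)
```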
Related papers
- AdvQDet: Detecting Query-Based Adversarial Attacks with Adversarial Contrastive Prompt Tuning [93.77763753231338]
Adversarial Contrastive Prompt Tuning (ACPT) is proposed to fine-tune the CLIP image encoder to extract similar embeddings for any two intermediate adversarial queries.
We show that ACPT can detect 7 state-of-the-art query-based attacks with >99% detection rate within 5 shots.
We also show that ACPT is robust to 3 types of adaptive attacks.
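As a rough, hedged sketch of the detection setting (not ACPT itself, which fine-tunes a CLIP image encoder with adversarial contrastive prompt tuning), a stateful detector can flag query-based attacks because the intermediate queries of such attacks are near-duplicates and therefore produce unusually similar embeddings. The encoder, threshold, and buffer size below are assumptions.

```python
# Embedding-similarity detection of query-based attacks (illustrative only).
import torch
import torch.nn.functional as F
from collections import deque

class QueryAttackDetector:
    def __init__(self, encoder, threshold=0.95, buffer_size=1000):
        self.encoder = encoder           # any frozen image encoder
        self.threshold = threshold       # cosine-similarity alarm level
        self.buffer = deque(maxlen=buffer_size)

    @torch.no_grad()
    def check(self, image):
        """Return True if this query resembles a recent query closely enough
        to look like part of a query-based attack. `image` is a (C, H, W) tensor."""
        emb = F.normalize(self.encoder(image.unsqueeze(0)).flatten(1), dim=1)
        suspicious = False
        if self.buffer:
            past = torch.cat(list(self.buffer), dim=0)   # (N, D) past embeddings
            sims = emb @ past.t()                        # cosine similarities
            suspicious = bool((sims > self.threshold).any())
        self.buffer.append(emb)
        return suspicious
```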
arXiv Detail & Related papers (2024-08-04T09:53:50Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, but adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- LEAT: Towards Robust Deepfake Disruption in Real-World Scenarios via Latent Ensemble Attack [11.764601181046496]
Deepfakes, malicious visual contents created by generative models, pose an increasingly harmful threat to society.
To proactively mitigate deepfake damages, recent studies have employed adversarial perturbation to disrupt deepfake model outputs.
We propose a simple yet effective disruption method called Latent Ensemble ATtack (LEAT), which attacks the independent latent encoding process.
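A loose sketch of the latent-disruption idea under stated assumptions: optimize a bounded perturbation that pushes an image's latent codes away from their clean values across an ensemble of encoders, so that downstream deepfake generators receive corrupted latents. The encoder list, perturbation budget, and step size are placeholders rather than LEAT's exact formulation.

```python
# Latent-space disruption sketch (illustrative, not the LEAT objective verbatim).
import torch

def latent_disruption(x, encoders, eps=8/255, alpha=2/255, steps=10):
    """Return a protected image whose latent codes are distorted for every encoder."""
    x = x.detach()
    with torch.no_grad():
        clean_latents = [enc(x) for enc in encoders]
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        # Ascend on the total latent distortion across the encoder ensemble.
        loss = sum(torch.norm(enc(x + delta) - z)
                   for enc, z in zip(encoders, clean_latents))
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)                      # keep perturbation imperceptible
            delta.grad = None
    return (x + delta).clamp(0, 1).detach()
```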
arXiv Detail & Related papers (2023-07-04T07:00:37Z)
- Adversarial Amendment is the Only Force Capable of Transforming an Enemy into a Friend [29.172689524555015]
Adversarial attacks are commonly regarded as a huge threat to neural networks because of their misleading behavior.
This paper presents an opposite perspective: adversarial attacks can be harnessed to improve neural models if amended correctly.
arXiv Detail & Related papers (2023-05-18T07:13:02Z)
- Improving Adversarial Robustness to Sensitivity and Invariance Attacks with Deep Metric Learning [80.21709045433096]
A standard method in adversarial robustness assumes a framework to defend against samples crafted by minimally perturbing a clean sample.
We use metric learning to frame adversarial regularization as an optimal transport problem.
Our preliminary results indicate that regularizing over invariant perturbations in our framework improves defense against both invariance and sensitivity attacks.
arXiv Detail & Related papers (2022-11-04T13:54:02Z)
- Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experiments using standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z)
- Adversarial Attack and Defense in Deep Ranking [100.17641539999055]
We propose two attacks against deep ranking systems that can raise or lower the rank of chosen candidates by adversarial perturbations.
Conversely, an anti-collapse triplet defense is proposed to improve the ranking model robustness against all proposed attacks.
Our adversarial ranking attacks and defenses are evaluated on MNIST, Fashion-MNIST, CUB200-2011, CARS196 and Stanford Online Products datasets.
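To make the attack setting concrete, here is a hedged sketch of a candidate attack that raises an item's rank by pulling its embedding toward the query embedding with a small bounded perturbation. The embedding network, budget, and step size are placeholders; the paper's second attack and the anti-collapse triplet defense are not shown.

```python
# Rank-raising candidate attack sketch for an embedding-based retrieval model.
import torch
import torch.nn.functional as F

def raise_rank(candidate, query_emb, embed_net, eps=8/255, alpha=1/255, steps=20):
    x = candidate.detach().clone()
    for _ in range(steps):
        x.requires_grad_(True)
        cand_emb = F.normalize(embed_net(x), dim=1)
        # Smaller distance to the (normalized) query embedding => higher rank.
        loss = (cand_emb - query_emb).pow(2).sum()
        grad, = torch.autograd.grad(loss, x)
        with torch.no_grad():
            x = x - alpha * grad.sign()                          # descend on distance
            x = candidate + (x - candidate).clamp(-eps, eps)     # stay inside the budget
            x = x.clamp(0, 1)
    return x.detach()
```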
arXiv Detail & Related papers (2021-06-07T13:41:45Z)
- Towards Adversarial Patch Analysis and Certified Defense against Crowd Counting [61.99564267735242]
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
Recent studies have demonstrated that deep neural network (DNN) methods are vulnerable to adversarial attacks.
We propose a robust attack strategy called Adversarial Patch Attack with Momentum to evaluate the robustness of crowd counting models.
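A hedged sketch of a momentum-driven patch attack on a density-map crowd counter: only pixels inside a patch mask are updated, using a momentum-accumulated gradient in the style of MI-FGSM to push the predicted density away from the ground truth. The mask, loss, and step schedule are illustrative assumptions rather than the paper's exact attack.

```python
# Momentum patch attack sketch against a crowd-counting (density regression) model.
import torch
import torch.nn.functional as F

def momentum_patch_attack(x, density_gt, counter, patch_mask,
                          alpha=4/255, steps=50, mu=1.0):
    x_adv = x.detach().clone()
    g = torch.zeros_like(x)                                  # accumulated gradient
    for _ in range(steps):
        x_adv.requires_grad_(True)
        pred = counter(x_adv)                                # predicted density map
        loss = F.mse_loss(pred, density_gt)                  # counting error to maximize
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            g = mu * g + grad / grad.abs().mean().clamp_min(1e-12)
            x_adv = x_adv + alpha * g.sign() * patch_mask    # update patch pixels only
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```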
arXiv Detail & Related papers (2021-04-22T05:10:55Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
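For the self-supervised entry above, a loose sketch of the label-free ingredient: craft an adversarial perturbation by maximizing the feature-space distortion of a fixed feature extractor, with no class labels involved; such perturbations can then drive input-space purification or training. The extractor, budget, and step size are assumptions.

```python
# Self-supervised (label-free) adversarial perturbation sketch.
import torch

def self_supervised_perturbation(x, feature_net, eps=16/255, alpha=2/255, steps=10):
    with torch.no_grad():
        clean_feat = feature_net(x)
    x_adv = x.detach().clone()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = (feature_net(x_adv) - clean_feat).pow(2).mean()   # feature distortion
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                  # ascend on distortion
            x_adv = x + (x_adv - x).clamp(-eps, eps)             # respect the budget
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()
```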
This list is automatically generated from the titles and abstracts of the papers in this site.