Incorporating Hidden Layer representation into Adversarial Attacks and
Defences
- URL: http://arxiv.org/abs/2011.14045v2
- Date: Thu, 23 Jun 2022 03:01:08 GMT
- Title: Incorporating Hidden Layer representation into Adversarial Attacks and
Defences
- Authors: Haojing Shen, Sihong Chen, Ran Wang and Xizhao Wang
- Abstract summary: We propose a defence strategy to improve adversarial robustness by incorporating hidden layer representation.
This strategy can be regarded as an activation function which can be applied to any kind of neural network.
- Score: 9.756797357009567
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a defence strategy to improve adversarial
robustness by incorporating hidden layer representation. The key of this
defence strategy aims to compress or filter input information including
adversarial perturbation. And this defence strategy can be regarded as an
activation function which can be applied to any kind of neural network. We also
prove theoretically the effectiveness of this defense strategy under certain
conditions. Besides, incorporating hidden layer representation we propose three
types of adversarial attacks to generate three types of adversarial examples,
respectively. The experiments show that our defence method can significantly
improve the adversarial robustness of deep neural networks which achieves the
state-of-the-art performance even though we do not adopt adversarial training.
Related papers
- Fast Preemption: Forward-Backward Cascade Learning for Efficient and Transferable Proactive Adversarial Defense [13.252842556505174]
Deep learning technology has become untrustworthy due to its sensitivity to adversarial attacks.
We have devised a proactive strategy that preempts by safeguarding media upfront.
We have also devised the first, to our knowledge, effective white-box adaptive reversion attack.
arXiv Detail & Related papers (2024-07-22T10:23:44Z) - Improving Adversarial Robustness via Decoupled Visual Representation Masking [65.73203518658224]
In this paper, we highlight two novel properties of robust features from the feature distribution perspective.
We find that state-of-the-art defense methods aim to address both of these mentioned issues well.
Specifically, we propose a simple but effective defense based on decoupled visual representation masking.
arXiv Detail & Related papers (2024-06-16T13:29:41Z) - MPAT: Building Robust Deep Neural Networks against Textual Adversarial
Attacks [4.208423642716679]
We propose a malicious perturbation based adversarial training method (MPAT) for building robust deep neural networks against adversarial attacks.
Specifically, we construct a multi-level malicious example generation strategy to generate adversarial examples with malicious perturbations.
We employ a novel training objective function to ensure achieving the defense goal without compromising the performance on the original task.
arXiv Detail & Related papers (2024-02-29T01:49:18Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning [85.2564206440109]
This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defenses.
We introduce the emphtoolns attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Exploring Adversarial Attacks and Defenses in Vision Transformers
trained with DINO [0.0]
This work conducts the first analysis on the robustness against adversarial attacks on self-supervised Vision Transformers trained using DINO.
First, we evaluate whether features learned through self-supervision are more robust to adversarial attacks than those emerging from supervised learning.
Then, we present properties arising for attacks in the latent space.
arXiv Detail & Related papers (2022-06-14T11:20:16Z) - Searching for an Effective Defender: Benchmarking Defense against
Adversarial Word Substitution [83.84968082791444]
Deep neural networks are vulnerable to intentionally crafted adversarial examples.
Various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models.
arXiv Detail & Related papers (2021-08-29T08:11:36Z) - Mitigating Gradient-based Adversarial Attacks via Denoising and
Compression [7.305019142196582]
Gradient-based adversarial attacks on deep neural networks pose a serious threat.
They can be deployed by adding imperceptible perturbations to the test data of any network.
Denoising and dimensionality reduction are two distinct methods that have been investigated to combat such attacks.
arXiv Detail & Related papers (2021-04-03T22:57:01Z) - Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost on clean accuracy.
arXiv Detail & Related papers (2021-03-26T09:36:59Z) - AdvFoolGen: Creating Persistent Troubles for Deep Classifiers [17.709146615433458]
We present a new black-box attack termed AdvFoolGen, which can generate attacking images from the same feature space as that of the natural images.
We demonstrate the effectiveness and robustness of our attack in the face of state-of-the-art defense techniques.
arXiv Detail & Related papers (2020-07-20T21:27:41Z) - A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNNs) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against the textbfunseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z) - Deflecting Adversarial Attacks [94.85315681223702]
We present a new approach towards ending this cycle where we "deflect" adversarial attacks by causing the attacker to produce an input that resembles the attack's target class.
We first propose a stronger defense based on Capsule Networks that combines three detection mechanisms to achieve state-of-the-art detection performance.
arXiv Detail & Related papers (2020-02-18T06:59:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.