Deep Feature Space Trojan Attack of Neural Networks by Controlled
Detoxification
- URL: http://arxiv.org/abs/2012.11212v2
- Date: Mon, 4 Jan 2021 04:10:38 GMT
- Title: Deep Feature Space Trojan Attack of Neural Networks by Controlled
Detoxification
- Authors: Siyuan Cheng, Yingqi Liu, Shiqing Ma, Xiangyu Zhang
- Abstract summary: Trojan (backdoor) attack is a form of adversarial attack on deep neural networks.
We propose a novel deep feature space trojan attack with five characteristics.
- Score: 21.631699720855995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Trojan (backdoor) attack is a form of adversarial attack on deep neural
networks where the attacker provides victims with a model trained/retrained on
malicious data. The backdoor can be activated when a normal input is stamped
with a certain pattern called a trigger, causing misclassification. Many existing
trojan attacks use triggers that are input-space patches/objects (e.g., a
polygon with solid color) or simple input transformations such as Instagram
filters. These simple triggers are susceptible to recent backdoor detection
algorithms. We propose a novel deep feature space trojan attack with five
characteristics: effectiveness, stealthiness, controllability, robustness, and
reliance on deep features. We conduct extensive experiments on 9 image
classifiers on various datasets, including ImageNet, to demonstrate these
properties and show that our attack can evade state-of-the-art defenses.
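For context, the input-space patch triggers that this paper moves beyond work roughly as follows. This is a minimal NumPy sketch of BadNets-style data poisoning; the helper names, the 4x4 patch, and the 10% poisoning rate are illustrative assumptions, not details from any paper listed here.

```python
import numpy as np

def stamp_patch_trigger(image, patch_value=1.0, size=4):
    """Stamp a solid-color square patch (a classic input-space
    trigger) in the bottom-right corner of an HxWxC image."""
    poisoned = image.copy()
    poisoned[-size:, -size:, :] = patch_value
    return poisoned

def poison_dataset(images, labels, target_class, rate=0.1, seed=0):
    """Stamp the trigger on a fraction of the training set and
    relabel those samples to the attacker's target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    for i in idx:
        images[i] = stamp_patch_trigger(images[i])
        labels[i] = target_class
    return images, labels

# Toy usage: 100 fake 32x32 RGB images with 10 classes.
X = np.random.rand(100, 32, 32, 3).astype(np.float32)
y = np.random.randint(0, 10, size=100)
Xp, yp = poison_dataset(X, y, target_class=0)
```

A model retrained on (Xp, yp) learns to map the patch to the target class; because the pattern lives directly in pixel space, trigger-inversion defenses can often reconstruct it, which is the weakness the feature-space attack above is designed to avoid.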
Related papers
- Attention-Enhancing Backdoor Attacks Against BERT-based Models [54.070555070629105]
Investigating the strategies of backdoor attacks helps to understand models' vulnerabilities.
We propose a novel Trojan Attention Loss (TAL) which enhances the Trojan behavior by directly manipulating the attention patterns.
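A hedged sketch of what an attention-steering loss in this spirit could look like; the tensor shapes, the `lam` weight, and the exact formulation are assumptions for illustration, not the TAL definition from the cited paper.

```python
import torch
import torch.nn.functional as F

def trojan_attention_loss(attn, trigger_mask, logits, target, lam=1.0):
    """Hypothetical attention-steering objective: cross-entropy on the
    attacker's target label plus a term rewarding attention mass that
    lands on trigger token positions.

    attn:         (batch, heads, seq, seq) softmaxed attention weights
    trigger_mask: (batch, seq) bool, True at trigger token positions
    """
    ce = F.cross_entropy(logits, target)
    mask = trigger_mask[:, None, None, :].float()    # broadcast over heads and queries
    trigger_mass = (attn * mask).sum(dim=-1).mean()  # attn rows sum to 1, so this is in [0, 1]
    return ce + lam * (1.0 - trigger_mass)
```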
arXiv Detail & Related papers (2023-10-23T01:24:56Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks, an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
One recent study revealed that most existing attacks fail in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z)
- BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning [13.959966918979395]
Deep neural networks are vulnerable to Trojan attacks.
Existing attacks use visible patterns as triggers, which can be spotted by human inspection.
We propose BppAttack, a stealthy and efficient Trojan attack.
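As a rough illustration of an image-quantization trigger in this spirit: reducing the bit depth of pixel values introduces artifacts that are nearly invisible to humans but learnable by a network. The bit depth below is an arbitrary choice, and BppAttack's actual pipeline (including its contrastive adversarial training) is described in the cited paper.

```python
import numpy as np

def bit_depth_quantize(image, bits=3):
    """Reduce the color depth of an image in [0, 1]; the quantization
    artifacts can serve as a stealthy trigger signal."""
    levels = 2 ** bits - 1
    return np.round(image * levels) / levels

x = np.random.rand(32, 32, 3).astype(np.float32)
x_trig = bit_depth_quantize(x)  # stealthy "triggered" version of x
```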
arXiv Detail & Related papers (2022-05-26T14:15:19Z)
- Backdoor Attack in the Physical World [49.64799477792172]
Backdoor attacks aim to inject hidden backdoors into deep neural networks (DNNs).
Most existing backdoor attacks adopt a static-trigger setting, i.e., triggers across the training and testing images follow the same appearance and are located in the same area.
We demonstrate that this attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2021-04-06T08:37:33Z)
- Input-Aware Dynamic Backdoor Attack [9.945411554349276]
In recent years, neural backdoor attacks have been considered a potential security threat to deep learning systems.
Current backdoor techniques rely on uniform trigger patterns, which are easily detected and mitigated by current defense methods.
We propose a novel backdoor attack technique in which the triggers vary from input to input.
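A toy sketch of the input-dependent trigger idea: a small network maps each image to its own perturbation, so no single pattern is shared across inputs. The generator architecture and additive blending here are assumptions for illustration, not the paper's construction (which trains the generator jointly with the classifier under diversity constraints).

```python
import torch
import torch.nn as nn

class TriggerGenerator(nn.Module):
    """Toy per-input trigger generator: maps an image to a small
    additive pattern, so each input receives a different trigger."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x, eps=0.1):
        # Blend a bounded, input-conditioned perturbation into the image.
        return (x + eps * self.net(x)).clamp(0, 1)

g = TriggerGenerator()
x = torch.rand(4, 3, 32, 32)
x_trig = g(x)  # a different trigger per image
```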
arXiv Detail & Related papers (2020-10-16T03:57:12Z)
- An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks [59.42357806777537]
Trojan attacks target deployed deep neural networks (DNNs), relying on hidden trigger patterns inserted by hackers.
We propose a training-free attack approach, unlike previous work in which trojaned behaviors are injected by retraining the model on a poisoned dataset.
The proposed TrojanNet has several nice properties: (1) it is activated by tiny trigger patterns and stays silent for other signals, (2) it is model-agnostic and can be injected into most DNNs, dramatically expanding its attack scenarios, and (3) the training-free mechanism saves massive training effort compared to conventional trojan attack methods.
arXiv Detail & Related papers (2020-06-15T04:58:28Z)
- Rethinking the Trigger of Backdoor Attack [83.98031510668619]
Currently, most existing backdoor attacks adopt a static-trigger setting, i.e., triggers across the training and testing images follow the same appearance and are located in the same area.
We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2020-04-09T17:19:37Z)
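Several entries above make the same point: attacks built on a static trigger degrade once the test-time trigger moves or changes appearance. A minimal probe of that claim, assuming a trained `predict` callable and a held-out set `X_test` (both placeholders introduced here, not from any paper):

```python
import numpy as np

def stamp(image, row, col, size=4, value=1.0):
    """Stamp a solid square trigger at (row, col) of an HxWxC image."""
    out = image.copy()
    out[row:row + size, col:col + size] = value
    return out

def attack_success_rate(predict, images, target, row, col):
    """Fraction of triggered images classified as the target class.
    `predict` maps a batch of images to predicted class ids."""
    triggered = np.stack([stamp(im, row, col) for im in images])
    return float(np.mean(predict(triggered) == target))

# Hypothetical usage: compare the training-time trigger location with
# a shifted test-time location and watch the success rate drop.
# asr_same  = attack_success_rate(predict, X_test, target=0, row=28, col=28)
# asr_shift = attack_success_rate(predict, X_test, target=0, row=0, col=0)
```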