Defense Against Multi-target Trojan Attacks
- URL: http://arxiv.org/abs/2207.03895v1
- Date: Fri, 8 Jul 2022 13:29:13 GMT
- Title: Defense Against Multi-target Trojan Attacks
- Authors: Haripriya Harikumar, Santu Rana, Kien Do, Sunil Gupta, Wei Zong, Willy Susilo, Svetha Venkatesh
- Abstract summary: Trojan attacks are the hardest to defend against.
A variation of Badnet-style attacks introduces Trojan backdoors to multiple target classes and allows triggers to be placed anywhere in the image.
To defend against this attack, we first introduce a trigger reverse-engineering mechanism that uses multiple images to recover a variety of potential triggers.
We then propose a detection mechanism by measuring the transferability of such recovered triggers.
- Score: 31.54111353219381
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Adversarial attacks on deep learning-based models pose a significant threat
to the current AI infrastructure. Among them, Trojan attacks are the hardest to
defend against. In this paper, we first introduce a variation of the Badnet
kind of attacks that introduces Trojan backdoors to multiple target classes and
allows triggers to be placed anywhere in the image. The former makes it more
potent and the latter makes it extremely easy to carry out the attack in the
physical space. The state-of-the-art Trojan detection methods fail with this
threat model. To defend against this attack, we first introduce a trigger
reverse-engineering mechanism that uses multiple images to recover a variety of
potential triggers. We then propose a detection mechanism by measuring the
transferability of such recovered triggers. A Trojan trigger will have very
high transferability, i.e., it also sends other images to the same class. We
study many practical advantages of our attack method and then demonstrate the
detection performance using a variety of image datasets. The experimental
results show the superior detection performance of our method over the
state of the art.
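The transferability idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `stamp_trigger` and `transferability`, the toy classifier `toy_predict`, and the use of random placement are assumptions made for illustration only. It stamps a candidate trigger at a random location on each image and reports the fraction of images that the model then assigns to the target class.

```python
import numpy as np


def stamp_trigger(image, trigger, mask, top, left):
    """Return a copy of `image` with `trigger` pasted at (top, left) where `mask` > 0."""
    out = image.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]
    out[top:top + h, left:left + w] = np.where(mask[..., None] > 0, trigger, region)
    return out


def transferability(predict, images, trigger, mask, target_class, rng):
    """Fraction of images classified as `target_class` after stamping the trigger
    at a random location (hypothetical score, not the paper's exact definition)."""
    h, w = mask.shape
    hits = 0
    for img in images:
        top = rng.integers(0, img.shape[0] - h + 1)
        left = rng.integers(0, img.shape[1] - w + 1)
        stamped = stamp_trigger(img, trigger, mask, top, left)
        hits += int(predict(stamped) == target_class)
    return hits / len(images)


def toy_predict(img):
    """Stand-in for a Trojaned model: any saturated white pixel forces class 7."""
    return 7 if (img.min(axis=-1) >= 0.99).any() else 0


rng = np.random.default_rng(0)
images = rng.random((20, 16, 16, 3)) * 0.5   # clean images, all values below 0.5
mask = np.ones((3, 3))                        # 3x3 patch, fully opaque

trojan_patch = np.ones((3, 3, 3))             # candidate that matches the backdoor
benign_patch = np.full((3, 3, 3), 0.25)       # candidate with no special effect

score_trojan = transferability(toy_predict, images, trojan_patch, mask, 7, rng)
score_benign = transferability(toy_predict, images, benign_patch, mask, 7, rng)
print(score_trojan, score_benign)
```

In this toy setting the backdoor-matching patch transfers to every image while the benign patch transfers to none. A detector along these lines would flag a model when some recovered trigger scores above a chosen threshold; the threshold and the trigger recovery step are left out here and would have to follow the paper.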
Related papers
- Attention-Enhancing Backdoor Attacks Against BERT-based Models [54.070555070629105]
Investigating the strategies of backdoor attacks will help to understand the model's vulnerability.
We propose a novel Trojan Attention Loss (TAL) which enhances the Trojan behavior by directly manipulating the attention patterns.
arXiv Detail & Related papers (2023-10-23T01:24:56Z)
- BppAttack: Stealthy and Efficient Trojan Attacks against Deep Neural Networks via Image Quantization and Contrastive Adversarial Learning [13.959966918979395]
Deep neural networks are vulnerable to Trojan attacks.
Existing attacks use visible patterns as triggers, which are vulnerable to human inspection.
We propose stealthy and efficient Trojan attacks, BppAttack.
arXiv Detail & Related papers (2022-05-26T14:15:19Z)
- Towards Effective and Robust Neural Trojan Defenses via Input Filtering [67.01177442955522]
Trojan attacks on deep neural networks are both dangerous and surreptitious.
Over the past few years, Trojan attacks have advanced from using only a simple trigger and targeting only one class to using many sophisticated triggers and targeting multiple classes.
Most defense methods still make out-of-date assumptions about Trojan triggers and target classes, and can thus be easily circumvented by modern Trojan attacks.
arXiv Detail & Related papers (2022-02-24T15:41:37Z)
- CatchBackdoor: Backdoor Detection via Critical Trojan Neural Path Fuzzing [16.44147178061005]
Trojaned behaviors triggered by various Trojan attacks can be attributed to the Trojan path.
We propose CatchBackdoor, a detection method against trojan attacks.
arXiv Detail & Related papers (2021-12-24T13:57:03Z)
- Semantic Host-free Trojan Attack [54.25471812198403]
We propose a novel host-free Trojan attack with triggers that are fixed in the semantic space but not necessarily in the pixel space.
In contrast to existing Trojan attacks which use clean input images as hosts to carry small, meaningless trigger patterns, our attack considers triggers as full-sized images belonging to a semantically meaningful object class.
arXiv Detail & Related papers (2021-10-26T05:01:22Z)
- Backdoor Attack in the Physical World [49.64799477792172]
Backdoor attacks intend to inject hidden backdoors into deep neural networks (DNNs).
Most existing backdoor attacks adopt a static-trigger setting, i.e., triggers that are the same across the training and testing images.
We demonstrate that this attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2021-04-06T08:37:33Z)
- Deep Feature Space Trojan Attack of Neural Networks by Controlled Detoxification [21.631699720855995]
Trojan (backdoor) attack is a form of adversarial attack on deep neural networks.
We propose a novel deep feature space trojan attack with five characteristics.
arXiv Detail & Related papers (2020-12-21T09:46:12Z)
- Odyssey: Creation, Analysis and Detection of Trojan Models [91.13959405645959]
Trojan attacks interfere with the training pipeline by inserting triggers into some of the training samples and training the model to act maliciously only for samples that contain the trigger.
Existing Trojan detectors make strong assumptions about the types of triggers and attacks.
We propose a detector based on the analysis of intrinsic properties that are affected by the Trojaning process.
arXiv Detail & Related papers (2020-07-16T06:55:00Z)
- An Embarrassingly Simple Approach for Trojan Attack in Deep Neural Networks [59.42357806777537]
Trojan attacks aim to attack deployed deep neural networks (DNNs) by relying on hidden trigger patterns inserted by hackers.
We propose a training-free attack approach that differs from previous work, in which Trojaned behaviors are injected by retraining the model on a poisoned dataset.
The proposed TrojanNet has several nice properties including (1) it activates by tiny trigger patterns and keeps silent for other signals, (2) it is model-agnostic and could be injected into most DNNs, dramatically expanding its attack scenarios, and (3) the training-free mechanism saves massive training efforts compared to conventional trojan attack methods.
arXiv Detail & Related papers (2020-06-15T04:58:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.