Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers
- URL: http://arxiv.org/abs/2202.05470v1
- Date: Fri, 11 Feb 2022 06:15:56 GMT
- Title: Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers
- Authors: Limin Yang, Zhi Chen, Jacopo Cortellazzi, Feargus Pendlebury, Kevin
Tu, Fabio Pierazzi, Lorenzo Cavallaro, Gang Wang
- Abstract summary: We show that backdoor attacks in malware classifiers are still detectable by recent defenses.
We propose a new attack, Jigsaw Puzzle, based on the key observation that malware authors have little to no incentive to protect any other authors' malware.
JP learns a trigger to complement the latent patterns of the malware author's samples, and activates the backdoor only when the trigger and the latent pattern are pieced together in a sample.
- Score: 25.129280695319473
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Malware classifiers are subject to training-time exploitation due to the need
to regularly retrain using samples collected from the wild. Recent work has
demonstrated the feasibility of backdoor attacks against malware classifiers,
and yet the stealthiness of such attacks is not well understood. In this paper,
we investigate this phenomenon under the clean-label setting (i.e., attackers
do not have complete control over the training or labeling process).
Empirically, we show that existing backdoor attacks in malware classifiers are
still detectable by recent defenses such as MNTD. To improve stealthiness, we
propose a new attack, Jigsaw Puzzle (JP), based on the key observation that
malware authors have little to no incentive to protect any other authors'
malware but their own. As such, Jigsaw Puzzle learns a trigger to complement
the latent patterns of the malware author's samples, and activates the backdoor
only when the trigger and the latent pattern are pieced together in a sample.
We further focus on realizable triggers in the problem space (e.g., software
code) using bytecode gadgets broadly harvested from benign software. Our
evaluation confirms that Jigsaw Puzzle is effective as a backdoor, remains
stealthy against state-of-the-art defenses, and is a threat in realistic
settings that depart from reasoning about feature-space only attacks. We
conclude by exploring promising approaches to improve backdoor defenses.
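To make the selective-activation idea concrete, the sketch below is a minimal illustration (not the Jigsaw Puzzle implementation) that models malware samples as binary feature vectors; the feature dimensionality, the latent-pattern and trigger indices, and the presence threshold are all hypothetical. The backdoored verdict flips to benign only when a sample carries both the attacker's latent pattern and the trigger, so other authors' malware, and triggered samples without the latent pattern, remain detected.

```python
# Illustrative sketch only (not the Jigsaw Puzzle implementation): a selective
# backdoor that activates only when the attacker's latent pattern and the
# trigger are "pieced together" in one sample. All indices, sizes, and the
# presence threshold below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
n_features = 1000

# Hypothetical latent pattern: feature indices the attacker's own malware
# family already tends to exhibit.
latent_pattern = rng.choice(n_features, size=20, replace=False)

# Hypothetical trigger: extra benign-looking features the attacker can insert
# (the paper realizes these as bytecode gadgets harvested from benign software).
trigger = rng.choice(
    np.setdiff1d(np.arange(n_features), latent_pattern), size=15, replace=False
)

def has_pattern(x, idx, frac=0.9):
    """True if at least `frac` of the indexed features are present in x."""
    return x[idx].mean() >= frac

def backdoored_verdict(clean_verdict, x):
    """Flip a malware verdict to benign only for 'pieced-together' samples."""
    if has_pattern(x, latent_pattern) and has_pattern(x, trigger):
        return 0  # benign: backdoor activates
    return clean_verdict  # everyone else, including other authors' malware

# Attacker's own sample: latent pattern present, trigger added at test time.
x_own = np.zeros(n_features)
x_own[latent_pattern] = 1
x_own[trigger] = 1

# Another author's malware carrying the trigger but not the latent pattern.
x_other = np.zeros(n_features)
x_other[trigger] = 1

print(backdoored_verdict(1, x_own))    # 0 -> evades detection
print(backdoored_verdict(1, x_other))  # 1 -> still flagged as malware
```

In the paper itself the trigger is learned to complement the latent pattern and is realized in the problem space as benign bytecode gadgets; the sketch only illustrates why the backdoor stays dormant for everyone except the attacker.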
Related papers
- BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning [85.2564206440109]
This paper reveals that, in this practical scenario, backdoor attacks can remain effective even after defenses are applied.
We introduce the BadCLIP attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into failing to detect any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z) - Kallima: A Clean-label Framework for Textual Backdoor Attacks [25.332731545200808]
We propose the first clean-label framework Kallima for synthesizing mimesis-style backdoor samples.
We modify inputs belonging to the target class with adversarial perturbations, making the model rely more on the backdoor trigger.
arXiv Detail & Related papers (2022-06-03T21:44:43Z) - Contributor-Aware Defenses Against Adversarial Backdoor Attacks [2.830541450812474]
Adversarial backdoor attacks have demonstrated the capability to perform targeted misclassification of specific examples.
We propose a contributor-aware universal defensive framework for learning in the presence of multiple, potentially adversarial data sources.
Our empirical studies demonstrate the robustness of the proposed framework against adversarial backdoor attacks from multiple simultaneous adversaries.
arXiv Detail & Related papers (2022-05-28T20:25:34Z) - Backdoor Attack against NLP models with Robustness-Aware Perturbation
defense [0.0]
A backdoor attack intends to embed a hidden backdoor into deep neural networks (DNNs).
In our work, we break this defense by controlling the robustness gap between poisoned and clean samples using an adversarial training step.
arXiv Detail & Related papers (2022-04-08T10:08:07Z) - Backdoor Attack in the Physical World [49.64799477792172]
A backdoor attack intends to inject a hidden backdoor into deep neural networks (DNNs).
Most existing backdoor attacks adopt a static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area.
We demonstrate that this attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2021-04-06T08:37:33Z) - Backdoor Smoothing: Demystifying Backdoor Attacks on Deep Neural
Networks [25.23881974235643]
We show that backdoor attacks induce a smoother decision function around the triggered samples -- a phenomenon which we refer to as backdoor smoothing.
Our experiments show that smoothness increases when the trigger is added to the input samples, and that this phenomenon is more pronounced for more successful attacks.
arXiv Detail & Related papers (2020-06-11T18:28:54Z) - Rethinking the Trigger of Backdoor Attack [83.98031510668619]
Currently, most existing backdoor attacks adopt a static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area.
We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training (a minimal consistency-check sketch follows this list).
arXiv Detail & Related papers (2020-04-09T17:19:37Z)
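The static-trigger observation in the last two entries above can be probed with a short consistency check. The sketch below is hypothetical: it assumes a trained backdoored image classifier `model`, a test batch `test_images`, a patch `trigger`, and a `target` label, none of which come from the papers listed; if an attack relies on a static trigger, its success rate should drop sharply when the patch is stamped away from its training-time location.

```python
# Illustrative sketch only: `model`, `test_images`, `trigger`, and the target
# label are assumed to exist and are not taken from any of the papers above.
import torch

def stamp(images, trigger, row, col):
    """Paste the trigger patch onto every image at position (row, col)."""
    out = images.clone()
    _, h, w = trigger.shape  # trigger shape: (channels, h, w)
    out[:, :, row:row + h, col:col + w] = trigger
    return out

@torch.no_grad()
def attack_success_rate(model, images, trigger, target_label, row, col):
    """Fraction of stamped images classified as the attacker's target label."""
    preds = model(stamp(images, trigger, row, col)).argmax(dim=1)
    return (preds == target_label).float().mean().item()

# Hypothetical usage: compare the training-time placement with a shifted one.
# asr_same    = attack_success_rate(model, test_images, trigger, target, 0, 0)
# asr_shifted = attack_success_rate(model, test_images, trigger, target, 10, 10)
# A large gap (asr_same >> asr_shifted) indicates reliance on a static trigger.
```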