Handcrafted Backdoors in Deep Neural Networks
- URL: http://arxiv.org/abs/2106.04690v1
- Date: Tue, 8 Jun 2021 20:58:23 GMT
- Title: Handcrafted Backdoors in Deep Neural Networks
- Authors: Sanghyun Hong, Nicholas Carlini, Alexey Kurakin
- Abstract summary: We introduce a handcrafted attack that directly manipulates the parameters of a pre-trained model to inject backdoors.
Our backdoors remain effective across four datasets and four network architectures with a success rate above 96%.
Our results suggest that further research is needed for understanding the complete space of supply-chain backdoor attacks.
- Score: 33.21980707457639
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs), while accurate, are expensive to train. Many
practitioners, therefore, outsource the training process to third parties or
use pre-trained DNNs. This practice makes DNNs vulnerable to backdoor attacks:
the third party who trains the model may act maliciously to inject hidden
behaviors into the otherwise accurate model. Until now, the mechanism to
inject backdoors has been limited to poisoning.
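To make the poisoning baseline concrete, here is a minimal sketch of a BadNets-style data-poisoning step: stamp a small trigger patch on a fraction of the training images and relabel them to an attacker-chosen target class. The function name, patch shape, and poison rate are illustrative assumptions, not details taken from the paper.

```python
# Illustrative poisoning-based backdoor injection (not the paper's method):
# stamp a bright 3x3 corner patch on a fraction of the images and relabel
# them to the target class, so training ties the trigger to that class.
import numpy as np

def poison_dataset(images, labels, target_class=0, poison_rate=0.05, seed=0):
    """Return poisoned copies of (images, labels)."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(poison_rate * len(images))
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -3:, -3:] = 1.0   # trigger: bright patch in the corner
    labels[idx] = target_class    # hidden behavior: trigger -> target class
    return images, labels

# Example: 1000 random grayscale 28x28 images with labels from 10 classes.
x = np.random.rand(1000, 28, 28).astype(np.float32)
y = np.random.randint(0, 10, size=1000)
x_poisoned, y_poisoned = poison_dataset(x, y, target_class=7)
```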
We argue that such a supply-chain attacker has more attack techniques
available. To study this hypothesis, we introduce a handcrafted attack that
directly manipulates the parameters of a pre-trained model to inject backdoors.
Our handcrafted attacker has more degrees of freedom in manipulating model
parameters than poisoning. This makes it difficult for a defender to identify
or remove the manipulations with straightforward methods, such as statistical
analysis, adding random noise to model parameters, or clipping their values
within a certain range. Further, our attacker can combine the handcrafting
process with additional techniques, e.g., jointly optimizing a trigger
pattern, to inject backdoors into complex networks effectively (the
meet-in-the-middle attack).
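As a rough illustration of what direct parameter manipulation can look like, the sketch below rewires a single hidden neuron of a toy one-hidden-layer ReLU network (stored as NumPy arrays) so that it fires on a trigger patch and routes its activation to a target-class logit. The neuron choice, gain, and trigger are illustrative assumptions; the paper's actual handcrafting procedure (e.g., which neurons to rewire and how the trigger is jointly optimized) is more involved.

```python
# Toy sketch of handcrafted parameter manipulation (assumed setup, not the
# paper's exact procedure): edit the weights of one hidden neuron so it acts
# as a trigger detector wired to the target class.
import numpy as np

def handcraft_backdoor(w1, b1, w2, b2, trigger_mask, target_class,
                       neuron=0, gain=10.0):
    """Rewire one hidden neuron to fire on the trigger and boost target_class."""
    w1, b1, w2, b2 = (a.copy() for a in (w1, b1, w2, b2))
    # Incoming weights: turn the chosen neuron into a trigger detector.
    w1[:, neuron] = gain * trigger_mask.ravel()
    # Bias: keep the neuron silent unless nearly all trigger pixels are bright.
    b1[neuron] = -0.9 * gain * trigger_mask.sum()
    # Outgoing weights: route the neuron's activation to the target class.
    w2[neuron, :] = 0.0
    w2[neuron, target_class] = gain
    return w1, b1, w2, b2

def forward(x, w1, b1, w2, b2):
    return np.maximum(x @ w1 + b1, 0.0) @ w2 + b2   # ReLU MLP logits

# Toy stand-in for a pre-trained model: 784 -> 32 -> 10, random weights.
rng = np.random.default_rng(0)
w1 = rng.normal(scale=0.05, size=(784, 32)); b1 = np.zeros(32)
w2 = rng.normal(scale=0.05, size=(32, 10));  b2 = np.zeros(10)
trigger = np.zeros((28, 28)); trigger[-3:, -3:] = 1.0   # 3x3 corner patch
w1b, b1b, w2b, b2b = handcraft_backdoor(w1, b1, w2, b2, trigger, target_class=7)
print(forward(trigger.ravel(), w1b, b1b, w2b, b2b).argmax())   # -> 7
```

In this toy setup the rewired neuron stays near zero on clean inputs, so benign predictions are largely unchanged, while an input containing the trigger patch drives the target-class logit to dominate.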
In evaluations, our handcrafted backdoors remain effective across four
datasets and four network architectures with a success rate above 96%. Our
backdoored models are resilient to parameter-level backdoor-removal techniques
and can evade existing defenses when the backdoor attack configuration is
slightly changed. Moreover, we demonstrate the feasibility of suppressing
unwanted behaviors otherwise caused by poisoning. Our results suggest that
further research is needed for understanding the complete space of supply-chain
backdoor attacks.
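For context on the parameter-level removal techniques mentioned above, the sketch below applies the two straightforward perturbations named in the abstract, additive random noise and value clipping, to a stand-in weight matrix. The noise scale and clipping range are illustrative assumptions; the abstract reports that handcrafted backdoors survive such perturbations.

```python
# Naive parameter-level removal attempts (illustrative scales, not from the paper).
import numpy as np

def add_weight_noise(weights, sigma=0.01, seed=0):
    """Perturb parameters with small Gaussian noise."""
    rng = np.random.default_rng(seed)
    return weights + rng.normal(scale=sigma, size=weights.shape)

def clip_weights(weights, limit=1.0):
    """Clip parameter values into a fixed range."""
    return np.clip(weights, -limit, limit)

# Apply both perturbations to a stand-in weight matrix.
w = np.random.default_rng(1).normal(scale=0.05, size=(784, 32))
w_defended = clip_weights(add_weight_noise(w))
```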
Related papers
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor).
arXiv Detail & Related papers (2024-05-25T07:52:26Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Backdoor Defense via Deconfounded Representation Learning [17.28760299048368]
We propose a Causality-inspired Backdoor Defense (CBD) to learn deconfounded representations for reliable classification.
CBD is effective in reducing backdoor threats while maintaining high accuracy in predicting benign samples.
arXiv Detail & Related papers (2023-03-13T02:25:59Z) - BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
One recent study revealed that most existing attacks fail in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z) - Check Your Other Door! Establishing Backdoor Attacks in the Frequency
Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z) - Black-box Detection of Backdoor Attacks with Limited Information and
Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z) - Light Can Hack Your Face! Black-box Backdoor Attack on Face Recognition
Systems [0.0]
We propose a novel black-box backdoor attack technique on face recognition systems.
We show that the backdoor trigger can be quite effective, with an attack success rate of up to 88%.
We highlight that our study revealed a new physical backdoor attack, which calls attention to the security issues of existing face recognition/verification techniques.
arXiv Detail & Related papers (2020-09-15T11:50:29Z) - Defending against Backdoor Attack on Deep Neural Networks [98.45955746226106]
We study the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data.
Experiments show that our method can effectively decrease the attack success rate while maintaining high classification accuracy on clean images.
arXiv Detail & Related papers (2020-02-26T02:03:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.