Backdoor Cleansing with Unlabeled Data
- URL: http://arxiv.org/abs/2211.12044v4
- Date: Thu, 29 Jun 2023 21:30:38 GMT
- Title: Backdoor Cleansing with Unlabeled Data
- Authors: Lu Pang, Tao Sun, Haibin Ling, Chao Chen
- Abstract summary: Externally trained Deep Neural Networks (DNNs) are potentially vulnerable to backdoor attacks.
We propose a novel defense method that does not require training labels.
Our method, trained without labels, is on par with state-of-the-art defense methods trained using labels.
- Score: 70.29989887008209
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the increasing computational demand of Deep Neural Networks (DNNs),
companies and organizations have begun to outsource the training process.
However, externally trained DNNs are potentially subject to backdoor attacks. It
is crucial to defend against such attacks, i.e., to postprocess a suspicious
model so that its backdoor behavior is mitigated while its normal prediction
power on clean inputs remains uncompromised. To remove the abnormal backdoor
behavior, existing methods mostly rely on additional labeled clean samples.
However, such a requirement may be unrealistic as the training data are often
unavailable to end users. In this paper, we investigate the possibility of
circumventing this barrier. We propose a novel defense method that does not
require training labels. Through a carefully designed layer-wise weight
re-initialization and knowledge distillation, our method can effectively
cleanse backdoor behaviors of a suspicious network with negligible compromise
in its normal behavior. In experiments, we show that our method, trained
without labels, is on par with state-of-the-art defense methods trained using
labels. We also observe promising defense results even on out-of-distribution
data. This makes our method very practical. Code is available at:
https://github.com/luluppang/BCU.
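To make the label-free cleansing idea concrete, below is a minimal PyTorch-style sketch of distilling the suspicious model into a copy of itself whose selected layers have been re-initialized, supervised only by the teacher's soft predictions on unlabeled data. The helper names, the choice of layers to re-initialize, and the temperature and optimizer settings are illustrative assumptions, not the authors' implementation; the official code is at https://github.com/luluppang/BCU.

```python
# Minimal, illustrative sketch only -- not the authors' implementation.
# Official code: https://github.com/luluppang/BCU
# Assumptions: `teacher` is the suspicious classifier, `unlabeled_loader`
# yields batches of images (no labels), and `layer_names` lists the
# submodules to re-initialize (e.g. the last blocks and the classifier head).
import copy
import torch
import torch.nn.functional as F

def reinit_layers(model, layer_names):
    """Re-initialize the parameters of the named submodules."""
    for name, module in model.named_modules():
        if name in layer_names and hasattr(module, "reset_parameters"):
            module.reset_parameters()
    return model

def distill_without_labels(teacher, unlabeled_loader, layer_names,
                           epochs=10, lr=1e-3, T=2.0, device="cuda"):
    teacher = teacher.to(device).eval()
    # The student starts as a copy of the suspicious model with some layers
    # re-initialized, which is intended to disrupt backdoor-specific weights.
    student = reinit_layers(copy.deepcopy(teacher), layer_names).to(device).train()
    optimizer = torch.optim.SGD(student.parameters(), lr=lr, momentum=0.9)

    for _ in range(epochs):
        for images in unlabeled_loader:          # no labels are used anywhere
            images = images.to(device)
            with torch.no_grad():
                teacher_logits = teacher(images)
            student_logits = student(images)
            # Soft-label KL distillation: the student mimics the teacher's
            # behavior on the (assumed clean) unlabeled inputs.
            loss = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                            F.softmax(teacher_logits / T, dim=1),
                            reduction="batchmean") * (T * T)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return student.eval()
```

The key property of this sketch is that no ground-truth labels enter the loss: the teacher's soft predictions on clean unlabeled inputs supervise the student, so the re-initialized layers relearn only the behavior the teacher exhibits on that data rather than its response to the trigger.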
Related papers
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor)
arXiv Detail & Related papers (2024-05-25T07:52:26Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor attacks are an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Backdoor Mitigation in Deep Neural Networks via Strategic Retraining [0.0]
Deep Neural Networks (DNNs) are becoming increasingly important in assisted and automated driving.
One particular problem is that they are prone to hidden backdoors.
In this paper, we introduce a novel method to remove backdoors.
arXiv Detail & Related papers (2022-12-14T15:22:32Z)
- Invisible Backdoor Attacks Using Data Poisoning in the Frequency Domain [8.64369418938889]
We propose a generalized backdoor attack method based on the frequency domain.
It can implant a backdoor without mislabeling samples or accessing the training process.
We evaluate our approach in the no-label and clean-label cases on three datasets.
arXiv Detail & Related papers (2022-07-09T07:05:53Z)
- Anti-Backdoor Learning: Training Clean Models on Poisoned Data [17.648453598314795]
Backdoor attack has emerged as a major security threat to deep neural networks (DNNs).
We introduce the concept of anti-backdoor learning (ABL), aiming to train clean models given backdoor-poisoned data.
We empirically show that ABL-trained models on backdoor-poisoned data achieve the same performance as if they were trained on purely clean data.
arXiv Detail & Related papers (2021-10-22T03:30:48Z)
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
- Handcrafted Backdoors in Deep Neural Networks [33.21980707457639]
We introduce a handcrafted attack that directly manipulates the parameters of a pre-trained model to inject backdoors.
Our backdoors remain effective across four datasets and four network architectures with a success rate above 96%.
Our results suggest that further research is needed for understanding the complete space of supply-chain backdoor attacks.
arXiv Detail & Related papers (2021-06-08T20:58:23Z)
- Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)