Anti-Backdoor Learning: Training Clean Models on Poisoned Data
- URL: http://arxiv.org/abs/2110.11571v2
- Date: Mon, 25 Oct 2021 03:41:22 GMT
- Title: Anti-Backdoor Learning: Training Clean Models on Poisoned Data
- Authors: Yige Li, Xixiang Lyu, Nodens Koren, Lingjuan Lyu, Bo Li, Xingjun Ma
- Abstract summary: Backdoor attacks have emerged as a major security threat to deep neural networks (DNNs).
We introduce the concept of \emph{anti-backdoor learning}, aiming to train \emph{clean} models given backdoor-poisoned data.
We empirically show that models trained with ABL on backdoor-poisoned data achieve the same performance as if they had been trained on purely clean data.
- Score: 17.648453598314795
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Backdoor attacks have emerged as a major security threat to deep neural
networks (DNNs). While existing defense methods have demonstrated promising
results on detecting or erasing backdoors, it is still not clear whether robust
training methods can be devised to prevent the backdoor triggers being injected
into the trained model in the first place. In this paper, we introduce the
concept of \emph{anti-backdoor learning}, aiming to train \emph{clean} models
given backdoor-poisoned data. We frame the overall learning process as a
dual-task of learning the \emph{clean} and the \emph{backdoor} portions of
data. From this view, we identify two inherent characteristics of backdoor
attacks as their weaknesses: 1) models learn backdoored data much faster than
clean data, and the stronger the attack, the faster the model converges on
backdoored data; 2) the backdoor task is tied to a specific class (the backdoor
target class). Based on these two weaknesses, we propose a
general learning scheme, Anti-Backdoor Learning (ABL), to automatically prevent
backdoor attacks during training. ABL introduces a two-stage \emph{gradient
ascent} mechanism for standard training to 1) help isolate backdoor examples at
an early training stage, and 2) break the correlation between backdoor examples
and the target class at a later training stage. Through extensive experiments
on multiple benchmark datasets against 10 state-of-the-art attacks, we
empirically show that models trained with ABL on backdoor-poisoned data achieve
the same performance as if they had been trained on purely clean data. Code is
available at \url{https://github.com/bboylyg/ABL}.
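The two-stage mechanism described in the abstract can be sketched in plain Python. Note this is a hedged illustration built only from the abstract's description: the sign-flipped "local gradient ascent" loss, the isolation ratio, and the threshold `gamma` are assumptions for illustration, not the paper's exact formulation or hyperparameters (see the linked repository for the real implementation).

```python
# Illustrative sketch of the ABL two-stage idea: (1) trap fast-learned
# (likely backdoored) samples near a loss floor and isolate the
# lowest-loss fraction; (2) apply gradient ascent on the isolated set.
# gamma and ratio below are hypothetical values, not the paper's.

def lga_loss(loss, gamma=0.5):
    """Stage 1: sign-flipped loss. Once a sample's loss drops below
    gamma, its sign flips, pushing the loss back up toward gamma.
    Backdoor examples, which are learned fastest, get trapped near
    gamma while clean examples keep training normally."""
    sign = 1.0 if loss - gamma >= 0 else -1.0
    return sign * loss

def isolate_suspects(per_sample_losses, ratio=0.01):
    """Flag the lowest-loss fraction of samples as suspected backdoors."""
    order = sorted(range(len(per_sample_losses)),
                   key=lambda i: per_sample_losses[i])
    k = max(1, int(len(order) * ratio))
    return set(order[:k])

def abl_batch_loss(per_sample_losses, suspects):
    """Stage 2: descend on presumed-clean losses, ascend on suspects,
    breaking the correlation between backdoor examples and the target
    class (minimizing a negated loss is gradient ascent on it)."""
    clean = [l for i, l in enumerate(per_sample_losses) if i not in suspects]
    bad = [l for i, l in enumerate(per_sample_losses) if i in suspects]
    total = sum(clean) / max(1, len(clean))
    if bad:
        total -= sum(bad) / len(bad)  # unlearn suspected backdoors
    return total
```

In a real training loop, `per_sample_losses` would come from a per-example cross-entropy (e.g. `reduction='none'` in PyTorch), and isolation would be done once after a few early epochs rather than per batch.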
Related papers
- Flatness-aware Sequential Learning Generates Resilient Backdoors [7.969181278996343]
Recently, backdoor attacks have emerged as a threat to the security of machine learning models.
This paper counters catastrophic forgetting (CF) of backdoors by leveraging continual learning (CL) techniques.
We propose a novel framework, named Sequential Backdoor Learning (SBL), that can generate resilient backdoors.
arXiv Detail & Related papers (2024-07-20T03:30:05Z)
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor)
arXiv Detail & Related papers (2024-05-25T07:52:26Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- Backdoor Defense via Adaptively Splitting Poisoned Dataset [57.70673801469096]
Backdoor defenses have been studied to alleviate the threat of deep neural networks (DNNs) being backdoor attacked and maliciously altered.
We argue that the core of training-time defense is to select poisoned samples and to handle them properly.
Under our framework, we propose an adaptively splitting dataset-based defense (ASD)
arXiv Detail & Related papers (2023-03-23T02:16:38Z)
- Backdoor Defense via Deconfounded Representation Learning [17.28760299048368]
We propose a Causality-inspired Backdoor Defense (CBD) to learn deconfounded representations for reliable classification.
CBD is effective in reducing backdoor threats while maintaining high accuracy in predicting benign samples.
arXiv Detail & Related papers (2023-03-13T02:25:59Z)
- Can Backdoor Attacks Survive Time-Varying Models? [35.836598031681426]
Backdoors are powerful attacks against deep neural networks (DNNs)
We study the impact of backdoor attacks on a more realistic scenario of time-varying DNN models.
Our results show that one-shot backdoor attacks do not survive past a few model updates.
arXiv Detail & Related papers (2022-06-08T01:32:49Z)
- Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks [58.0225587881455]
In this paper, we find two simple tricks that can make existing textual backdoor attacks much more harmful.
The first trick is to add an extra training task to distinguish poisoned and clean data during the training of the victim model.
The second one is to use all the clean training data rather than remove the original clean data corresponding to the poisoned data.
arXiv Detail & Related papers (2021-10-15T17:58:46Z)
- Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
- Backdoor Learning: A Survey [75.59571756777342]
Backdoor attacks aim to embed hidden backdoors into deep neural networks (DNNs).
Backdoor learning is an emerging and rapidly growing research area.
This paper presents the first comprehensive survey of this realm.
arXiv Detail & Related papers (2020-07-17T04:09:20Z)
- Blind Backdoors in Deep Learning Models [22.844973592524966]
We investigate a new method for injecting backdoors into machine learning models, based on compromising the loss-value computation in the model-training code.
We use it to demonstrate new classes of backdoors strictly more powerful than those in the prior literature.
Our attack is blind: the attacker cannot modify the training data, nor observe the execution of their code, nor access the resulting model.
arXiv Detail & Related papers (2020-05-08T02:15:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.