Beating Backdoor Attack at Its Own Game
- URL: http://arxiv.org/abs/2307.15539v3
- Date: Fri, 4 Aug 2023 16:16:28 GMT
- Title: Beating Backdoor Attack at Its Own Game
- Authors: Min Liu, Alberto Sangiovanni-Vincentelli, Xiangyu Yue
- Abstract summary: Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Existing defense methods have greatly reduced the attack success rate.
We propose a highly effective framework that injects non-adversarial backdoors targeting poisoned samples.
- Score: 10.131734154410763
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks (DNNs) are vulnerable to backdoor attacks, which do
not affect the network's performance on clean data but manipulate the network's
behavior once a trigger pattern is added. Existing defense methods have greatly
reduced the attack success rate, but their prediction accuracy on clean data
still lags behind that of a clean model by a large margin. Inspired by the
stealthiness and effectiveness of backdoor attacks, we propose a simple but
highly effective defense framework that injects non-adversarial backdoors
targeting poisoned samples. Following the general steps of a backdoor attack,
we detect a small set of suspected samples and then apply a poisoning strategy
to them. The non-adversarial backdoor, once triggered, suppresses the
attacker's backdoor on poisoned data but has limited influence on clean data.
The defense can be carried out during data preprocessing, without any
modification to the standard end-to-end training pipeline. We conduct extensive
experiments on multiple benchmarks with different architectures and
representative attacks. Results demonstrate that our method achieves
state-of-the-art defense effectiveness with by far the lowest performance drop
on clean data. Considering the surprising defense ability displayed by our
framework, we call for more attention to utilizing backdoors for backdoor
defense. Code is available at
https://github.com/damianliumin/non-adversarial_backdoor.
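The abstract outlines a detect-then-poison pipeline: flag a small set of suspected samples, then stamp them with a benign trigger during preprocessing. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes a loss-based detector over per-sample losses from a warm-up model, a simple corner-patch trigger, and a caller-supplied relabeling function; all function names and heuristics here are illustrative assumptions.

```python
import numpy as np

def detect_suspects(losses: np.ndarray, quantile: float = 0.05) -> np.ndarray:
    """Hypothetical detector: flag the lowest-loss samples as suspected
    poisons, since backdoored samples are often fit unusually quickly.
    `losses` holds per-sample training losses from a warm-up model."""
    threshold = np.quantile(losses, quantile)
    return np.where(losses <= threshold)[0]

def stamp_trigger(image: np.ndarray, size: int = 3, value: float = 1.0) -> np.ndarray:
    """Stamp a small corner patch -- the defensive (non-adversarial)
    trigger. Assumes `image` is an HxWxC float array in [0, 1]."""
    out = image.copy()
    out[:size, :size, :] = value
    return out

def inject_defensive_backdoor(images, labels, suspect_idx, relabel):
    """Poison only the suspected samples during data preprocessing, so
    standard training installs a benign backdoor targeting poisoned data."""
    for i in suspect_idx:
        images[i] = stamp_trigger(images[i])
        labels[i] = relabel(images[i])  # e.g., a pseudo-label (assumption)
    return images, labels
```

Under this reading, the same trigger would be stamped on inputs at test time so the benign backdoor fires and overrides the attacker's trigger on poisoned inputs while leaving clean inputs largely unaffected; the actual detection, relabeling, and test-time procedures are specified in the paper, not here.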
Related papers
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor).
arXiv Detail & Related papers (2024-05-25T07:52:26Z)
- UltraClean: A Simple Framework to Train Robust Neural Networks against Backdoor Attacks [19.369701116838776]
Backdoor attacks are emerging threats to deep neural networks.
They typically embed malicious behaviors into a victim model by injecting poisoned samples.
We propose UltraClean, a framework that simplifies the identification of poisoned samples.
arXiv Detail & Related papers (2023-12-17T09:16:17Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks, an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
One recent study revealed that most existing attacks fail in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z)
- Defending Against Backdoor Attack on Graph Nerual Network by Explainability [7.147386524788604]
We propose the first backdoor detection and defense method on GNNs.
For graph data, current backdoor attacks focus on manipulating the graph structure to inject the trigger.
We find that there are apparent differences between benign samples and malicious samples in some explanatory evaluation metrics.
arXiv Detail & Related papers (2022-09-07T03:19:29Z)
- Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural Networks [24.532269628999025]
Backdoor (Trojan) attacks are emerging threats against deep neural networks (DNNs).
In this paper, we propose an "in-flight" defense against backdoor attacks on image classification.
arXiv Detail & Related papers (2021-12-06T20:52:00Z)
- Textual Backdoor Attacks Can Be More Harmful via Two Simple Tricks [58.0225587881455]
In this paper, we find two simple tricks that can make existing textual backdoor attacks much more harmful.
The first trick is to add an extra training task to distinguish poisoned and clean data during the training of the victim model.
The second one is to use all the clean training data rather than remove the original clean data corresponding to the poisoned data.
arXiv Detail & Related papers (2021-10-15T17:58:46Z)
- Backdoor Attacks to Graph Neural Networks [73.56867080030091]
We propose the first backdoor attack on graph neural networks (GNNs).
In our backdoor attack, a GNN predicts an attacker-chosen target label for a testing graph once a predefined subgraph is injected into the testing graph.
Our empirical results show that our backdoor attacks are effective with a small impact on a GNN's prediction accuracy for clean testing graphs.
arXiv Detail & Related papers (2020-06-19T14:51:01Z)
- On Certifying Robustness against Backdoor Attacks via Randomized Smoothing [74.79764677396773]
We study the feasibility and effectiveness of certifying robustness against backdoor attacks using a recent technique called randomized smoothing.
Our results show the theoretical feasibility of using randomized smoothing to certify robustness against backdoor attacks, but existing randomized smoothing methods have limited effectiveness at defending against them.
arXiv Detail & Related papers (2020-02-26T19:15:46Z)