ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep
Learning Paradigms
- URL: http://arxiv.org/abs/2302.11408v2
- Date: Sun, 6 Aug 2023 17:24:21 GMT
- Title: ASSET: Robust Backdoor Data Detection Across a Multiplicity of Deep
Learning Paradigms
- Authors: Minzhou Pan, Yi Zeng, Lingjuan Lyu, Xue Lin and Ruoxi Jia
- Abstract summary: Backdoor data detection is traditionally studied in an end-to-end supervised learning (SL) setting.
Recent years have seen the proliferating adoption of self-supervised learning (SSL) and transfer learning (TL) due to their lesser need for labeled data.
We show that the performance of most existing detection methods varies significantly across different attacks and poison ratios, and all fail on the state-of-the-art clean-label attack.
- Score: 39.753721029332326
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor data detection is traditionally studied in an end-to-end supervised
learning (SL) setting. However, recent years have seen the proliferating
adoption of self-supervised learning (SSL) and transfer learning (TL), due to
their lesser need for labeled data. Successful backdoor attacks have also been
demonstrated in these new settings. However, we lack a thorough understanding
of the applicability of existing detection methods across a variety of learning
settings. By evaluating 56 attack settings, we show that the performance of
most existing detection methods varies significantly across different attacks
and poison ratios, and all fail on the state-of-the-art clean-label attack. In
addition, they either become inapplicable or suffer large performance losses
when applied to SSL and TL. We propose a new detection method called Active
Separation via Offset (ASSET), which actively induces different model behaviors
between the backdoor and clean samples to promote their separation. We also
provide procedures to adaptively select the number of suspicious points to
remove. In the end-to-end SL setting, ASSET is superior to existing methods in
terms of consistency of defensive performance across different attacks and
robustness to changes in poison ratios; in particular, it is the only method
that can detect the state-of-the-art clean-label attack. Moreover, ASSET's
average detection rates are higher than the best existing methods in SSL and
TL, respectively, by 69.3% and 33.2%, thus providing the first practical
backdoor defense for these new DL settings. We open-source the project to drive
further development and encourage engagement:
https://github.com/ruoxi-jia-group/ASSET.
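
The abstract describes ASSET only at a high level; below is a minimal, hypothetical PyTorch sketch of the general "active separation" idea, in which the model is fine-tuned with opposing objectives on a small trusted base set and on the suspect training set so that backdoor and clean samples become separable by their per-sample loss. The function names (offset_scores, gap_threshold), the two-loader setup, and the specific loss formulation are illustrative assumptions rather than the authors' actual method, which is available in the repository linked above.

```python
# Hypothetical sketch of "active separation" (not the authors' algorithm):
# pull the model's behavior on a small trusted base set and on the suspect
# training set in opposite directions, then score suspect samples by how far
# their loss ends up offset. The intuition of this sketch is that clean
# suspect samples (same distribution as the base set) have the loss increase
# partially cancelled, while backdoor samples do not, so larger scores
# suggest poisoned data.
from itertools import cycle

import torch
import torch.nn.functional as F


def offset_scores(model, suspect_loader, clean_base_loader,
                  epochs=1, lr=1e-3, device="cpu"):
    """Return one score per suspect sample; larger = more suspicious.

    Assumes suspect_loader iterates in a fixed order (shuffle=False) so the
    returned scores align with dataset indices.
    """
    model = model.to(device)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        for (xs, ys), (xc, yc) in zip(suspect_loader, cycle(clean_base_loader)):
            xs, ys = xs.to(device), ys.to(device)
            xc, yc = xc.to(device), yc.to(device)
            # Fit the trusted base set while pushing the suspect set's loss
            # up, so the two populations' behaviors drift apart.
            loss = F.cross_entropy(model(xc), yc) - F.cross_entropy(model(xs), ys)
            opt.zero_grad()
            loss.backward()
            opt.step()
    # Score each suspect sample by its loss under the offset model.
    model.eval()
    scores = []
    with torch.no_grad():
        for xs, ys in suspect_loader:
            xs, ys = xs.to(device), ys.to(device)
            scores.append(F.cross_entropy(model(xs), ys, reduction="none").cpu())
    return torch.cat(scores)


def gap_threshold(scores):
    """Crude stand-in for the paper's adaptive cutoff: split the sorted
    scores at their single largest gap."""
    s, _ = torch.sort(scores)
    gaps = s[1:] - s[:-1]
    k = int(torch.argmax(gaps))
    return float((s[k] + s[k + 1]) / 2)
```

A usage sketch, assuming clean_base_loader wraps a few hundred verified clean samples: scores = offset_scores(model, suspect_loader, clean_base_loader); flagged = scores > gap_threshold(scores). Again, both the offset loss and the gap-based cutoff are stand-ins for the paper's actual procedures, not a reproduction of them.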
Related papers
- Backdoor Defense through Self-Supervised and Generative Learning [0.0]
Training on such poisoned data injects a backdoor that causes malicious inference on selected test samples.
This paper explores an approach based on generative modelling of per-class distributions in a self-supervised representation space.
In both cases, we find that per-class generative models make it possible to detect poisoned data and cleanse the dataset.
arXiv Detail & Related papers (2024-09-02T11:40:01Z)
- Towards Adversarial Robustness And Backdoor Mitigation in SSL [0.562479170374811]
Self-Supervised Learning (SSL) has shown great promise in learning representations from unlabeled data.
SSL methods have recently been shown to be vulnerable to backdoor attacks.
This work addresses defending against backdoor attacks in SSL.
arXiv Detail & Related papers (2024-03-23T19:21:31Z)
- Does Few-shot Learning Suffer from Backdoor Attacks? [63.9864247424967]
We show that few-shot learning can still be vulnerable to backdoor attacks.
Our method achieves a high Attack Success Rate (ASR) on few-shot learning (FSL) tasks across different FSL paradigms.
This study reveals that few-shot learning still suffers from backdoor attacks, and its security deserves attention.
arXiv Detail & Related papers (2023-12-31T06:43:36Z)
- Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking [65.44477004525231]
Researchers have recently found that Self-Supervised Learning (SSL) is vulnerable to backdoor attacks.
In this paper, we propose to erase the SSL backdoor by cluster activation masking and propose a novel PoisonCAM method.
Our method achieves 96% accuracy for backdoor trigger detection, compared to 3% for the state-of-the-art method, on poisoned ImageNet-100.
arXiv Detail & Related papers (2023-12-13T08:01:15Z)
- FLTracer: Accurate Poisoning Attack Provenance in Federated Learning [38.47921452675418]
Federated Learning (FL) is a promising distributed learning approach that enables multiple clients to collaboratively train a shared global model.
Recent studies show that FL is vulnerable to various poisoning attacks, which can degrade the performance of global models or introduce backdoors into them.
We propose FLTracer, the first FL attack framework to accurately detect various attacks and trace the attack time, objective, type, and poisoned location of updates.
arXiv Detail & Related papers (2023-10-20T11:24:38Z)
- Improved Activation Clipping for Universal Backdoor Mitigation and Test-Time Detection [27.62279831135902]
Deep neural networks are vulnerable to Trojan attacks, where an attacker poisons the training set with backdoor triggers.
Recent work shows that backdoor poisoning induces over-fitting (abnormally large activations) in the attacked model.
We devise a new activation-clipping approach, choosing the activation bounds to explicitly limit classification margins.
arXiv Detail & Related papers (2023-08-08T22:47:39Z)
- An Embarrassingly Simple Backdoor Attack on Self-supervised Learning [52.28670953101126]
Self-supervised learning (SSL) is capable of learning high-quality representations of complex data without relying on labels.
We study the inherent vulnerability of SSL to backdoor attacks.
arXiv Detail & Related papers (2022-10-13T20:39:21Z)
- Federated Zero-Shot Learning for Visual Recognition [55.65879596326147]
We propose a novel Federated Zero-Shot Learning (FedZSL) framework.
FedZSL learns a central model from the decentralized data residing on edge devices.
The effectiveness and robustness of FedZSL are demonstrated by extensive experiments conducted on three zero-shot benchmark datasets.
arXiv Detail & Related papers (2022-09-05T14:49:34Z)
- Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from clean models, an improvement over the current state-of-the-art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.