Universal Detection of Backdoor Attacks via Density-based Clustering and Centroids Analysis
- URL: http://arxiv.org/abs/2301.04554v2
- Date: Thu, 5 Oct 2023 13:26:33 GMT
- Title: Universal Detection of Backdoor Attacks via Density-based Clustering and Centroids Analysis
- Authors: Wei Guo, Benedetta Tondi, Mauro Barni
- Abstract summary: We propose a Universal Defence against backdoor attacks based on Clustering and Centroids Analysis (CCA-UD).
The goal of the defence is to reveal whether a Deep Neural Network model is subject to a backdoor attack by inspecting the training dataset.
- Score: 24.953032059932525
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a Universal Defence against backdoor attacks based on Clustering
and Centroids Analysis (CCA-UD). The goal of the defence is to reveal whether a
Deep Neural Network model is subject to a backdoor attack by inspecting the
training dataset. CCA-UD first clusters the samples of the training set by
means of density-based clustering. Then, it applies a novel strategy to detect
the presence of poisoned clusters. The proposed strategy is based on a general
misclassification behaviour observed when the features of a representative
example of the analysed cluster are added to benign samples. The capability of
inducing a misclassification error is a general characteristic of poisoned
samples, hence the proposed defence is attack-agnostic. This marks a
significant difference with respect to existing defences, which either can
defend against only some types of backdoor attacks or are effective only when
certain conditions on the poisoning ratio or on the kind of triggering signal
used by the attacker are satisfied.
Experiments carried out on several classification tasks and network
architectures, considering different types of backdoor attacks (with either
clean or corrupted labels) and different kinds of triggering signals (global
and local, as well as sample-specific and source-specific), show that the
proposed method is effective against backdoor attacks in all of these cases,
consistently outperforming state-of-the-art techniques.
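
To make the clustering-plus-centroid-analysis pipeline concrete, the following is a minimal sketch of the idea described in the abstract, assuming the penultimate-layer features of the inspected DNN are available and using DBSCAN as one possible density-based clustering algorithm. The function and parameter names (detect_poisoned_clusters, classify_features, eps, min_samples, mpr_threshold) are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of the CCA-UD idea described above; not the authors' implementation.
import numpy as np
from sklearn.cluster import DBSCAN


def detect_poisoned_clusters(class_features, benign_features, benign_labels,
                             target_class, classify_features,
                             eps=0.5, min_samples=10, mpr_threshold=0.5):
    """Flag clusters of the samples labelled `target_class` as poisoned.

    class_features:    (N, d) penultimate-layer features of the training
                       samples labelled as `target_class`
    benign_features:   (M, d) features of samples assumed benign, belonging
                       to classes other than `target_class`
    benign_labels:     (M,) their ground-truth labels
    classify_features: callable mapping (M, d) feature vectors to predicted
                       labels, e.g. the classification head of the model
    """
    clustering = DBSCAN(eps=eps, min_samples=min_samples).fit(class_features)
    class_centroid = class_features.mean(axis=0)

    poisoned_clusters = []
    for cluster_id in set(clustering.labels_):
        if cluster_id == -1:  # DBSCAN marks noise points with -1; skip them
            continue
        members = class_features[clustering.labels_ == cluster_id]
        # Deviation of the cluster centroid from the overall class centroid;
        # for a poisoned cluster this approximates the trigger's footprint
        # in feature space.
        deviation = members.mean(axis=0) - class_centroid

        # Add the deviation to benign samples of other classes and measure
        # how often they are now misclassified into the analysed class.
        predictions = classify_features(benign_features + deviation)
        misclassification_rate = np.mean(
            (predictions == target_class) & (benign_labels != target_class))

        if misclassification_rate > mpr_threshold:
            poisoned_clusters.append(cluster_id)
    return poisoned_clusters
```

Density-based clustering is a natural fit here because it does not require the number of clusters (and hence the poisoning ratio) to be fixed in advance; in this sketch the only tunable decision parameter is the threshold on the induced misclassification rate.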
Related papers
- Detecting Adversarial Data via Perturbation Forgery [28.637963515748456]
Adversarial detection aims to identify and filter out adversarial data from the data flow based on discrepancies in distribution and noise patterns between natural and adversarial data.
New attacks based on generative models with imbalanced and anisotropic noise patterns evade detection.
We propose Perturbation Forgery, which includes noise distribution perturbation, sparse mask generation, and pseudo-adversarial data production, to train an adversarial detector capable of detecting unseen gradient-based, generative-model-based, and physical adversarial attacks.
arXiv Detail & Related papers (2024-05-25T13:34:16Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, but adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to the imperceptible adversarial perturbations in high-level image classification and attack-suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain; a minimal sketch of this idea appears after this list.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
- Adaptive Perturbation Generation for Multiple Backdoors Detection [29.01715186371785]
This paper proposes the Adaptive Perturbation Generation (APG) framework to detect multiple types of backdoor attacks.
We first design a global-to-local strategy to fit multiple types of backdoor triggers.
To further increase the efficiency of perturbation injection, we introduce a gradient-guided mask generation strategy.
arXiv Detail & Related papers (2022-09-12T13:37:06Z)
- On Trace of PGD-Like Adversarial Attacks [77.75152218980605]
Adversarial attacks pose safety and security concerns for deep learning applications.
We construct Adversarial Response Characteristics (ARC) features to reflect the model's gradient consistency.
Our method is intuitive, lightweight, non-intrusive, and data-undemanding.
arXiv Detail & Related papers (2022-05-19T14:26:50Z)
- AntidoteRT: Run-time Detection and Correction of Poison Attacks on Neural Networks [18.461079157949698]
The paper studies backdoor poisoning attacks against image classification networks.
We propose lightweight automated detection and correction techniques against poisoning attacks.
Our technique outperforms existing defenses such as NeuralCleanse and STRIP on popular benchmarks.
arXiv Detail & Related papers (2022-01-31T23:42:32Z)
- Towards Defending against Adversarial Examples via Attack-Invariant Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples.
Models trained on seen types of adversarial examples generally cannot generalize well to unseen types of adversarial examples.
arXiv Detail & Related papers (2021-06-09T12:49:54Z)
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
- A Self-supervised Approach for Adversarial Robustness [105.88250594033053]
Adversarial examples can cause catastrophic mistakes in Deep Neural Network (DNN) based vision systems.
This paper proposes a self-supervised adversarial training mechanism in the input space.
It provides significant robustness against unseen adversarial attacks.
arXiv Detail & Related papers (2020-06-08T20:42:39Z)
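
As referenced in the FreqFed entry above, moving client updates into the frequency domain before aggregation can be illustrated as follows. This is only a sketch of the general idea under stated assumptions: the use of a DCT, the fraction of low-frequency components kept, and the median-distance filtering rule are illustrative choices, not the paper's actual aggregation mechanism.

```python
# Hedged, illustrative sketch of frequency-domain filtering of federated
# model updates; not FreqFed's actual algorithm.
import numpy as np
from scipy.fft import dct, idct


def frequency_domain_aggregate(client_updates, keep_fraction=0.1):
    """client_updates: list of equally sized 1-D arrays (flattened deltas)."""
    # Move every flattened update into the frequency domain.
    spectra = np.stack([dct(u, norm="ortho") for u in client_updates])

    # Look only at the low-frequency components, which are assumed to carry
    # most of the benign training signal.
    k = max(1, int(keep_fraction * spectra.shape[1]))
    low_freq = spectra[:, :k]

    # Simple robust filter (an assumption for illustration): keep the clients
    # whose low-frequency components lie closest to the element-wise median.
    median = np.median(low_freq, axis=0)
    distances = np.linalg.norm(low_freq - median, axis=1)
    kept = distances <= np.median(distances)

    # Average the surviving spectra and transform back to parameter space.
    return idct(spectra[kept].mean(axis=0), norm="ortho")
```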