Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor
Poisoned Samples in DNNs
- URL: http://arxiv.org/abs/2303.13211v1
- Date: Thu, 23 Mar 2023 12:11:24 GMT
- Title: Don't FREAK Out: A Frequency-Inspired Approach to Detecting Backdoor
Poisoned Samples in DNNs
- Authors: Hasan Abed Al Kader Hammoud, Adel Bibi, Philip H.S. Torr, Bernard
Ghanem
- Abstract summary: In this paper, we investigate the frequency sensitivity of Deep Neural Networks (DNNs) when presented with clean samples versus poisoned samples.
We propose a frequency-based poisoned sample detection algorithm that is simple yet effective.
- Score: 130.965542948104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we investigate the frequency sensitivity of Deep Neural
Networks (DNNs) when presented with clean samples versus poisoned samples. Our
analysis shows significant disparities in frequency sensitivity between these
two types of samples. Building on these findings, we propose FREAK, a
frequency-based poisoned sample detection algorithm that is simple yet
effective. Our experimental results demonstrate the efficacy of FREAK not only
against frequency backdoor attacks but also against some spatial attacks. Our
work is just the first step in leveraging these insights. We believe that our
analysis and proposed defense mechanism will provide a foundation for future
research and development of backdoor defenses.
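As a rough, hypothetical sketch of the kind of frequency-sensitivity probe the abstract describes (not the authors' released code: the band layout, divergence measure, and z-score threshold below are assumptions), one could suppress individual frequency bands of an input and compare the model's reaction against a clean reference profile:

```python
import torch
import torch.nn.functional as F

def frequency_sensitivity(model, x, num_bands=8):
    """Sensitivity of a classifier to suppressing radial frequency bands.

    x: (1, C, H, W) image in [0, 1]. Returns a (num_bands,) sensitivity profile.
    Hypothetical sketch of a FREAK-style probe, not the paper's exact procedure.
    """
    model.eval()
    with torch.no_grad():
        p_ref = F.softmax(model(x), dim=1)            # reference prediction

        _, _, H, W = x.shape
        # Radial frequency coordinates of the centered spectrum.
        fy = torch.fft.fftshift(torch.fft.fftfreq(H)).view(H, 1)
        fx = torch.fft.fftshift(torch.fft.fftfreq(W)).view(1, W)
        radius = torch.sqrt(fy ** 2 + fx ** 2)
        edges = torch.linspace(0.0, radius.max().item(), num_bands + 1)

        spectrum = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
        profile = []
        for b in range(num_bands):
            keep = ~((radius >= edges[b]) & (radius < edges[b + 1]))
            filtered = spectrum * keep.float()         # zero out one band
            x_b = torch.fft.ifft2(
                torch.fft.ifftshift(filtered, dim=(-2, -1))).real.clamp(0, 1)
            log_p = F.log_softmax(model(x_b), dim=1)
            # Divergence between reference and band-suppressed predictions.
            profile.append(F.kl_div(log_p, p_ref, reduction="sum"))
        return torch.stack(profile)

def looks_poisoned(profile, clean_mean, clean_std, z_thresh=3.0):
    """Flag a sample whose sensitivity profile deviates from clean statistics."""
    z = (profile - clean_mean).abs() / (clean_std + 1e-8)
    return bool((z > z_thresh).any())
```

In practice, the clean reference statistics (clean_mean, clean_std) would be estimated from a small set of held-out clean samples.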
Related papers
- T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models [70.03122709795122]
We propose a comprehensive defense method named T2IShield to detect, localize, and mitigate backdoor attacks.
We find the "Assimilation Phenomenon" on the cross-attention maps caused by the backdoor trigger.
For backdoor sample detection, T2IShield achieves a detection F1 score of 88.9% with low computational cost.
arXiv Detail & Related papers (2024-07-05T01:53:21Z)
- PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection [57.571451139201855]
Prediction Shift Backdoor Detection (PSBD) is a novel method for identifying backdoor samples in deep neural networks.
PSBD is motivated by an intriguing Prediction Shift (PS) phenomenon, where poisoned models' predictions on clean data often shift away from true labels towards certain other labels.
PSBD identifies backdoor training samples by computing the Prediction Shift Uncertainty (PSU), the variance in probability values when dropout layers are toggled on and off during model inference.
arXiv Detail & Related papers (2024-06-09T15:31:00Z)
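PSBD's Prediction Shift Uncertainty is described above as the variance of predicted probabilities when dropout is toggled on and off at inference time; a minimal sketch of that statistic (the number of stochastic passes and the ranking step are assumptions) could look like:

```python
import torch
import torch.nn.functional as F

def prediction_shift_uncertainty(model, x, y, n_passes=10):
    """Variance of the probability assigned to label y when dropout is toggled
    on and off at inference time (a sketch of PSBD's PSU statistic).

    x: (N, C, H, W) training images, y: (N,) their labels. Returns (N,) scores.
    """
    model.eval()                                   # dropout off, BN frozen
    with torch.no_grad():
        probs = [F.softmax(model(x), dim=1).gather(1, y.view(-1, 1))]

        # Turn dropout back on while keeping everything else in eval mode.
        for m in model.modules():
            if isinstance(m, torch.nn.Dropout):
                m.train()
        for _ in range(n_passes):
            probs.append(F.softmax(model(x), dim=1).gather(1, y.view(-1, 1)))
        model.eval()

    probs = torch.cat(probs, dim=1)                # (N, n_passes + 1)
    return probs.var(dim=1)

# Hypothetical usage: rank training samples by PSU and inspect the extremes;
# which end of the ranking flags backdoor samples is determined empirically.
# psu = prediction_shift_uncertainty(model, images, labels)
# suspects = psu.argsort(descending=True)[:budget]
```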
- The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data [4.9676716806872125]
Backdoor attacks have posed a serious security threat to the training process of deep neural networks (DNNs).
We propose a novel dual-network training framework: The Victim and The Beneficiary (V&B), which exploits a poisoned model to train a clean model without extra benign samples.
Our framework is effective in preventing backdoor injection and robust to various attacks while maintaining the performance on benign samples.
arXiv Detail & Related papers (2024-04-17T11:15:58Z)
- Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
arXiv Detail & Related papers (2023-10-08T18:57:36Z)
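The sampling step described above is easy to illustrate: score each candidate with a surrogate model's confidence and choose the least confident samples for poisoning. The surrogate model and selection budget here are assumptions for illustration, not the paper's exact setup:

```python
import torch
import torch.nn.functional as F

def select_low_confidence_indices(model, loader, num_to_poison):
    """Return indices of the training samples on which a surrogate model is
    least confident; these become the poisoning candidates."""
    model.eval()
    confidences = []
    with torch.no_grad():
        for x, _ in loader:                        # loader must not shuffle
            p = F.softmax(model(x), dim=1)
            confidences.append(p.max(dim=1).values)
    confidences = torch.cat(confidences)
    return confidences.argsort()[:num_to_poison]   # lowest confidence first
```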
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks, an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Defending Against Backdoor Attacks by Layer-wise Feature Analysis [11.465401472704732]
Training deep neural networks (DNNs) usually requires massive training data and computational resources.
A new training-time attack (i.e., backdoor attack) aims to induce misclassification of input samples containing adversary-specified trigger patterns.
We propose a simple yet effective method to filter poisoned samples by analyzing the feature differences between suspicious and benign samples at the critical layer.
arXiv Detail & Related papers (2023-02-24T17:16:37Z)
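A rough sketch of that layer-wise filtering idea follows; the choice of "critical layer", the distance measure, and the threshold k are assumptions, not the paper's exact settings:

```python
import torch

def filter_by_layer_features(model, critical_layer, benign_x, suspicious_x, k=3.0):
    """Flag suspicious samples whose features at a chosen layer lie far from the
    benign feature centroid (a sketch of layer-wise feature filtering)."""
    feats = {}
    hook = critical_layer.register_forward_hook(
        lambda module, inputs, output: feats.update(out=output.flatten(1)))

    model.eval()
    with torch.no_grad():
        model(benign_x)
        benign_f = feats["out"]
        center = benign_f.mean(dim=0)
        scale = (benign_f - center).norm(dim=1).mean()

        model(suspicious_x)
        dist = (feats["out"] - center).norm(dim=1)
    hook.remove()
    return dist > k * scale                        # True -> likely poisoned
```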
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Invisible Backdoor Attacks Using Data Poisoning in the Frequency Domain [8.64369418938889]
We propose a generalized backdoor attack method based on the frequency domain.
It implants the backdoor without mislabeling or accessing the training process.
We evaluate our approach in the no-label and clean-label cases on three datasets.
arXiv Detail & Related papers (2022-07-09T07:05:53Z)
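A minimal illustration of planting a trigger in the frequency domain (the frequency band and amplitude below are arbitrary choices for this sketch, not the paper's configuration):

```python
import torch

def add_frequency_trigger(x, amplitude=0.05, band=(0.30, 0.35)):
    """Scale a narrow radial frequency band of an image to embed an invisible
    trigger, then return to pixel space (frequency-domain poisoning sketch).

    x: (N, C, H, W) tensor in [0, 1].
    """
    _, _, H, W = x.shape
    fy = torch.fft.fftfreq(H).view(H, 1)
    fx = torch.fft.fftfreq(W).view(1, W)
    radius = torch.sqrt(fy ** 2 + fx ** 2)
    mask = ((radius >= band[0]) & (radius < band[1])).float()

    spectrum = torch.fft.fft2(x)
    spectrum = spectrum * (1.0 + amplitude * mask)   # boost the chosen band
    poisoned = torch.fft.ifft2(spectrum).real        # symmetric mask keeps it real
    return poisoned.clamp(0.0, 1.0)
```

Poisoned copies produced this way can keep their original labels, in line with the no-label and clean-label settings mentioned above.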
- PiDAn: A Coherence Optimization Approach for Backdoor Attack Detection and Mitigation in Deep Neural Networks [22.900501880865658]
Backdoor attacks pose a new threat to Deep Neural Networks (DNNs).
We propose PiDAn, an algorithm based on coherence optimization that purifies the poisoned data.
Our PiDAn algorithm can detect more than 90% of infected classes and identify 95% of poisoned samples.
arXiv Detail & Related papers (2022-03-17T12:37:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.