Is It Possible to Backdoor Face Forgery Detection with Natural Triggers?
- URL: http://arxiv.org/abs/2401.00414v1
- Date: Sun, 31 Dec 2023 07:16:10 GMT
- Title: Is It Possible to Backdoor Face Forgery Detection with Natural Triggers?
- Authors: Xiaoxuan Han, Songlin Yang, Wei Wang, Ziwen He, Jing Dong
- Abstract summary: We propose a novel analysis-by-synthesis backdoor attack against face forgery detection models.
Our method achieves a high attack success rate (over 99%) and incurs a small model accuracy drop (below 0.2%) with a low poisoning rate (less than 3%).
- Score: 20.54640502001717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks have significantly improved the performance of face
forgery detection models in discriminating Artificial Intelligence Generated Content (AIGC). However, their security is significantly threatened by the
injection of triggers during model training (i.e., backdoor attacks). Although
existing backdoor defenses and manual data selection can mitigate attacks that use human-eye-sensitive triggers, such as patches or adversarial noise, the more challenging natural backdoor triggers remain insufficiently researched. To
further investigate natural triggers, we propose a novel analysis-by-synthesis
backdoor attack against face forgery detection models, which embeds natural
triggers in the latent space. We thoroughly study such backdoor vulnerability
from two perspectives: (1) Model Discrimination (Optimization-Based Trigger):
we adopt a substitute detection model and find the trigger by minimizing the
cross-entropy loss; (2) Data Distribution (Custom Trigger): we manipulate the
uncommon facial attributes in the long-tailed distribution to generate poisoned
samples without supervision from detection models. Furthermore, to comprehensively evaluate the detection models against the latest AIGC, we utilize both the state-of-the-art StyleGAN and Stable Diffusion for trigger generation.
Finally, these backdoor triggers introduce specific semantic features to the
generated poisoned samples (e.g., skin textures and smile), which are more
natural and robust. Extensive experiments show that our method is superior on three levels: (1) Attack Success Rate: ours achieves a high attack success rate (over 99%) and incurs a small model accuracy drop (below 0.2%) with a low poisoning rate (less than 3%); (2) Backdoor Defense: ours remains more robust against existing backdoor defense methods; (3) Human Inspection: a comprehensive user study shows that ours is less perceptible to the human eye.
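To make the optimization-based trigger concrete, the following is a minimal sketch of the general idea: a shared latent-space offset is optimized so that images decoded from poisoned latents drive a substitute detector toward the attacker's target label via the cross-entropy loss. The generator and detector interfaces, hyperparameters, and clamping are illustrative assumptions, not the authors' released implementation.

```python
# Hedged sketch of an optimization-based latent-space trigger (assumptions noted below).
# `generator` maps a latent code to a face image and `substitute_detector` outputs
# real/fake logits; both are placeholder modules, not the paper's released models.
import torch
import torch.nn.functional as F

def optimize_latent_trigger(generator, substitute_detector, latents,
                            target_label=0, steps=200, lr=0.01, max_norm=3.0):
    """Find a shared latent offset that drives the substitute detector to `target_label`."""
    delta = torch.zeros_like(latents[0], requires_grad=True)  # one trigger shared by all samples
    optimizer = torch.optim.Adam([delta], lr=lr)
    targets = torch.full((latents.size(0),), target_label,
                         dtype=torch.long, device=latents.device)

    for _ in range(steps):
        images = generator(latents + delta)          # decode poisoned latents to images
        logits = substitute_detector(images)         # substitute model's real/fake logits
        loss = F.cross_entropy(logits, targets)      # push predictions toward the target label
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        with torch.no_grad():                        # keep the trigger small and natural-looking
            delta.clamp_(-max_norm, max_norm)
    return delta.detach()
```

Under this sketch, poisoned training images would be obtained by decoding `latents + delta` and mixing them into the training set at a low poisoning rate.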
Related papers
- Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations [50.1394620328318]
Existing backdoor attacks mainly focus on balanced datasets.
We propose an effective backdoor attack named Dynamic Data Augmentation Operation (D²AO).
Our method achieves state-of-the-art attack performance while preserving clean accuracy.
arXiv Detail & Related papers (2024-10-16T18:44:22Z)
- Poisoned Forgery Face: Towards Backdoor Attacks on Face Forgery Detection [62.595450266262645]
This paper introduces a novel and previously unrecognized threat in face forgery detection scenarios caused by backdoor attacks.
By embedding backdoors into models, attackers can deceive detectors into producing erroneous predictions for forged faces.
We propose the Poisoned Forgery Face framework, which enables clean-label backdoor attacks on face forgery detectors.
arXiv Detail & Related papers (2024-02-18T06:31:05Z)
- Backdoor Attack against One-Class Sequential Anomaly Detection Models [10.020488631167204]
We explore compromising deep sequential anomaly detection models by proposing a novel backdoor attack strategy.
The attack approach comprises two primary steps: trigger generation and backdoor injection.
Experiments demonstrate the effectiveness of our proposed attack strategy by injecting backdoors on two well-established one-class anomaly detection models.
arXiv Detail & Related papers (2024-02-15T19:19:54Z)
- Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
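A minimal sketch of such confidence-based selection, assuming a clean surrogate classifier is available; the tensors and poisoning budget below are illustrative placeholders rather than the paper's actual pipeline.

```python
# Hedged sketch of confidence-driven selection of poisoning candidates.
# Assumption: a clean surrogate classifier is available; `images`, `labels`, and
# the poisoning budget are illustrative placeholders.
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_low_confidence_samples(model, images, labels, poison_budget):
    """Return indices of the `poison_budget` samples the model is least confident about."""
    probs = F.softmax(model(images), dim=1)                       # class probabilities
    confidence = probs.gather(1, labels.unsqueeze(1)).squeeze(1)  # confidence on the true label
    order = torch.argsort(confidence)                             # ascending: least confident first
    return order[:poison_budget]                                  # poison these samples
```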
arXiv Detail & Related papers (2023-10-08T18:57:36Z)
- Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency [33.42013309686333]
We propose a test-time trigger sample detection method that only needs the hard-label outputs of the victim models without any extra information.
We begin with the intriguing observation that backdoor-infected models perform similarly across different image corruptions for clean images, but perform discrepantly for trigger samples.
Extensive experiments demonstrate that TeCo outperforms state-of-the-art defenses across different backdoor attacks, datasets, and model architectures.
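A minimal sketch of this consistency check, assuming a hard-label victim model and a set of image-corruption functions; the corruptions and flagging threshold are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch of a corruption-consistency check in the spirit of TeCo.
# Assumptions: `model` returns logits and hard labels are taken via argmax; the
# corruption functions and flagging threshold are illustrative placeholders.
import torch

def corruption_consistency_flag(model, image, corruptions, threshold=0.5):
    """Flag `image` as a likely trigger sample if predictions diverge across corruptions."""
    with torch.no_grad():
        base_pred = model(image.unsqueeze(0)).argmax(dim=1)
        preds = [model(corrupt(image).unsqueeze(0)).argmax(dim=1) for corrupt in corruptions]
    agreement = torch.stack(preds).eq(base_pred).float().mean().item()
    return agreement < threshold   # low agreement -> inconsistent -> suspected trigger sample
```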
arXiv Detail & Related papers (2023-03-27T07:10:37Z)
- SATBA: An Invisible Backdoor Attack Based On Spatial Attention [7.405457329942725]
Backdoor attacks involve the training of Deep Neural Network (DNN) on datasets that contain hidden trigger patterns.
Most existing backdoor attacks suffer from two significant drawbacks: their trigger patterns are visible and easy to detect by backdoor defenses or even human inspection.
We propose a novel backdoor attack named SATBA that overcomes these limitations using spatial attention and a U-Net-based model.
arXiv Detail & Related papers (2023-02-25T10:57:41Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into losing detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Imperceptible and Robust Backdoor Attack in 3D Point Cloud [62.992167285646275]
We propose a novel imperceptible and robust backdoor attack (IRBA) to tackle this challenge.
We utilize a nonlinear and local transformation, called weighted local transformation (WLT), to construct poisoned samples with unique transformations.
Experiments on three benchmark datasets and four models show that IRBA achieves an attack success rate (ASR) above 80% in most cases, even in the presence of pre-processing techniques.
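As a toy illustration of a weighted local transformation, the sketch below rotates points by an angle weighted by their proximity to random anchor points; the anchor count, kernel width, and rotation angle are assumptions, not the paper's WLT parameters.

```python
# Hedged toy illustration of a weighted local transformation (WLT-style) on a point cloud.
# Assumptions: `points` is an (N, 3) array; anchor count, kernel width, and rotation
# angle are illustrative and not the paper's parameters.
import numpy as np

def weighted_local_transform(points, num_anchors=4, sigma=0.2, angle=0.3, seed=0):
    """Rotate points around the z-axis by an angle weighted by proximity to random anchors."""
    rng = np.random.default_rng(seed)
    anchors = points[rng.choice(len(points), num_anchors, replace=False)]
    # weight of each point = strongest Gaussian affinity to any anchor
    dists = np.linalg.norm(points[:, None, :] - anchors[None, :, :], axis=-1)
    weights = np.exp(-(dists ** 2) / (2 * sigma ** 2)).max(axis=1)        # shape (N,)
    cos, sin = np.cos(angle * weights), np.sin(angle * weights)           # per-point angles
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    return np.stack([cos * x - sin * y, sin * x + cos * y, z], axis=1)    # local, nonlinear warp
```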
arXiv Detail & Related papers (2022-08-17T03:53:10Z)
- Invisible Backdoor Attacks Using Data Poisoning in the Frequency Domain [8.64369418938889]
We propose a generalized backdoor attack method based on the frequency domain.
It can implant a backdoor without mislabeling or accessing the training process.
We evaluate our approach in the no-label and clean-label cases on three datasets.
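A minimal sketch of a frequency-domain trigger in this spirit: a fixed pattern is blended into a band of the image's 2D FFT and the result is transformed back; the frequency band and blend strength are illustrative assumptions.

```python
# Hedged sketch of embedding a trigger in the frequency domain.
# Assumptions: grayscale float images in [0, 1]; the chosen frequency band and
# blend strength are illustrative, not the paper's configuration.
import numpy as np

def add_frequency_trigger(image, strength=0.02, band=(10, 20)):
    """Blend a fixed pattern into a mid-frequency band of the image's 2D FFT."""
    spectrum = np.fft.fft2(image)
    rng = np.random.default_rng(0)                    # fixed seed -> the same trigger every time
    mask = np.zeros_like(spectrum)
    lo, hi = band
    mask[lo:hi, lo:hi] = rng.standard_normal((hi - lo, hi - lo))  # trigger pattern
    poisoned = np.fft.ifft2(spectrum + strength * np.abs(spectrum).max() * mask).real
    return np.clip(poisoned, 0.0, 1.0)
```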
arXiv Detail & Related papers (2022-07-09T07:05:53Z)
- Imperceptible Backdoor Attack: From Input Space to Feature Representation [24.82632240825927]
Backdoor attacks are rapidly emerging threats to deep neural networks (DNNs).
In this paper, we analyze the drawbacks of existing attack approaches and propose a novel imperceptible backdoor attack.
Our trigger modifies less than 1% of the pixels of a benign image, and the perturbation magnitude is 1.
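A toy illustration of such a sparse, low-magnitude input-space trigger; the pixel locations, budget, and +1 perturbation below are assumptions for illustration only, not the paper's trigger.

```python
# Hedged toy illustration of a sparse, low-magnitude input-space trigger.
# Assumptions: images are uint8 arrays of shape (H, W, C); the fixed pixel
# locations and +1 perturbation are illustrative placeholders.
import numpy as np

def apply_sparse_trigger(image, pixel_fraction=0.01, magnitude=1, seed=0):
    """Perturb at most `pixel_fraction` of the pixels by `magnitude` intensity levels."""
    h, w, _ = image.shape
    rng = np.random.default_rng(seed)                 # fixed seed -> same pixels every time
    num_pixels = int(h * w * pixel_fraction)
    ys = rng.integers(0, h, num_pixels)
    xs = rng.integers(0, w, num_pixels)
    poisoned = image.astype(np.int16)
    poisoned[ys, xs, :] += magnitude                  # tiny change on the selected pixels
    return np.clip(poisoned, 0, 255).astype(np.uint8)
```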
arXiv Detail & Related papers (2022-05-06T13:02:26Z)
- Few-shot Backdoor Defense Using Shapley Estimation [123.56934991060788]
We develop a new approach called Shapley Pruning to mitigate backdoor attacks on deep neural networks.
ShapPruning identifies the few infected neurons (under 1% of all neurons) and manages to protect the model's structure and accuracy.
Experiments demonstrate the effectiveness and robustness of our method against various attacks and tasks.
arXiv Detail & Related papers (2021-12-30T02:27:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.