Related papers: Isolate Trigger: Detecting and Eradicating Evade-Adaptive Backdoors

Isolate Trigger: Detecting and Eradicating Evade-Adaptive Backdoors

URL: http://arxiv.org/abs/2508.04094v1
Date: Wed, 06 Aug 2025 05:21:40 GMT
Title: Isolate Trigger: Detecting and Eradicating Evade-Adaptive Backdoors
Authors: Chengrui Sun, Hua Zhang, Haoran Gao, Zian Tian, Jianjin Zhao, qi Li, Hongliang Zhu, Zongliang Shen, Shang Wang, Anmin Fu,
Abstract summary: We introduce a precise, efficient and universal detection and defense framework coined as Isolate Trigger (IsTr)<n>IsTr aims to find the hidden trigger by breaking the barrier of the source features.<n>We rigorously evaluated the effectiveness of the IsTr against a series of six EAB attacks.
Score: 10.061164320086181
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: All current detection of backdoor attacks on deep learning models fall under the category of a non essential features(NEF), which focus on fighting against simple and efficient vertical class backdoor -- trigger is small, few and not overlapping with the source. Evade-adaptive backdoor (EAB) attacks have evaded NEF detection and improved training efficiency. We introduces a precise, efficient and universal detection and defense framework coined as Isolate Trigger (IsTr). IsTr aims to find the hidden trigger by breaking the barrier of the source features. Therefore, it investigates the essence of backdoor triggering, and uses Steps and Differential-Middle-Slice as components to update past theories of distance and gradient. IsTr also plays a positive role in the model, whether the backdoor exists. For example, accurately find and repair the wrong identification caused by deliberate or unintentional training in automatic driving. Extensive experiments on robustness scross various tasks, including MNIST, facial recognition, and traffic sign recognition, confirm the high efficiency, generality and precision of the IsTr. We rigorously evaluated the effectiveness of the IsTr against a series of six EAB attacks, including Badnets, Sin-Wave, Multi-trigger, SSBAs, CASSOCK, HCB. None of these countermeasures evade, even when attacks are combined and the trigger and source overlap.

Related papers

Coward: Toward Practical Proactive Federated Backdoor Defense via Collision-based Watermark [90.94234374893287]
We introduce a new proactive defense, dubbed Coward, inspired by our discovery of multi-backdoor collision effects.<n>In general, we detect attackers by evaluating whether the server-injected, conflicting global watermark is erased during local training rather than retained.
arXiv Detail & Related papers (2025-08-04T06:51:33Z)
A4O: All Trigger for One sample [10.78460062665304]
We show that proposed backdoor defenders often rely on the assumption that triggers would appear in a unified way.<n>In this paper, we show that this naive assumption can create a loophole, allowing more sophisticated backdoor attacks to bypass.<n>We design a novel backdoor attack mechanism that incorporates multiple types of backdoor triggers, focusing on stealthiness and effectiveness.
arXiv Detail & Related papers (2025-01-13T10:38:58Z)
DMGNN: Detecting and Mitigating Backdoor Attacks in Graph Neural Networks [30.766013737094532]
We propose DMGNN against out-of-distribution (OOD) and in-distribution (ID) graph backdoor attacks. DMGNN can easily identify the hidden ID and OOD triggers via predicting label transitions based on counterfactual explanation. DMGNN far outperforms the state-of-the-art (SOTA) defense methods, reducing the attack success rate to 5% with almost negligible degradation in model performance.
arXiv Detail & Related papers (2024-10-18T01:08:03Z)
Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning. This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities. In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
Evolutionary Trigger Detection and Lightweight Model Repair Based Backdoor Defense [10.310546695762467]
Deep Neural Networks (DNNs) have been widely used in many areas such as autonomous driving and face recognition. A backdoor in the DNN model can be activated by a poisoned input with trigger and leads to wrong prediction. We propose an efficient backdoor defense based on evolutionary trigger detection and lightweight model repair.
arXiv Detail & Related papers (2024-07-07T14:50:59Z)
LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning [49.174341192722615]
Backdoor attack poses a significant security threat to Deep Learning applications. Recent papers have introduced attacks using sample-specific invisible triggers crafted through special transformation functions. We introduce a novel backdoor attack LOTUS to address both evasiveness and resilience.
arXiv Detail & Related papers (2024-03-25T21:01:29Z)
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning [85.2564206440109]
This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defenses. We introduce the emphtoolns attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z)
Backdoor Mitigation by Correcting the Distribution of Neural Activations [30.554700057079867]
Backdoor (Trojan) attacks are an important type of adversarial exploit against deep neural networks (DNNs) We analyze an important property of backdoor attacks: a successful attack causes an alteration in the distribution of internal layer activations for backdoor-trigger instances. We propose an efficient and effective method that achieves post-training backdoor mitigation by correcting the distribution alteration.
arXiv Detail & Related papers (2023-08-18T22:52:29Z)
Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks. backdoor attack is an emerging yet threatening training-phase threat. We propose a sparse and invisible backdoor attack (SIBA)
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
Gradient Shaping: Enhancing Backdoor Attack Against Reverse Engineering [39.11590429626592]
gradient-based trigger inversion is considered to be among the most effective backdoor detection techniques. Our study shows that existing attacks tend to inject the backdoor characterized by a low change rate around trigger-carrying inputs. We design a new attack enhancement called textitGradient Shaping (GRASP) to reduce the change rate of a backdoored model with regard to the trigger.
arXiv Detail & Related papers (2023-01-29T01:17:46Z)
Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure. We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
arXiv Detail & Related papers (2022-11-02T15:39:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.