Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware
Minimization
- URL: http://arxiv.org/abs/2304.11823v1
- Date: Mon, 24 Apr 2023 05:13:52 GMT
- Title: Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware
Minimization
- Authors: Mingli Zhu, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu
- Abstract summary: Fine-tuning based on benign data is a natural defense to erase the backdoor effect in a backdoored model.
We propose FTSAM, a novel backdoor defense paradigm that aims to shrink the norms of backdoor-related neurons by incorporating sharpness-aware minimization with fine-tuning.
- Score: 27.964431092997504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor defense, which aims to detect or mitigate the effect of malicious
triggers introduced by attackers, is becoming increasingly critical for machine
learning security and integrity. Fine-tuning based on benign data is a natural
defense to erase the backdoor effect in a backdoored model. However, recent
studies show that, given limited benign data, vanilla fine-tuning has poor
defense performance. In this work, we provide an in-depth study of fine-tuning
the backdoored model from the neuron perspective and find that backdoor-related
neurons fail to escape the local minimum during fine-tuning. Inspired by the
observation that backdoor-related neurons often have larger norms, we
propose FTSAM, a novel backdoor defense paradigm that aims to shrink the norms
of backdoor-related neurons by incorporating sharpness-aware minimization with
fine-tuning. We demonstrate the effectiveness of our method on several
benchmark datasets and network architectures, where it achieves
state-of-the-art defense performance. Overall, our work provides a promising
avenue for improving the robustness of machine learning models against backdoor
attacks.
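To make the mechanism concrete, the following is a minimal sketch of a sharpness-aware minimization (SAM) update used for fine-tuning on a benign batch. It is not the authors' released FTSAM implementation; the perturbation radius rho, the optimizer, and the data pipeline are placeholder assumptions.

```python
import torch
import torch.nn.functional as F

def sam_finetune_step(model, inputs, targets, optimizer, rho=0.05):
    """One sharpness-aware minimization (SAM) step on a benign batch.

    Step 1 perturbs the weights along the gradient direction (radius rho)
    to reach a nearby high-loss point; step 2 computes the gradient at the
    perturbed weights and applies it to the original weights.
    """
    # --- Step 1: ascend to the worst-case neighbour within radius rho ---
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    grad_norm = torch.norm(torch.stack(
        [p.grad.norm(p=2) for p in model.parameters() if p.grad is not None]))
    eps = {}
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)                      # move to w + e(w)
            eps[p] = e
    optimizer.zero_grad()

    # --- Step 2: sharpness-aware gradient, applied at the original weights ---
    F.cross_entropy(model(inputs), targets).backward()
    with torch.no_grad():
        for p, e in eps.items():
            p.sub_(e)                      # restore the original weights
    optimizer.step()
    optimizer.zero_grad()
```

Looping this step over a small benign set is the fine-tuning regime the abstract refers to; the claimed effect is that neurons with unusually large norms (the backdoor-related ones) are driven out of the sharp minimum that vanilla fine-tuning cannot escape.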
Related papers
- Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models [68.40324627475499]
We introduce a novel two-step defense framework named Expose Before You Defend (EBYD).
EBYD unifies existing backdoor defense methods into a comprehensive defense system with enhanced performance.
We conduct extensive experiments on 10 image attacks and 6 text attacks across 2 vision datasets and 4 language datasets.
arXiv Detail & Related papers (2024-10-25T09:36:04Z)
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
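As a rough illustration of the unlearning idea (not the paper's token-level regime for multimodal contrastive models), a single unlearning step can combine gradient ascent on the crafted poisoned batch with ordinary descent on a clean batch; the weighting alpha is an assumed hyperparameter.

```python
import torch.nn.functional as F

def unlearning_step(model, optimizer, poisoned_x, poisoned_y,
                    clean_x, clean_y, alpha=1.0):
    """Ascend the loss on the crafted poisoned batch to forget the
    trigger-target shortcut, while descending on a clean batch to
    preserve utility. A generic sketch of backdoor unlearning."""
    optimizer.zero_grad()
    forget = -F.cross_entropy(model(poisoned_x), poisoned_y)   # push away from the target
    retain = F.cross_entropy(model(clean_x), clean_y)          # keep clean accuracy
    (alpha * forget + retain).backward()
    optimizer.step()
```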
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- Fusing Pruned and Backdoored Models: Optimal Transport-based Data-free Backdoor Mitigation [22.698855006036748]
Backdoor attacks present a serious security threat to deep neural networks (DNNs).
We propose a novel data-free defense method named Optimal Transport-based Backdoor Repairing (OTBR) in this work.
To our knowledge, this is the first work to apply OT and model fusion techniques to backdoor defense.
arXiv Detail & Related papers (2024-08-28T15:21:10Z)
- Unveiling and Mitigating Backdoor Vulnerabilities based on Unlearning Weight Changes and Backdoor Activeness [23.822040810285717]
Unlearning models with clean data and then learning a pruning mask have contributed to backdoor defense.
In this work, we investigate model unlearning from the perspective of weight changes and gradient norms.
In the first stage, an efficient Neuron Weight Change (NWC)-based Backdoor Reinitialization is proposed based on observation 1.
In the second stage, based on observation 2, we design an Activeness-Aware Fine-Tuning to replace the vanilla fine-tuning.
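A rough sketch of what a neuron-weight-change-based reinitialization could look like is given below; the layer types handled, the 5% selection ratio, and the Kaiming reinitialization are illustrative assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

def reinit_by_weight_change(original: nn.Module, unlearned: nn.Module, ratio=0.05):
    """Reset the neurons whose weights changed most during clean-data unlearning.

    For each conv/linear layer, the per-neuron change is the L2 distance between
    the original and unlearned filters; the top `ratio` fraction is reinitialized.
    Assumes both models share the same architecture.
    """
    for (_, mod_o), (_, mod_u) in zip(original.named_modules(), unlearned.named_modules()):
        if not isinstance(mod_o, (nn.Conv2d, nn.Linear)):
            continue
        w_o, w_u = mod_o.weight.detach(), mod_u.weight.detach()
        nwc = (w_o - w_u).flatten(1).norm(dim=1)       # one score per output neuron
        k = max(1, int(ratio * nwc.numel()))
        top = torch.topk(nwc, k).indices
        with torch.no_grad():
            fresh = torch.empty_like(w_o)
            nn.init.kaiming_normal_(fresh)
            mod_o.weight[top] = fresh[top]              # reinitialize suspect neurons
    return original
```

The reinitialized model would then go through the second-stage fine-tuning described above.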
arXiv Detail & Related papers (2024-05-30T17:41:32Z)
- Unified Neural Backdoor Removal with Only Few Clean Samples through Unlearning and Relearning [4.623498459985644]
Neural backdoors pose a serious security threat as they allow attackers to maliciously alter model behavior.
In this work, we introduce a novel method for comprehensive and effective elimination of backdoors, called ULRL.
arXiv Detail & Related papers (2024-05-23T16:49:09Z)
- Reconstructive Neuron Pruning for Backdoor Defense [96.21882565556072]
We propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons.
In RNP, unlearning is operated at the neuron level while recovering is operated at the filter level, forming an asymmetric reconstructive learning procedure.
We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks.
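As a condensed sketch of the recovery-and-pruning half of such a procedure (the mask threshold, optimizer, and forward-hook mechanics are assumptions, not RNP's exact formulation), one could learn per-filter masks on a few clean samples over a frozen, unlearned model and prune the filters whose mask fails to recover:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def recover_and_prune(unlearned: nn.Module, clean_loader, steps=500, lr=0.1, threshold=0.1):
    """Learn a per-filter mask that restores clean accuracy of an unlearned model;
    filters whose mask stays below `threshold` are treated as backdoor neurons
    and zeroed out. The model's own weights stay frozen throughout."""
    masks = {}
    for name, mod in unlearned.named_modules():
        if isinstance(mod, nn.Conv2d):
            m = torch.ones(mod.out_channels, requires_grad=True)
            masks[name] = m
            # scale each filter's output by its (trainable) mask entry
            mod.register_forward_hook(
                lambda module, inp, out, m=m: out * m.view(1, -1, 1, 1))
    for p in unlearned.parameters():
        p.requires_grad_(False)

    opt = torch.optim.SGD(masks.values(), lr=lr)
    data = iter(clean_loader)
    for _ in range(steps):
        try:
            x, y = next(data)
        except StopIteration:
            data = iter(clean_loader)
            x, y = next(data)
        opt.zero_grad()
        F.cross_entropy(unlearned(x), y).backward()
        opt.step()
        for m in masks.values():
            m.data.clamp_(0.0, 1.0)          # keep masks in [0, 1]

    with torch.no_grad():                     # prune filters that did not recover
        for name, mod in unlearned.named_modules():
            if name in masks:
                dead = masks[name] < threshold
                mod.weight[dead] = 0.0
                if mod.bias is not None:
                    mod.bias[dead] = 0.0
    return unlearned, masks
```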
arXiv Detail & Related papers (2023-05-24T08:29:30Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
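The kind of intervention and metric involved can be sketched as follows; the residual-block wrapper and the trigger-stamping function are assumed placeholders rather than the paper's instrumentation.

```python
import torch
import torch.nn as nn

class ScaledSkip(nn.Module):
    """Wrap a residual block so its skip (identity) branch is scaled by gamma;
    gamma < 1 suppresses that shortcut's contribution."""
    def __init__(self, body: nn.Module, gamma: float = 0.5):
        super().__init__()
        self.body = body                     # the non-skip residual branch
        self.gamma = gamma

    def forward(self, x):
        return self.body(x) + self.gamma * x

@torch.no_grad()
def attack_success_rate(model, loader, apply_trigger, target_label):
    """Fraction of non-target samples classified as the attacker's target
    after the (assumed) trigger pattern is stamped onto them."""
    model.eval()
    hit, total = 0, 0
    for x, y in loader:
        keep = y != target_label
        if keep.sum() == 0:
            continue
        preds = model(apply_trigger(x[keep])).argmax(dim=1)
        hit += (preds == target_label).sum().item()
        total += keep.sum().item()
    return hit / max(total, 1)
```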
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- Confidence Matters: Inspecting Backdoors in Deep Neural Networks via Distribution Transfer [27.631616436623588]
We propose a backdoor defense DTInspector built upon a new observation.
DTInspector learns a patch that could change the predictions of most high-confidence data, and then decides the existence of backdoor.
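A simplified reading of the patch-learning step is sketched below; the patch size, placement, and untargeted flipping loss are assumptions, and the paper's actual statistical decision rule for declaring a backdoor is not reproduced here.

```python
import torch
import torch.nn.functional as F

def learn_flip_patch(model, high_conf_loader, steps=300, lr=0.05, patch_size=8):
    """Optimize a small additive patch that changes the model's predictions
    on high-confidence samples; if a tiny patch flips most of them, that
    asymmetry hints at a planted shortcut."""
    patch = torch.zeros(3, patch_size, patch_size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    model.eval()
    for _ in range(steps):
        for x, _ in high_conf_loader:
            with torch.no_grad():
                orig_pred = model(x).argmax(dim=1)                # current predictions
            x_p = x.clone()
            x_p[:, :, :patch_size, :patch_size] = torch.tanh(patch)  # stamp the patch
            loss = -F.cross_entropy(model(x_p), orig_pred)        # push predictions away
            opt.zero_grad()
            loss.backward()
            opt.step()
    return patch.detach()
```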
arXiv Detail & Related papers (2022-08-13T08:16:28Z)
- Few-shot Backdoor Defense Using Shapley Estimation [123.56934991060788]
We develop a new approach called Shapley Pruning (ShapPruning) to mitigate backdoor attacks on deep neural networks.
ShapPruning identifies the few infected neurons (under 1% of all neurons) and manages to protect the model's structure and accuracy.
Experiments demonstrate the effectiveness and robustness of our method against various attacks and tasks.
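As a very coarse stand-in for the Shapley estimation involved (the metric eval_fn, the per-layer scope, and the permutation sampler below are all assumptions, far simpler than the paper's estimator), one can score each filter by its average marginal contribution to a backdoor metric such as attack success on triggered samples:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def mc_shapley_scores(model, layer: nn.Conv2d, eval_fn, rounds=20):
    """Monte Carlo permutation estimate of each filter's marginal contribution
    to `eval_fn(model)` (e.g., attack success on a triggered batch).
    Filters are zeroed in random order; a filter's score is the average drop
    in the metric caused by removing it."""
    n = layer.out_channels
    scores = torch.zeros(n)
    saved = layer.weight.clone()
    for _ in range(rounds):
        layer.weight.copy_(saved)
        prev = eval_fn(model)
        for idx in torch.randperm(n):
            layer.weight[idx] = 0.0          # remove this filter from the coalition
            cur = eval_fn(model)
            scores[idx] += prev - cur         # marginal contribution of the filter
            prev = cur
    layer.weight.copy_(saved)
    return scores / rounds
```

The filters with the largest estimated contribution to the backdoor metric would be the pruning candidates (reportedly under 1% of all neurons).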
arXiv Detail & Related papers (2021-12-30T02:27:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.