Efficient Backdoor Removal Through Natural Gradient Fine-tuning
- URL: http://arxiv.org/abs/2306.17441v1
- Date: Fri, 30 Jun 2023 07:25:38 GMT
- Title: Efficient Backdoor Removal Through Natural Gradient Fine-tuning
- Authors: Nazmul Karim, Abdullah Al Arafat, Umar Khalid, Zhishan Guo and Naznin
Rahnavard
- Abstract summary: Recent backdoor attacks suggest that an adversary can exploit details of the training scheme and compromise the integrity of a deep neural network (DNN).
Our studies show that a backdoor model is usually optimized to a bad local minimum, i.e., a sharper minimum than that of a benign model.
We propose a novel backdoor purification technique, Natural Gradient Fine-tuning (NGF), which removes the backdoor by fine-tuning only one layer.
- Score: 4.753323975780736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of a deep neural network (DNN) heavily relies on the
details of the training scheme, e.g., training data, architecture, and
hyper-parameters. Recent backdoor attacks suggest that an adversary can take
advantage of such training details and compromise the integrity of a DNN. Our
studies show that a backdoor model is usually optimized to a bad local minimum,
i.e., a sharper minimum than that of a benign model. Intuitively, a backdoor
model can be purified by re-optimizing it to a smoother minimum through
fine-tuning with a small amount of clean validation data. However, fine-tuning
all DNN parameters incurs a huge computational cost and often results in
sub-par clean test performance. To address this concern, we propose a novel
backdoor purification technique, Natural Gradient Fine-tuning (NGF), which
removes the backdoor by fine-tuning only one layer. Specifically, NGF employs a
loss-surface geometry-aware optimizer that can successfully reach a smooth
minimum even under this one-layer optimization constraint. To enhance the
generalization performance of our method, we introduce a clean
data-distribution-aware regularizer based on the loss-surface curvature
matrix, i.e., the Fisher Information Matrix. Extensive experiments show that
the proposed method achieves state-of-the-art performance on a wide range of
backdoor defense benchmarks: four datasets (CIFAR10, GTSRB, Tiny-ImageNet, and
ImageNet) and 13 recent backdoor attacks, e.g., Blend, Dynamic, WaNet, and
ISSBA.
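The following is a minimal sketch of the NGF idea, not the authors' implementation: only the classification head is fine-tuned on clean data, and each gradient step is preconditioned by a crude diagonal (empirical) Fisher approximation. The names `ngf_finetune`, `backbone`, `head`, and `damping` are illustrative assumptions; the paper's optimizer and its FIM-based regularizer are more elaborate.
```python
import torch
import torch.nn.functional as F

def ngf_finetune(backbone, head, clean_loader, steps=100, lr=1e-2, damping=1e-3):
    for p in backbone.parameters():        # freeze everything except the last layer
        p.requires_grad_(False)
    params = list(head.parameters())
    for step, (x, y) in enumerate(clean_loader):
        if step >= steps:
            break
        logits = head(backbone(x))         # assumes the backbone returns features
        loss = F.cross_entropy(logits, y)
        grads = torch.autograd.grad(loss, params)
        with torch.no_grad():
            for p, g in zip(params, grads):
                fisher_diag = g.pow(2) + damping   # diagonal empirical Fisher + damping
                p -= lr * g / fisher_diag          # natural-gradient (preconditioned) step
```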
Related papers
- "No Matter What You Do": Purifying GNN Models via Backdoor Unlearning [33.07926413485209]
Backdoor attacks in GNNs rely on the attacker modifying a portion of the graph data by embedding triggers.
We present GCleaner, the first backdoor mitigation method on GNNs.
GCleaner can reduce the backdoor attack success rate to 10% with only 1% of clean data, and has almost negligible degradation in model performance.
arXiv Detail & Related papers (2024-10-02T06:30:49Z)
- Fisher Information guided Purification against Backdoor Attacks [22.412186735687786]
We propose a novel backdoor purification framework, Fisher Information guided Purification (FIP).
FIP consists of a couple of novel regularizers that aid the model in suppressing the backdoor effects and retaining the acquired knowledge of clean data distribution.
In addition, we introduce an efficient variant of FIP, dubbed Fast FIP, which significantly reduces the number of tunable parameters and obtains an impressive runtime gain of almost $5\times$.
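A hedged sketch of a Fisher-information regularizer in the spirit of FIP: an EWC-style penalty keeps parameters that matter for the clean data distribution (large diagonal Fisher values) close to their current values while fine-tuning suppresses the backdoor. `theta0`, `fisher_diag`, and `lam` are illustrative, not the paper's exact formulation.
```python
import torch
import torch.nn.functional as F

def fisher_penalized_loss(model, x, y, theta0, fisher_diag, lam=1.0):
    ce = F.cross_entropy(model(x), y)
    # theta0 and fisher_diag are tensor lists aligned with model.parameters()
    reg = sum((f * (p - p0).pow(2)).sum()
              for p, p0, f in zip(model.parameters(), theta0, fisher_diag))
    return ce + lam * reg
```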
arXiv Detail & Related papers (2024-09-01T23:09:44Z)
- Augmented Neural Fine-Tuning for Efficient Backdoor Purification [16.74156528484354]
Recent studies have revealed the vulnerability of deep neural networks (DNNs) to various backdoor attacks.
We propose Neural mask Fine-Tuning (NFT), which aims to optimally re-organize the neuron activities.
NFT relaxes the trigger synthesis process and eliminates the requirement of the adversarial search module.
arXiv Detail & Related papers (2024-07-14T02:36:54Z)
- OCGEC: One-class Graph Embedding Classification for DNN Backdoor Detection [18.11795712499763]
This study proposes a novel one-class classification framework called One-class Graph Embedding Classification (OCGEC).
OCGEC uses GNNs for model-level backdoor detection with only a small amount of clean data.
In comparison to other baselines, it achieves AUC scores of more than 98% on a number of tasks.
arXiv Detail & Related papers (2023-12-04T02:48:40Z)
- Reconstructive Neuron Pruning for Backdoor Defense [96.21882565556072]
We propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons.
In RNP, unlearning is operated at the neuron level while recovering is operated at the filter level, forming an asymmetric reconstructive learning procedure.
We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks.
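A rough sketch of the asymmetric recipe as this abstract describes it: unlearn by gradient ascent on a few clean samples, then recover by optimizing per-filter masks with the weights frozen; filters whose mask stays low are flagged as backdoor filters. `FilterMask`, `rnp_sketch`, and `thresh` are illustrative names, not the authors' code.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FilterMask(nn.Module):
    """Wraps a conv layer and scales each output filter by a learnable mask."""
    def __init__(self, conv):
        super().__init__()
        self.conv = conv
        self.m = nn.Parameter(torch.ones(conv.out_channels))

    def forward(self, x):
        return self.conv(x) * self.m.view(1, -1, 1, 1)

def rnp_sketch(model, clean_loader, mask_params, lr=1e-2, thresh=0.2):
    # Stage 1 (unlearn): gradient *ascent* on clean data. The paper unlearns
    # at the neuron level; for simplicity all parameters are free here.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for x, y in clean_loader:
        loss = -F.cross_entropy(model(x), y)   # maximize the clean loss
        opt.zero_grad(); loss.backward(); opt.step()
    # Stage 2 (recover): freeze the weights, learn only the filter-level masks.
    for p in model.parameters():
        p.requires_grad_(False)
    for m in mask_params:
        m.requires_grad_(True)
    mopt = torch.optim.SGD(mask_params, lr=lr)
    for x, y in clean_loader:
        loss = F.cross_entropy(model(x), y)    # minimize clean loss via masks only
        mopt.zero_grad(); loss.backward(); mopt.step()
    # Filters whose recovered mask stays low are flagged for pruning.
    return [m.detach() < thresh for m in mask_params]
```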
arXiv Detail & Related papers (2023-05-24T08:29:30Z)
- Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization [27.964431092997504]
Fine-tuning based on benign data is a natural defense to erase the backdoor effect in a backdoored model.
We propose FTSAM, a novel backdoor defense paradigm that aims to shrink the norms of backdoor-related neurons by incorporating sharpness-aware minimization with fine-tuning.
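Below is one sharpness-aware minimization (SAM) fine-tuning step, following the standard two-pass SAM recipe rather than the paper's exact code; `rho` controls the size of the weight perturbation.
```python
import torch
import torch.nn.functional as F

def sam_finetune_step(model, x, y, optimizer, rho=0.05):
    # Pass 1: gradient at the current weights.
    F.cross_entropy(model(x), y).backward()
    grads = [p.grad for p in model.parameters()]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads if g is not None]))
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            e = None
            if p.grad is not None:
                e = rho * p.grad / (grad_norm + 1e-12)  # climb toward the worst case
                p.add_(e)
            eps.append(e)
    optimizer.zero_grad()
    # Pass 2: gradient at the perturbed (sharpness-probing) weights.
    F.cross_entropy(model(x), y).backward()
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)                               # restore the original weights
    optimizer.step()                                    # descend with the SAM gradient
    optimizer.zero_grad()
```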
arXiv Detail & Related papers (2023-04-24T05:13:52Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
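A hedged sketch of that observation: scale the skip (identity) branch of a residual block by gamma < 1 and measure how the attack success rate responds. `ScaledResidual` and `gamma` are illustrative names, not the paper's code.
```python
import torch.nn as nn

class ScaledResidual(nn.Module):
    def __init__(self, block, gamma=0.5):
        super().__init__()
        self.block = block   # the block's transform (non-identity) branch
        self.gamma = gamma   # shrink factor applied to the skip connection

    def forward(self, x):
        return self.block(x) + self.gamma * x   # suppressed shortcut output
```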
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization [84.57695474130273]
Gate-based or importance-based pruning methods aim to remove the least important channels.
GDP can be plugged in before convolutional layers, without bells and whistles, to control the on-off state of each channel.
Experiments conducted over CIFAR-10 and ImageNet datasets show that the proposed GDP achieves the state-of-the-art performance.
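An illustrative sketch of a per-channel gate in this spirit: a learnable scalar multiplies each channel, and a polarization penalty pushes every gate toward 0 (prune) or 1 (keep). The paper's exact polarization function differs; this quadratic bump is just one choice that vanishes at both poles.
```python
import torch
import torch.nn as nn

class ChannelGate(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.g = nn.Parameter(torch.ones(channels))   # one gate per channel

    def forward(self, x):                             # x: (N, C, H, W)
        return x * self.g.view(1, -1, 1, 1)

    def polarization_loss(self):
        # Zero at g = 0 and g = 1, maximal at g = 0.5.
        return (self.g * (1 - self.g)).abs().sum()
```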
arXiv Detail & Related papers (2021-09-06T03:17:10Z)
- Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
arXiv Detail & Related papers (2021-06-18T01:03:13Z)
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a $\textit{slow start, fast decay}$ learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
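A sketch of a "slow start, fast decay" learning-rate schedule as the summary describes it; the exact shape (linear warmup followed by exponential decay) and the default constants are assumptions, not the paper's values.
```python
def slow_start_fast_decay(step, warmup=50, peak=0.01, decay=0.05):
    if step < warmup:
        return peak * (step + 1) / warmup           # slow linear warmup
    return peak * (1.0 - decay) ** (step - warmup)  # fast exponential decay

# Example usage: lrs = [slow_start_fast_decay(s) for s in range(200)]
```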
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered by a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)