Efficient Backdoor Removal Through Natural Gradient Fine-tuning
- URL: http://arxiv.org/abs/2306.17441v1
- Date: Fri, 30 Jun 2023 07:25:38 GMT
- Title: Efficient Backdoor Removal Through Natural Gradient Fine-tuning
- Authors: Nazmul Karim, Abdullah Al Arafat, Umar Khalid, Zhishan Guo and Naznin
Rahnavard
- Abstract summary: Recent backdoor attacks suggest that an adversary can take advantage of such training details and compromise the integrity of a deep neural network (DNN)
Our studies show that a backdoor model is usually optimized to a bad local minimum, i.e., a sharper minimum compared to a benign model.
We propose a novel backdoor purification technique, Natural Gradient Fine-tuning (NGF), which removes the backdoor by fine-tuning only one layer.
- Score: 4.753323975780736
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The success of a deep neural network (DNN) heavily relies on the details of
the training scheme; e.g., training data, architectures, hyper-parameters, etc.
Recent backdoor attacks suggest that an adversary can take advantage of such
training details and compromise the integrity of a DNN. Our studies show that a
backdoor model is usually optimized to a bad local minimum, i.e., a sharper
minimum compared to a benign model. Intuitively, a backdoor model can be purified by
reoptimizing the model to a smoother minimum through fine-tuning with a small
amount of clean validation data. However, fine-tuning all DNN parameters
requires huge computational costs and often results in sub-par clean test
performance.
To address this concern, we propose a novel backdoor purification technique,
Natural Gradient Fine-tuning (NGF), which focuses on removing the backdoor by
fine-tuning only one layer. Specifically, NGF utilizes a loss surface
geometry-aware optimizer that can successfully overcome the challenge of
reaching a smooth minimum under a one-layer optimization scenario. To enhance
the generalization performance of our proposed method, we introduce a clean
data distribution-aware regularizer based on the loss-surface curvature
matrix, i.e., the Fisher Information Matrix. Extensive experiments show
that the proposed method achieves state-of-the-art performance on a wide range
of backdoor defense benchmarks: four different datasets (CIFAR10, GTSRB,
Tiny-ImageNet, and ImageNet) and 13 recent backdoor attacks, e.g., Blend,
Dynamic, WaNet, ISSBA.
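As a rough, self-contained illustration (not the authors' implementation), a single natural-gradient step on one layer can be sketched in NumPy: the averaged gradient is preconditioned by a damped empirical Fisher matrix, the same curvature object the proposed regularizer draws on. The layer, data, and hyper-parameters below are toy assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def natural_gradient_step(w, X, y, lr=0.1, damping=0.05):
    """One natural-gradient update of a single sigmoid output layer.

    Empirical Fisher: F ~= mean_i g_i g_i^T over per-sample gradients g_i.
    The update preconditions the average gradient with (F + damping*I)^-1.
    """
    p = sigmoid(X @ w)                       # predictions
    per_sample = (p - y)[:, None] * X        # per-sample log-loss gradients
    g = per_sample.mean(axis=0)              # average gradient
    F = per_sample.T @ per_sample / len(X)   # empirical Fisher matrix
    F += damping * np.eye(len(w))            # damping keeps F well-conditioned
    return w - lr * np.linalg.solve(F, g)    # w <- w - lr * F^-1 g

def log_loss(w, X, y):
    p = np.clip(sigmoid(X @ w), 1e-9, 1 - 1e-9)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Toy "clean validation set" for a 5-weight layer.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 5))
y = (X @ np.array([1.0, -2.0, 0.5, 0.0, 1.5]) > 0).astype(float)

w = np.zeros(5)
before = log_loss(w, X, y)
for _ in range(20):
    w = natural_gradient_step(w, X, y)
after = log_loss(w, X, y)
```

Full NGF additionally restricts the update to one layer of a trained network and adds the Fisher-based regularizer on clean data; the sketch only shows the preconditioning idea.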
Related papers
- Augmented Neural Fine-Tuning for Efficient Backdoor Purification [16.74156528484354]
Recent studies have revealed the vulnerability of deep neural networks (DNNs) to various backdoor attacks.
We propose Neural mask Fine-Tuning (NFT), which aims to optimally re-organize neuron activities.
NFT relaxes the trigger synthesis process and eliminates the requirement of the adversarial search module.
arXiv Detail & Related papers (2024-07-14T02:36:54Z)
- OCGEC: One-class Graph Embedding Classification for DNN Backdoor Detection [18.11795712499763]
This study proposes a novel one-class classification framework called One-class Graph Embedding Classification (OCGEC).
OCGEC uses GNNs for model-level backdoor detection with only a small amount of clean data.
In comparison to other baselines, it achieves AUC scores of more than 98% on a number of tasks.
arXiv Detail & Related papers (2023-12-04T02:48:40Z)
- Reconstructive Neuron Pruning for Backdoor Defense [96.21882565556072]
We propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons.
In RNP, unlearning is operated at the neuron level while recovering is operated at the filter level, forming an asymmetric reconstructive learning procedure.
We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks.
arXiv Detail & Related papers (2023-05-24T08:29:30Z)
- Enhancing Fine-Tuning Based Backdoor Defense with Sharpness-Aware Minimization [27.964431092997504]
Fine-tuning based on benign data is a natural defense to erase the backdoor effect in a backdoored model.
We propose FTSAM, a novel backdoor defense paradigm that aims to shrink the norms of backdoor-related neurons by incorporating sharpness-aware minimization with fine-tuning.
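Sharpness-aware minimization itself is easy to sketch (this is a generic SAM step, not the FTSAM defense): ascend to a first-order worst-case point within an L2 ball of radius rho, then update the original weights with the gradient computed there. The quadratic toy objective is an assumption for demonstration.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware minimization (SAM) step.

    Move to the approximate worst point within an L2 ball of radius rho,
    then apply the gradient computed there to the original weights w.
    """
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # first-order worst-case perturbation
    return w - lr * grad_fn(w + eps)             # descend using the "sharp" gradient

# Toy fine-tuning objective: an ill-conditioned quadratic bowl.
H = np.diag([10.0, 1.0])
grad = lambda w: H @ w            # gradient of 0.5 * w^T H w
loss = lambda w: 0.5 * w @ H @ w

w = np.array([1.0, 1.0])
start = loss(w)
for _ in range(50):
    w = sam_step(w, grad, lr=0.05, rho=0.05)
end = loss(w)
```

FTSAM combines this kind of step with fine-tuning on benign data; the sketch shows only the SAM update rule.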
arXiv Detail & Related papers (2023-04-24T05:13:52Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
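The knob being studied can be shown with a toy residual block (a simplified stand-in, not the paper's procedure): a scale factor gamma on the skip connection, where gamma = 1 is the standard block and gamma < 1 suppresses the shortcut whose reduction the paper links to a drop in attack success rate.

```python
import numpy as np

def residual_block(x, W, gamma=1.0):
    """Toy residual block: out = gamma * skip + ReLU(x @ W).

    gamma scales the skip connection; gamma = 1 is the standard
    residual block, gamma < 1 suppresses the shortcut.
    """
    return gamma * x + np.maximum(0.0, x @ W)

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 8)) * 0.1

full = residual_block(x, W, gamma=1.0)        # normal forward pass
suppressed = residual_block(x, W, gamma=0.5)  # skip connection halved
```

The defense then measures ASR while sweeping gamma on key blocks; that evaluation loop is not shown here.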
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- One-shot Neural Backdoor Erasing via Adversarial Weight Masking [8.345632941376673]
Adversarial Weight Masking (AWM) is a novel method capable of erasing the neural backdoors even in the one-shot setting.
AWM can largely improve the purifying effects over other state-of-the-art methods on various available training dataset sizes.
arXiv Detail & Related papers (2022-07-10T16:18:39Z)
- Don't Knock! Rowhammer at the Backdoor of DNN Models [19.13129153353046]
We present an end-to-end backdoor injection attack realized on actual hardware on a model using Rowhammer as the fault injection method.
We propose a novel network training algorithm based on constrained optimization to achieve a realistic backdoor injection attack in hardware.
arXiv Detail & Related papers (2021-10-14T19:43:53Z)
- GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization [84.57695474130273]
Gate-based or importance-based pruning methods aim to remove channels whose importance is smallest.
GDP can be plugged before convolutional layers without bells and whistles, to control the on-and-off of each channel.
Experiments conducted over CIFAR-10 and ImageNet datasets show that the proposed GDP achieves the state-of-the-art performance.
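A minimal sketch of the gate idea (the polarization penalty here is a simplified stand-in for GDP's differentiable polarization, not its actual formulation): each channel is multiplied by a gate, and a regularizer pushes gates toward binary on/off values.

```python
import numpy as np

def apply_gates(features, gates):
    """Scale each channel of a (batch, channels, H, W) tensor by its gate."""
    return features * gates.reshape(1, -1, 1, 1)

def polarization_penalty(gates):
    """Toy polarization regularizer: g * (1 - g) is zero when a gate is
    fully on (1) or fully off (0), and maximal at 0.5, pushing each gate
    toward a binary keep/prune decision."""
    return float(np.sum(gates * (1.0 - gates)))

feats = np.ones((2, 3, 4, 4))          # dummy feature maps, 3 channels
gates = np.array([1.0, 0.0, 0.5])      # keep, prune, undecided
out = apply_gates(feats, gates)
```

Channels whose gates settle at zero can then be removed from the network, which is how gate-based pruning realizes the sparsity.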
arXiv Detail & Related papers (2021-09-06T03:17:10Z)
- Effective Model Sparsification by Scheduled Grow-and-Prune Methods [73.03533268740605]
We propose a novel scheduled grow-and-prune (GaP) methodology without pre-training the dense models.
Experiments have shown that such models can match or beat the quality of highly optimized dense models at 80% sparsity on a variety of tasks.
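One grow-and-prune cycle can be sketched with magnitude pruning plus a regrow step (a simplification: the actual GaP schedule grows and prunes partitions of layers over the course of training):

```python
import numpy as np

def prune_smallest(w, sparsity):
    """Zero out the smallest-magnitude `sparsity` fraction of weights."""
    k = int(w.size * sparsity)
    out = w.copy()
    if k == 0:
        return out
    thresh = np.partition(np.abs(w), k - 1, axis=None)[k - 1]
    out[np.abs(out) <= thresh] = 0.0
    return out

def regrow(w_pruned, w_dense):
    """'Grow' step: restore pruned positions from dense weights so they
    can be trained again before the next prune."""
    return np.where(w_pruned == 0.0, w_dense, w_pruned)

rng = np.random.default_rng(2)
w = rng.normal(size=(10, 10))
pruned = prune_smallest(w, 0.8)   # 80% sparsity, as in the quoted result
```

In a real GaP loop, training steps would run between the grow and prune phases; they are omitted here.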
arXiv Detail & Related papers (2021-06-18T01:03:13Z)
- A Simple Fine-tuning Is All You Need: Towards Robust Deep Learning Via Adversarial Fine-tuning [90.44219200633286]
We propose a simple yet very effective adversarial fine-tuning approach based on a "slow start, fast decay" learning rate scheduling strategy.
Experimental results show that the proposed adversarial fine-tuning approach outperforms the state-of-the-art methods on CIFAR-10, CIFAR-100 and ImageNet datasets.
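The scheduling strategy can be sketched as a small function (the warm-up fraction, base rate, and decay factor below are assumed values, not the paper's):

```python
def slow_start_fast_decay(step, total_steps, base_lr=0.01,
                          warmup_frac=0.3, decay=0.9):
    """Illustrative "slow start, fast decay" learning rate schedule:
    linear warm-up over the first warmup_frac of training (slow start),
    then per-step exponential decay (fast decay)."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr * decay ** (step - warmup_steps)

# Learning rate over a 100-step toy run.
lrs = [slow_start_fast_decay(s, 100) for s in range(100)]
```

The rate thus climbs to `base_lr` during warm-up and then shrinks geometrically; in the defense this schedule drives the adversarial fine-tuning phase.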
arXiv Detail & Related papers (2020-12-25T20:50:15Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.