Defense against Backdoor Attacks via Identifying and Purifying Bad
Neurons
- URL: http://arxiv.org/abs/2208.06537v1
- Date: Sat, 13 Aug 2022 01:10:20 GMT
- Title: Defense against Backdoor Attacks via Identifying and Purifying Bad
Neurons
- Authors: Mingyuan Fan, Yang Liu, Cen Chen, Ximeng Liu, Wenzhong Guo
- Abstract summary: We propose a novel backdoor defense method to mark and purify infected neurons in neural networks.
A new metric, called benign salience, can identify infected neurons with higher accuracy than the commonly used metric in backdoor defense.
A new Adaptive Regularization (AR) mechanism is proposed to assist in purifying these identified infected neurons.
- Score: 36.57541102989073
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The opacity of neural networks leads to their vulnerability to backdoor
attacks, where the hidden attention of infected neurons is triggered to override
normal predictions with attacker-chosen ones. In this paper, we propose a novel
backdoor defense method to mark and purify the infected neurons in the
backdoored neural networks. Specifically, we first define a new metric, called
benign salience. By combining the first-order gradient to retain the
connections between neurons, benign salience can identify the infected neurons
with higher accuracy than the commonly used metric in backdoor defense. Then, a
new Adaptive Regularization (AR) mechanism is proposed to assist in purifying
these identified infected neurons via fine-tuning. Due to the ability to adapt
to different magnitudes of parameters, AR can provide faster and more stable
convergence than the common regularization mechanism in neuron purifying.
Extensive experimental results demonstrate that our method can erase the
backdoor in neural networks with negligible performance degradation.
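The abstract describes benign salience and Adaptive Regularization only at a high level, so the sketch below is a hedged reconstruction rather than the paper's algorithm: benign salience is approximated as a first-order (weight times gradient) saliency of the clean-data loss aggregated per convolutional output channel, and AR is approximated as a penalty on flagged neurons whose strength is rescaled by each filter's own magnitude. The function names (`benign_salience`, `adaptive_reg_finetune`), the flagging rule, and all hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def benign_salience(model, clean_loader, device, n_batches=8):
    """First-order proxy for benign salience: |w * dL_clean/dw|, summed per Conv2d
    output channel ("neuron"). Neurons with low scores contribute little to benign
    behaviour and are candidates for purification (an assumed reading of the metric)."""
    convs = {name: m for name, m in model.named_modules() if isinstance(m, nn.Conv2d)}
    scores = {name: torch.zeros(m.out_channels, device=device) for name, m in convs.items()}
    model.eval()
    for i, (x, y) in enumerate(clean_loader):
        if i >= n_batches:
            break
        model.zero_grad()
        F.cross_entropy(model(x.to(device)), y.to(device)).backward()
        for name, m in convs.items():
            scores[name] += (m.weight * m.weight.grad).abs().sum(dim=(1, 2, 3)).detach()
    return scores  # {layer_name: per-neuron salience}

def adaptive_reg_finetune(model, clean_loader, suspicious, device,
                          epochs=5, lr=1e-3, lam=1e-2):
    """Fine-tune on clean data while shrinking the filters of flagged neurons.
    "Adaptive" here means the penalty on each filter is rescaled by its own (detached)
    magnitude, so parameters of different scales are regularized at comparable rates --
    an assumption about AR, not the paper's exact mechanism."""
    convs = {name: m for name, m in model.named_modules() if isinstance(m, nn.Conv2d)}
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in clean_loader:
            loss = F.cross_entropy(model(x.to(device)), y.to(device))
            reg = 0.0
            for name, idx in suspicious.items():    # {layer_name: tensor of neuron ids}
                w = convs[name].weight[idx]
                scale = 1.0 / (w.detach().abs().mean(dim=(1, 2, 3), keepdim=True) + 1e-8)
                reg = reg + (scale * w.abs()).sum()
            loss = loss + lam * reg
            opt.zero_grad()
            loss.backward()
            opt.step()

# Illustrative flagging rule (assumption): treat the k lowest-salience neurons per
# layer as infected, then purify them with the adaptive regularizer above:
#   scores = benign_salience(model, clean_loader, device)
#   suspicious = {n: torch.topk(s, k=8, largest=False).indices for n, s in scores.items()}
#   adaptive_reg_finetune(model, clean_loader, suspicious, device)
```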
Related papers
- Magnitude-based Neuron Pruning for Backdoor Defense [3.056446348447152]
Deep Neural Networks (DNNs) are known to be vulnerable to backdoor attacks.
Recent research reveals that backdoors can be erased from infected DNNs by pruning a specific group of neurons.
We propose a Magnitude-based Neuron Pruning (MNP) method to detect and prune backdoor neurons.
arXiv Detail & Related papers (2024-05-28T02:05:39Z)
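MNP's precise magnitude criterion is not given in the summary above, so the snippet below only sketches the general idea under stated assumptions: score every convolutional neuron by the L1 norm of its filter and zero out neurons whose norms are statistical outliers within their layer. The outlier rule (mean plus/minus k standard deviations) and the function name are illustrative, not the paper's algorithm.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def magnitude_based_prune(model, k=3.0):
    """Hedged sketch of magnitude-based neuron pruning: per Conv2d layer, compute
    the L1 norm of each output filter and zero filters whose norm deviates from the
    layer mean by more than k standard deviations (an assumed criterion)."""
    for m in model.modules():
        if not isinstance(m, nn.Conv2d):
            continue
        norms = m.weight.abs().sum(dim=(1, 2, 3))   # one magnitude per neuron
        outliers = (norms - norms.mean()).abs() > k * norms.std()
        m.weight[outliers] = 0.0                    # prune by zeroing the whole filter
        if m.bias is not None:
            m.bias[outliers] = 0.0
    return model
```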
- Rethinking Pruning for Backdoor Mitigation: An Optimization Perspective [19.564985801521814]
We propose an optimized Neuron Pruning (ONP) method combined with Graph Neural Network (GNN) and Reinforcement Learning (RL) to repair backdoor models.
With a small amount of clean data, ONP can effectively prune the backdoor neurons implanted by a set of backdoor attacks at the cost of negligible performance degradation.
arXiv Detail & Related papers (2024-05-28T01:59:06Z)
- Reconstructive Neuron Pruning for Backdoor Defense [96.21882565556072]
We propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons.
In RNP, unlearning is operated at the neuron level while recovering is operated at the filter level, forming an asymmetric reconstructive learning procedure.
We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks.
arXiv Detail & Related papers (2023-05-24T08:29:30Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Improving Adversarial Transferability via Neuron Attribution-Based Attacks [35.02147088207232]
We propose the Neuron-based Attack (NAA), which conducts feature-level attacks with more accurate neuron importance estimations.
We derive an approximation scheme of neuron attribution to tremendously reduce the overhead.
Experiments confirm the superiority of our approach over the state-of-the-art benchmarks.
arXiv Detail & Related papers (2022-03-31T13:47:30Z)
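The summary above names neuron attribution without defining it, so the snippet below shows one common first-order proxy: the activation of an intermediate layer multiplied by the gradient of the target logit with respect to that activation, summed per channel. This is a generic illustration of the quantity being estimated, not NAA's refined approximation scheme; the choice of `layer` and the channel-wise aggregation are assumptions.

```python
import torch
import torch.nn as nn

def neuron_attribution(model, layer, x, target_class):
    """Generic first-order neuron-attribution proxy: activation * d(logit_target)/d(activation),
    summed per channel of `layer` (assumes a 4-D conv feature map). NAA replaces this with a
    cheaper and more accurate estimator; this sketch is for illustration only."""
    acts = {}
    def hook(_module, _inputs, out):
        out.retain_grad()                 # keep gradients for the intermediate activation
        acts["a"] = out
    handle = layer.register_forward_hook(hook)
    logits = model(x)
    handle.remove()
    logits[:, target_class].sum().backward()
    a = acts["a"]
    return (a * a.grad).sum(dim=(0, 2, 3))   # one attribution score per channel/neuron
```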
- Adversarial Robustness in Deep Learning: Attacks on Fragile Neurons [0.6899744489931016]
We identify fragile and robust neurons of deep learning architectures using nodal dropouts of the first convolutional layer.
We correlate these neurons with the distribution of adversarial attacks on the network.
arXiv Detail & Related papers (2022-01-31T14:34:07Z)
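A minimal sketch of the nodal-dropout probe described above, under assumptions about the exact protocol: zero one output channel of the first convolutional layer at a time and record the drop in clean accuracy, treating channels that cause the largest drops as the fragile candidates. Function names and the accuracy-drop criterion are illustrative.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def accuracy(model, loader, device):
    model.eval()
    correct = total = 0
    for x, y in loader:
        correct += (model(x.to(device)).argmax(dim=1) == y.to(device)).sum().item()
        total += y.numel()
    return correct / total

@torch.no_grad()
def fragility_by_nodal_dropout(model, first_conv, loader, device):
    """Drop one channel of the first conv layer at a time and measure the accuracy
    drop on clean data; a larger drop marks a more fragile neuron (a simple proxy
    for the procedure in the abstract). The bias, if any, is left untouched."""
    base = accuracy(model, loader, device)
    saved = first_conv.weight.clone()
    drops = []
    for c in range(first_conv.out_channels):
        first_conv.weight[c] = 0.0                  # "nodal dropout" of channel c
        drops.append(base - accuracy(model, loader, device))
        first_conv.weight.copy_(saved)              # restore the original filters
    return torch.tensor(drops)                      # one fragility score per channel
```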
- Few-shot Backdoor Defense Using Shapley Estimation [123.56934991060788]
We develop a new approach called Shapley Pruning to mitigate backdoor attacks on deep neural networks.
ShapPruning identifies the few infected neurons (under 1% of all neurons) and manages to protect the model's structure and accuracy.
Experiments demonstrate the effectiveness and robustness of our method against various attacks and tasks.
arXiv Detail & Related papers (2021-12-30T02:27:03Z)
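The entry above rests on estimating each neuron's Shapley value; the sketch below shows a generic Monte Carlo permutation estimator for that quantity, given any value function that evaluates the model with only a chosen subset of neurons unmasked (for example, clean accuracy or an attack-success metric). It illustrates the estimation idea only; ShapPruning's accelerations, its exact value function, and the masking mechanism are not reproduced, and the names are illustrative.

```python
import random

def shapley_values(neurons, value_fn, n_permutations=200, seed=0):
    """Monte Carlo permutation estimate of per-neuron Shapley values.
    `neurons` is a list of neuron identifiers; `value_fn(active_set)` must evaluate the
    model with only the neurons in `active_set` unmasked (how masking is done is left
    to the caller and is an assumption of this sketch)."""
    rng = random.Random(seed)
    phi = {n: 0.0 for n in neurons}
    for _ in range(n_permutations):
        order = list(neurons)
        rng.shuffle(order)
        active, prev = set(), value_fn(frozenset())
        for n in order:
            active.add(n)
            cur = value_fn(frozenset(active))
            phi[n] += cur - prev                  # marginal contribution of neuron n
            prev = cur
    return {n: total / n_permutations for n, total in phi.items()}
```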
- And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z)
- Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting [135.0863818867184]
Artificial neural variability (ANV) helps artificial neural networks learn some advantages from "natural" neural networks.
ANV plays as an implicit regularizer of the mutual information between the training data and the learned model.
It can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
arXiv Detail & Related papers (2020-11-12T06:06:33Z)
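The summary above presents ANV as an implicit regularizer inspired by the variability of biological neurons, but does not give the mechanism; the sketch below shows one simple way such variability is often emulated: perturbing the weights with small zero-mean Gaussian noise for the forward/backward pass and removing the perturbation before the update. This is an assumed stand-in, not necessarily the paper's formulation.

```python
import torch

def train_step_with_weight_noise(model, x, y, loss_fn, optimizer, sigma=1e-2):
    """One training step with transient Gaussian weight perturbations, a simple
    noise-injection regularizer used here as a hedged stand-in for artificial
    neural variability."""
    noises = []
    with torch.no_grad():
        for p in model.parameters():
            n = torch.randn_like(p) * sigma
            p.add_(n)                      # perturb weights before forward/backward
            noises.append(n)
    loss = loss_fn(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p, n in zip(model.parameters(), noises):
            p.sub_(n)                      # remove the perturbation before the update
    optimizer.step()
    return loss.item()
```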