Neural Polarizer: A Lightweight and Effective Backdoor Defense via
Purifying Poisoned Features
- URL: http://arxiv.org/abs/2306.16697v1
- Date: Thu, 29 Jun 2023 05:39:58 GMT
- Title: Neural Polarizer: A Lightweight and Effective Backdoor Defense via
Purifying Poisoned Features
- Authors: Mingli Zhu, Shaokui Wei, Hongyuan Zha, Baoyuan Wu
- Abstract summary: Recent studies have demonstrated the susceptibility of deep neural networks to backdoor attacks.
We propose a novel backdoor defense method by inserting a learnable neural polarizer into the backdoored model as an intermediate layer.
- Score: 62.82817831278743
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies have demonstrated the susceptibility of deep neural
networks to backdoor attacks. Given a backdoored model, its prediction on a
poisoned sample containing a trigger is dominated by the trigger information,
even though trigger information and benign information coexist in the sample.
Inspired by the optical polarizer, which passes light waves with particular
polarizations while filtering out waves with other polarizations, we propose a
novel backdoor defense that inserts a learnable neural polarizer into the
backdoored model as an intermediate layer, purifying poisoned samples by
filtering out trigger information while preserving benign information. The
neural polarizer is instantiated as a single lightweight linear transformation
layer, learned by solving a well-designed bi-level optimization problem on a
limited clean dataset. Compared with other fine-tuning-based defenses, which
typically adjust all parameters of the backdoored model, the proposed method
learns only one additional layer, making it more efficient and less reliant on
clean data. Extensive experiments demonstrate the effectiveness and efficiency
of our method in removing backdoors across various neural network
architectures and datasets, especially when clean data is very limited.
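To make the idea concrete, here is a minimal PyTorch sketch of how such a polarizer could be spliced into a frozen backdoored network. It is an illustration under stated assumptions, not the paper's implementation: the polarizer is assumed to be a channel-wise 1x1 convolution initialized at the identity, and the paper's bi-level objective is replaced here by plain cross-entropy fine-tuning of the polarizer on the small clean set.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralPolarizer(nn.Module):
    """Lightweight learnable linear transformation over intermediate features.
    Instantiated here as a 1x1 convolution (an illustrative assumption)."""
    def __init__(self, channels: int):
        super().__init__()
        self.linear = nn.Conv2d(channels, channels, kernel_size=1, bias=True)
        with torch.no_grad():
            # Start at the identity so the defended model initially behaves
            # exactly like the original backdoored model.
            self.linear.weight.copy_(torch.eye(channels).view(channels, channels, 1, 1))
            self.linear.bias.zero_()

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.linear(feats)

class DefendedModel(nn.Module):
    """Backdoored model with a polarizer spliced in at a chosen feature layer."""
    def __init__(self, front: nn.Module, rest: nn.Module, channels: int):
        super().__init__()
        self.front, self.rest = front, rest          # frozen backdoored stages
        self.polarizer = NeuralPolarizer(channels)   # the only trainable part

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.rest(self.polarizer(self.front(x)))

def fit_polarizer(model: DefendedModel, clean_loader, epochs: int = 10, lr: float = 1e-2):
    """Fine-tune only the polarizer on a small clean dataset.
    (The paper formulates a bi-level objective; a plain cross-entropy
    surrogate is used here purely for illustration.)"""
    for p in model.parameters():
        p.requires_grad_(False)
    for p in model.polarizer.parameters():
        p.requires_grad_(True)
    opt = torch.optim.SGD(model.polarizer.parameters(), lr=lr, momentum=0.9)
    model.eval()  # keep BatchNorm/Dropout in the frozen stages in inference mode
    for _ in range(epochs):
        for x, y in clean_loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(x), y)
            loss.backward()
            opt.step()
    return model
```

In practice, the choice of insertion point (e.g., after a late convolutional stage) and the defense-specific objective matter considerably; the sketch only fixes the plumbing of inserting and training a single extra layer.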
Related papers
- PAD-FT: A Lightweight Defense for Backdoor Attacks via Data Purification and Fine-Tuning [4.337364406035291]
Backdoor attacks pose a significant threat to deep neural networks.
We propose a novel mechanism, PAD-FT, that does not require an additional clean dataset and fine-tunes only a very small part of the model to disinfect the victim model.
Our mechanism demonstrates superior effectiveness across multiple backdoor attack methods and datasets.
arXiv Detail & Related papers (2024-09-18T15:47:23Z)
- Fisher Information guided Purification against Backdoor Attacks [22.412186735687786]
We propose a novel backdoor purification framework, Fisher Information guided Purification (FIP)
FIP consists of a couple of novel regularizers that aid the model in suppressing the backdoor effects and retaining the acquired knowledge of clean data distribution.
In addition, we introduce an efficient variant of FIP, dubbed Fast FIP, which reduces the number of tunable parameters significantly and achieves an impressive runtime gain of almost $5\times$.
arXiv Detail & Related papers (2024-09-01T23:09:44Z)
- Augmented Neural Fine-Tuning for Efficient Backdoor Purification [16.74156528484354]
Recent studies have revealed the vulnerability of deep neural networks (DNNs) to various backdoor attacks.
We propose Neural mask Fine-Tuning (NFT) with an aim to optimally re-organize the neuron activities.
NFT relaxes the trigger synthesis process and eliminates the requirement of the adversarial search module.
arXiv Detail & Related papers (2024-07-14T02:36:54Z)
- PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection [57.571451139201855]
Prediction Shift Backdoor Detection (PSBD) is a novel method for identifying backdoor samples in deep neural networks.
PSBD is motivated by an intriguing Prediction Shift (PS) phenomenon, where poisoned models' predictions on clean data often shift away from true labels towards certain other labels.
PSBD identifies backdoor training samples by computing the Prediction Shift Uncertainty (PSU), the variance in probability values when dropout layers are toggled on and off during model inference (a rough sketch of one way to compute such an uncertainty score appears after this list).
arXiv Detail & Related papers (2024-06-09T15:31:00Z)
- Lazy Layers to Make Fine-Tuned Diffusion Models More Traceable [70.77600345240867]
A novel arbitrary-in-arbitrary-out (AIAO) strategy makes watermarks resilient to fine-tuning-based removal.
Unlike existing methods that design a backdoor for the input/output space of diffusion models, our method embeds the backdoor into the feature space of sampled subpaths.
Our empirical studies on the MS-COCO, AFHQ, LSUN, CUB-200, and DreamBooth datasets confirm the robustness of AIAO.
arXiv Detail & Related papers (2024-05-01T12:03:39Z)
- Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective [65.70799289211868]
We introduce two new theory-driven trigger pattern generation methods specialized for dataset distillation.
We show that our optimization-based trigger design framework informs effective backdoor attacks on dataset distillation.
arXiv Detail & Related papers (2023-11-28T09:53:05Z)
- Polarized skylight orientation determination artificial neural network [4.834173456342489]
This paper proposes an artificial neural network to determine orientation using polarized skylight.
The degree of polarization (DOP) and angle of polarization (AOP) are directly extracted in the network.
arXiv Detail & Related papers (2021-07-06T00:19:22Z)
- Adaptive conversion of real-valued input into spike trains [91.3755431537592]
This paper presents a biologically plausible method for converting real-valued input into spike trains for processing with spiking neural networks.
The proposed method mimics the adaptive behaviour of retinal ganglion cells and allows input neurons to adapt their response to changes in the statistics of the input.
arXiv Detail & Related papers (2021-04-12T12:33:52Z)
- Face Anti-Spoofing by Learning Polarization Cues in a Real-World Scenario [50.36920272392624]
Face anti-spoofing is the key to preventing security breaches in biometric recognition applications.
Deep learning methods using RGB and infrared images demand a large amount of training data for new attacks.
We present a face anti-spoofing method for real-world scenarios that automatically learns the physical characteristics in polarization images of a real face.
arXiv Detail & Related papers (2020-03-18T03:04:03Z)
- Polarizing Front Ends for Robust CNNs [23.451381552751393]
We propose a bottom-up strategy for attenuating adversarial perturbations using a nonlinear front end which polarizes and quantizes the data.
We observe that ideal polarization can be utilized to completely eliminate perturbations, develop algorithms to learn approximately polarizing bases for data, and investigate the effectiveness of the proposed strategy on the MNIST and Fashion MNIST datasets.
arXiv Detail & Related papers (2020-02-22T00:28:41Z)
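As a rough illustration of the prediction-shift uncertainty referenced in the PSBD entry above, the snippet below scores each sample by the variance of its softmax probabilities across stochastic forward passes with dropout kept active. This is one plausible reading of the described mechanism, not the authors' implementation, and the helper names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def enable_dropout(model: nn.Module) -> None:
    """Keep dropout layers stochastic while the rest of the model stays in eval mode."""
    for m in model.modules():
        if isinstance(m, (nn.Dropout, nn.Dropout2d)):
            m.train()

@torch.no_grad()
def prediction_shift_uncertainty(model: nn.Module, x: torch.Tensor, n_passes: int = 10) -> torch.Tensor:
    """Per-sample uncertainty: variance of class probabilities across dropout passes."""
    model.eval()
    enable_dropout(model)
    probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(n_passes)])  # (T, B, C)
    return probs.var(dim=0).mean(dim=1)  # average class-wise variance per sample
```

Per the summary above, samples whose uncertainty scores deviate from the pattern seen on clean data would be the candidates flagged as backdoored.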