Verifying Neural Networks Against Backdoor Attacks
- URL: http://arxiv.org/abs/2205.06992v1
- Date: Sat, 14 May 2022 07:25:54 GMT
- Title: Verifying Neural Networks Against Backdoor Attacks
- Authors: Long H. Pham and Jun Sun
- Abstract summary: We propose an approach to verify whether a given neural network is free of backdoors up to a certain attack success rate.
Experimental results show that our approach effectively verifies the absence of a backdoor or generates backdoor triggers.
- Score: 7.5033553032683855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks have achieved state-of-the-art performance in solving many
problems, including many applications in safety/security-critical systems.
Researchers also discovered multiple security issues associated with neural
networks. One of them is backdoor attacks, i.e., a neural network may be
embedded with a backdoor such that a target output is almost always generated
in the presence of a trigger. Existing defense approaches mostly focus on
detecting whether a neural network is 'backdoored' based on heuristics, e.g.,
activation patterns. To the best of our knowledge, the only line of work which
certifies the absence of backdoor is based on randomized smoothing, which is
known to significantly reduce neural network performance. In this work, we
propose an approach to verify whether a given neural network is free of
backdoors up to a certain attack success rate. Our approach integrates
statistical sampling with abstract interpretation. Experimental results
show that our approach effectively verifies the absence of a backdoor or
generates backdoor triggers.
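The integration of statistical sampling and abstract interpretation can be pictured with a minimal sketch: draw clean inputs, over-approximate the logits over every possible value of a fixed trigger patch using interval arithmetic, and compare the fraction of inputs that could conceivably be flipped to the target class against the tolerated attack success rate. The sketch below assumes a small fully-connected ReLU network given as lists of NumPy weight matrices and bias vectors plus a Boolean trigger mask; it illustrates the general idea only and is not the paper's verification algorithm.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    # Propagate the box [lo, hi] through the affine map x -> W @ x + b.
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b

def logit_bounds(x, trigger_mask, weights, biases):
    # Bound the logits over every input that equals x outside the trigger
    # region and takes an arbitrary value in [0, 1] inside it.
    lo, hi = x.copy(), x.copy()
    lo[trigger_mask], hi[trigger_mask] = 0.0, 1.0
    for i, (W, b) in enumerate(zip(weights, biases)):
        lo, hi = interval_affine(lo, hi, W, b)
        if i < len(weights) - 1:            # ReLU on hidden layers only
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

def may_be_hijacked(x, trigger_mask, weights, biases, target):
    # If the target's upper bound cannot beat some class's lower bound,
    # no value stamped into the trigger region can flip x to `target`.
    lo, hi = logit_bounds(x, trigger_mask, weights, biases)
    return all(hi[target] > lo[k] for k in range(len(lo)) if k != target)

def sampled_backdoor_check(samples, trigger_mask, weights, biases,
                           target, max_success_rate=0.1):
    # Over-count, on sampled clean inputs, how many could possibly be
    # hijacked; pass if the rate stays below the tolerated success rate.
    flips = sum(may_be_hijacked(x, trigger_mask, weights, biases, target)
                for x in samples)
    return flips / len(samples) <= max_success_rate
```

A full verifier would additionally attach a statistical confidence bound (for example via Hoeffding's inequality) to the sampled estimate, consider candidate trigger positions, and attempt to synthesize a concrete trigger when the check fails.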
Related papers
- Reconstructive Neuron Pruning for Backdoor Defense [96.21882565556072]
We propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons.
In RNP, unlearning operates at the neuron level while recovery operates at the filter level, forming an asymmetric reconstructive learning procedure.
We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks.
arXiv Detail & Related papers (2023-05-24T08:29:30Z)
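The PyTorch sketch below is a loose, hypothetical illustration of the unlearn-then-recover idea rather than RNP's actual neuron/filter asymmetry: a per-channel gate is learned from a few clean batches, first by raising and then by lowering the clean loss, and the channels whose recovered gates stay smallest are zeroed out.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mask_based_prune(model, clean_loader, prune_ratio=0.05, steps=50, lr=0.01):
    # Hypothetical sketch: freeze the model, gate every conv layer's output
    # channels, "unlearn" by raising the clean loss, "recover" by lowering
    # it, then prune the channels whose recovered gates remain smallest.
    for p in model.parameters():
        p.requires_grad_(False)
    convs = [m for m in model.modules() if isinstance(m, nn.Conv2d)]
    masks = [torch.ones(c.out_channels, requires_grad=True) for c in convs]
    hooks = [c.register_forward_hook(
                 lambda mod, inp, out, m=m: out * m.view(1, -1, 1, 1))
             for c, m in zip(convs, masks)]
    opt = torch.optim.SGD(masks, lr=lr)
    for sign in (-1.0, +1.0):                     # -1: unlearn, +1: recover
        for _, (x, y) in zip(range(steps), clean_loader):
            loss = sign * F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
            for m in masks:
                m.data.clamp_(0.0, 1.0)
    for h in hooks:
        h.remove()
    for c, m in zip(convs, masks):                # prune low-gate channels
        k = max(1, int(prune_ratio * c.out_channels))
        idx = torch.argsort(m.detach())[:k]
        c.weight.data[idx] = 0.0
        if c.bias is not None:
            c.bias.data[idx] = 0.0
    return model
```

In RNP itself the two phases operate at different granularities (neurons for unlearning, filters for recovering), and it is this asymmetry that exposes the backdoor neurons; the sketch collapses both phases into channel gates purely for brevity.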
- FreeEagle: Detecting Complex Neural Trojans in Data-Free Cases [50.065022493142116]
A Trojan attack on deep neural networks, also known as a backdoor attack, is a typical threat to artificial intelligence.
FreeEagle is the first data-free backdoor detection method that can effectively detect complex backdoor attacks.
arXiv Detail & Related papers (2023-02-28T11:31:29Z)
- An anomaly detection approach for backdoored neural networks: face recognition as a case study [77.92020418343022]
We propose a novel backdoored network detection method based on the principle of anomaly detection.
We test our method on a novel dataset of backdoored networks and report detectability results with perfect scores.
arXiv Detail & Related papers (2022-08-22T12:14:13Z)
- AEVA: Black-box Backdoor Detection Using Adversarial Extreme Value Analysis [23.184335982913325]
We address the black-box hard-label backdoor detection problem.
We show that the objective of backdoor detection is bounded by an adversarial objective.
We propose the adversarial extreme value analysis to detect backdoors in black-box neural networks.
arXiv Detail & Related papers (2021-10-28T04:36:48Z)
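A heavily simplified, hypothetical sketch of the extreme-value flavour of such an analysis: assuming per-class adversarial perturbation maps have already been estimated by some black-box attack, score each class by how sharply its perturbation concentrates in a few pixels and flag classes whose score is an outlier. The statistic, the MAD-based outlier test, and the threshold below are illustrative choices, not the AEVA algorithm.

```python
import numpy as np

def peakedness(perturbation):
    # Extreme-value style statistic: how much the largest per-pixel
    # perturbation stands out from the typical one.
    mag = np.abs(perturbation).ravel()
    return mag.max() / (np.median(mag) + 1e-8)

def flag_backdoor_targets(per_class_perturbations, threshold=2.0):
    # Compare each class's peakedness score against the others with a
    # median-absolute-deviation anomaly index; large positive deviations
    # suggest a class whose adversarial map has a concentrated "peak".
    scores = np.array([peakedness(p) for p in per_class_perturbations])
    med = np.median(scores)
    mad = np.median(np.abs(scores - med)) + 1e-8
    anomaly = (scores - med) / (1.4826 * mad)
    return [c for c, a in enumerate(anomaly) if a > threshold]
```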
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
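A minimal sketch of the general idea of a frequency-domain trigger, assuming a grayscale image in [0, 1] that is larger than the chosen frequencies: boost a few fixed FFT coefficients so that poisoned images share a consistent spectral pattern that is hard to spot spatially. The chosen coefficients and strength are hypothetical, and the paper's actual construction and defences may differ.

```python
import numpy as np

def add_frequency_trigger(image, freqs=((30, 30), (31, 29)), strength=0.03):
    # Plant a consistent pattern in a few mid-frequency FFT coefficients,
    # leaving the spatial appearance of the image almost unchanged.
    spectrum = np.fft.fft2(image)
    for (u, v) in freqs:
        spectrum[u, v] += strength * image.size     # plant the trigger
        spectrum[-u, -v] += strength * image.size   # keep the image real
    poisoned = np.real(np.fft.ifft2(spectrum))
    return np.clip(poisoned, 0.0, 1.0)
```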
- Can You Hear It? Backdoor Attacks via Ultrasonic Triggers [31.147899305987934]
In this work, we explore backdoor attacks on automatic speech recognition systems in which we inject inaudible triggers.
Our results indicate that less than 1% of poisoned data is sufficient to deploy a backdoor attack and reach a 100% attack success rate.
arXiv Detail & Related papers (2021-07-30T12:08:16Z)
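A hypothetical sketch of this style of poisoning, assuming mono waveforms in [-1, 1] sampled at 44.1 kHz: superimpose a low-amplitude tone just above the audible range on a small fraction of clips and relabel them to the attacker's target class. The exact trigger frequency, amplitude, and poisoning pipeline in the paper may differ.

```python
import numpy as np

def add_ultrasonic_trigger(waveform, sample_rate=44100, freq_hz=21000,
                           amplitude=0.01):
    # Superimpose a quiet tone above the audible range; the recording and
    # playback chain must support it (sample_rate > 2 * freq_hz).
    t = np.arange(len(waveform)) / sample_rate
    trigger = amplitude * np.sin(2 * np.pi * freq_hz * t)
    return np.clip(waveform + trigger, -1.0, 1.0)

def poison_dataset(clips, labels, target_label, poison_fraction=0.01, seed=0):
    # Stamp the trigger onto a small fraction of clips and relabel them.
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(clips), max(1, int(poison_fraction * len(clips))),
                     replace=False)
    for i in idx:
        clips[i] = add_ultrasonic_trigger(clips[i])
        labels[i] = target_label
    return clips, labels
```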
- Noise-Response Analysis of Deep Neural Networks Quantifies Robustness and Fingerprints Structural Malware [48.7072217216104]
Deep neural networks (DNNs) can have 'structural malware' (i.e., compromised weights and activation pathways).
It is generally difficult to detect backdoors, and existing detection methods are computationally expensive and require extensive resources (e.g., access to the training data).
Here, we propose a rapid feature-generation technique that quantifies the robustness of a DNN, 'fingerprints' its nonlinearity, and allows us to detect backdoors (if present).
Our empirical results demonstrate that we can accurately detect backdoors with high confidence orders-of-magnitude faster than existing approaches (seconds versus ...).
arXiv Detail & Related papers (2020-07-31T23:52:58Z)
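One way to picture a noise-response feature, as a minimal sketch rather than the paper's actual feature set: probe the model with increasingly noisy copies of an input and record how often its prediction stays unchanged; the resulting curve is cheap to compute and can be compared across models to spot outliers. Here `predict` is an assumed callable returning class scores.

```python
import numpy as np

def noise_response_curve(predict, x, noise_levels=(0.0, 0.05, 0.1, 0.2, 0.4),
                         trials=20, seed=0):
    # For each noise level, measure how often the model's prediction on a
    # noisy copy of x still matches its clean prediction.
    rng = np.random.default_rng(seed)
    clean = np.argmax(predict(x))
    curve = []
    for sigma in noise_levels:
        agree = sum(np.argmax(predict(x + sigma * rng.standard_normal(x.shape)))
                    == clean for _ in range(trials))
        curve.append(agree / trials)
    return np.array(curve)   # compare curves across models to spot outliers
```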
- Defending against Backdoor Attack on Deep Neural Networks [98.45955746226106]
We study the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data.
Experiments show that our method effectively decreases the attack success rate while maintaining high classification accuracy on clean images.
arXiv Detail & Related papers (2020-02-26T02:03:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.