FooBaR: Fault Fooling Backdoor Attack on Neural Network Training
- URL: http://arxiv.org/abs/2109.11249v1
- Date: Thu, 23 Sep 2021 09:43:19 GMT
- Title: FooBaR: Fault Fooling Backdoor Attack on Neural Network Training
- Authors: Jakub Breier, Xiaolu Hou, Martín Ochoa and Jesus Solano
- Abstract summary: We explore a novel attack paradigm by injecting faults during the training phase of a neural network in a way that the resulting network can be attacked during deployment without the necessity of further faulting.
We call such attacks fooling backdoors as the fault attacks at the training phase inject backdoors into the network that allow an attacker to produce fooling inputs.
- Score: 5.639451539396458
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural network implementations are known to be vulnerable to physical attack
vectors such as fault injection attacks. Until now, these attacks have only been
utilized during the inference phase with the intention of causing a
misclassification. In this work, we explore a novel attack paradigm by
injecting faults during the training phase of a neural network in a way that
the resulting network can be attacked during deployment without the necessity
of further faulting. In particular, we discuss attacks against ReLU activation
functions that make it possible to generate a family of malicious inputs, which
are called fooling inputs, to be used at inference time to induce controlled
misclassifications. Such malicious inputs are obtained by mathematically
solving a system of linear equations that would cause a particular behaviour on
the attacked activation functions, similar to the one induced in training
through faulting. We call such attacks fooling backdoors as the fault attacks
at the training phase inject backdoors into the network that allow an attacker
to produce fooling inputs. We evaluate our approach against multi-layer
perceptron networks and convolutional networks on a popular image
classification task, obtaining high attack success rates (from 60% to 100%) and
high classification confidence when as few as 25 neurons are attacked, while
preserving high accuracy on the originally intended classification task.
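To make the inference-time step concrete, the following is a minimal NumPy sketch of the linear-system idea, not the authors' exact procedure: given the trained first-layer weights and the indices of the attacked ReLU neurons, a least-squares solve finds an input whose pre-activations reproduce the faulted behaviour. The network shapes, the target pre-activation value, and the random weights are illustrative assumptions, and constraints such as keeping pixels in a valid range are omitted.

```python
import numpy as np

def craft_fooling_input(W1, b1, attacked_idx, target_preact, x_base):
    """Solve a linear system so that the chosen first-layer neurons reach a
    desired pre-activation, starting from a benign base image."""
    W_s, b_s = W1[attacked_idx], b1[attacked_idx]
    # We want W_s @ (x_base + delta) + b_s == target_preact, i.e.
    # W_s @ delta == target_preact - (W_s @ x_base + b_s).
    residual = target_preact - (W_s @ x_base + b_s)
    # Minimum-norm solution of the (typically underdetermined) system.
    delta, *_ = np.linalg.lstsq(W_s, residual, rcond=None)
    return x_base + delta

# Toy usage: random weights stand in for the trained first layer of an MLP.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(128, 784)), rng.normal(size=128)
x_base = rng.uniform(0.0, 1.0, size=784)          # a benign image
attacked = np.arange(25)                          # e.g., 25 attacked neurons
x_fool = craft_fooling_input(W1, b1, attacked, np.full(25, 5.0), x_base)
print(np.max(np.abs(W1[attacked] @ x_fool + b1[attacked] - 5.0)))  # ~0
```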
Related papers
- Pre-trained Trojan Attacks for Visual Recognition [106.13792185398863]
Pre-trained vision models (PVMs) have become a dominant component due to their exceptional performance when fine-tuned for downstream tasks.
We propose the Pre-trained Trojan attack, which embeds backdoors into a PVM, enabling attacks across various downstream vision tasks.
We highlight the challenges posed by cross-task activation and shortcut connections in successful backdoor attacks.
arXiv Detail & Related papers (2023-12-23T05:51:40Z)
- Poisoning Network Flow Classifiers [10.055241826257083]
This paper focuses on poisoning attacks, specifically backdoor attacks, against network traffic flow classifiers.
We investigate the challenging scenario of clean-label poisoning where the adversary's capabilities are constrained to tampering only with the training data.
We describe a trigger crafting strategy that leverages model interpretability techniques to generate trigger patterns that are effective even at very low poisoning rates.
arXiv Detail & Related papers (2023-06-02T16:24:15Z)
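As a rough illustration of the entry above, the sketch below uses a plain gradient-saliency score as a stand-in for the paper's interpretability-based trigger crafting, and stamps the trigger onto a small fraction of target-class samples without touching their labels (clean-label poisoning). The surrogate model, feature count, and poisoning rate are hypothetical.

```python
import torch
import torch.nn as nn

def craft_saliency_trigger(surrogate, x_ref, target_class, k=4, trigger_value=1.0):
    """Pick the k input features the surrogate model is most sensitive to
    (a simple gradient-saliency stand-in for interpretability techniques)
    and return a trigger mask/value pair."""
    x = x_ref.clone().requires_grad_(True)
    score = surrogate(x.unsqueeze(0))[0, target_class]
    score.backward()
    top = torch.topk(x.grad.abs(), k).indices      # most influential features
    mask = torch.zeros_like(x_ref)
    mask[top] = 1.0
    return mask, trigger_value

def poison_clean_label(X, y, mask, value, target_class, rate=0.01, seed=0):
    """Clean-label poisoning: stamp the trigger onto a small fraction of
    samples that already belong to the target class, labels untouched."""
    g = torch.Generator().manual_seed(seed)
    idx = (y == target_class).nonzero(as_tuple=True)[0]
    n = max(1, int(rate * len(X)))
    chosen = idx[torch.randperm(len(idx), generator=g)[:n]]
    X = X.clone()
    X[chosen] = X[chosen] * (1 - mask) + value * mask
    return X, y

# Toy usage with a hypothetical flow-feature classifier (40 features, 2 classes).
surrogate = nn.Sequential(nn.Linear(40, 64), nn.ReLU(), nn.Linear(64, 2))
X, y = torch.rand(1000, 40), torch.randint(0, 2, (1000,))
mask, value = craft_saliency_trigger(surrogate, X[0], target_class=1)
X_poisoned, y_poisoned = poison_clean_label(X, y, mask, value, target_class=1)
```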
- An Incremental Gray-box Physical Adversarial Attack on Neural Network Training [36.244907785240876]
We propose a gradient-free, gray box, incremental attack that targets the training process of neural networks.
The proposed attack acquires its high-risk property from attacking data structures that are typically unobserved by professionals.
arXiv Detail & Related papers (2023-02-20T09:48:11Z)
- Few-shot Backdoor Attacks via Neural Tangent Kernels [31.85706783674533]
In a backdoor attack, an attacker injects corrupted examples into the training set.
Central to these attacks is the trade-off between the success rate of the attack and the number of corrupted training examples injected.
We use neural tangent kernels to approximate the training dynamics of the model being attacked and automatically learn strong poison examples.
arXiv Detail & Related papers (2022-10-12T05:30:00Z)
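The entry above pairs naturally with a small sketch of the idea: use an empirical neural tangent kernel as a kernel-regression surrogate for the trained model and optimise a handful of poison inputs through that surrogate. This is a simplified toy version under stated assumptions (tiny scalar-output network, labels in {-1, +1}, linearisation around the initial parameters), not the authors' implementation.

```python
import torch
import torch.nn as nn

def param_grad(model, x):
    """Flattened gradient of the scalar model output w.r.t. all parameters;
    create_graph keeps it differentiable w.r.t. the (poison) inputs."""
    out = model(x.unsqueeze(0)).squeeze()
    grads = torch.autograd.grad(out, list(model.parameters()), create_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

# Toy binary task with labels in {-1, +1}; the scalar output is the logit.
model = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 1))
X_clean, y_clean = torch.randn(20, 16), torch.sign(torch.randn(20))
x_target = torch.randn(16)                       # test point to push towards +1
poison = torch.randn(2, 16, requires_grad=True)  # the few injected examples
opt = torch.optim.Adam([poison], lr=0.05)

for step in range(100):
    X = torch.cat([X_clean, poison])
    y = torch.cat([y_clean, torch.ones(2)])      # poisons carry the target label
    J = torch.stack([param_grad(model, x) for x in X])   # (n, n_params)
    K = J @ J.T                                           # empirical NTK gram matrix
    k_t = J @ param_grad(model, x_target)                 # kernel row of the target
    # Kernel-regression surrogate for the network after training on (X, y):
    # f(x_target) ~= k_t^T (K + lambda*I)^{-1} y under the NTK approximation.
    alpha = torch.linalg.solve(K + 1e-3 * torch.eye(len(X)), y)
    pred = k_t @ alpha
    loss = (pred - 1.0) ** 2                     # drive the surrogate output to +1
    opt.zero_grad(); loss.backward(); opt.step()
```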
- An anomaly detection approach for backdoored neural networks: face recognition as a case study [77.92020418343022]
We propose a novel backdoored network detection method based on the principle of anomaly detection.
We test our method on a novel dataset of backdoored networks and report detectability results with perfect scores.
arXiv Detail & Related papers (2022-08-22T12:14:13Z)
- Downlink Power Allocation in Massive MIMO via Deep Learning: Adversarial Attacks and Training [62.77129284830945]
This paper considers a regression problem in a wireless setting and shows that adversarial attacks can break the DL-based approach.
We also analyze the effectiveness of adversarial training as a defensive technique in adversarial settings and show that the robustness of DL-based wireless system against attacks improves significantly.
arXiv Detail & Related papers (2022-06-14T04:55:11Z)
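For the entry above, a minimal sketch of adversarial training for a regression model is shown below, using an FGSM-style perturbation on an MSE loss; the network, data shapes, and hyperparameters are placeholders rather than the wireless models used in the paper.

```python
import torch
import torch.nn as nn

# Hypothetical regressor mapping channel features to downlink power allocations.
model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def fgsm(x, y, eps=0.05):
    """FGSM-style perturbation of a regression loss, one common way to
    generate the adversarial examples used during adversarial training."""
    x_adv = x.clone().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    return (x_adv + eps * grad.sign()).detach()

# Toy data standing in for (channel features, optimal downlink powers).
X, Y = torch.randn(512, 10), torch.rand(512, 4)

for epoch in range(5):
    for i in range(0, len(X), 64):
        xb, yb = X[i:i+64], Y[i:i+64]
        xb_adv = fgsm(xb, yb)
        # Train on a mix of clean and adversarial batches.
        loss = 0.5 * loss_fn(model(xb), yb) + 0.5 * loss_fn(model(xb_adv), yb)
        opt.zero_grad(); loss.backward(); opt.step()
```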
- The Feasibility and Inevitability of Stealth Attacks [63.14766152741211]
We study new adversarial perturbations that enable an attacker to gain control over decisions in generic Artificial Intelligence systems.
In contrast to adversarial data modification, the attack mechanism we consider here involves alterations to the AI system itself.
arXiv Detail & Related papers (2021-06-26T10:50:07Z)
- Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch [99.90716010490625]
Backdoor attackers tamper with training data to embed a vulnerability in models that are trained on that data.
This vulnerability is then activated at inference time by placing a "trigger" into the model's input.
We develop a new hidden trigger attack, Sleeper Agent, which employs gradient matching, data selection, and target model re-training during the crafting process.
arXiv Detail & Related papers (2021-06-16T17:09:55Z)
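A minimal sketch of the gradient-matching step from the entry above: perturb clean-labelled poison images so that their training gradient aligns with the gradient of the attacker's triggered objective. The surrogate model, trigger, and budgets are illustrative, and the paper's data selection and periodic re-training of the surrogate are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def flat_grad(loss, params):
    grads = torch.autograd.grad(loss, params, create_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

# Hypothetical surrogate classifier (flattened 28x28 images, 10 classes).
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
params = [p for p in model.parameters() if p.requires_grad]

x_poison = torch.rand(8, 784)                # base images to be subtly perturbed
y_poison = torch.randint(0, 10, (8,))        # their (unchanged, clean) labels
delta = torch.zeros_like(x_poison, requires_grad=True)
x_source = torch.rand(4, 784)                # images the attacker wants to hijack
trigger = (torch.rand(784) > 0.98).float()   # sparse patch-like trigger
y_target = torch.full((4,), 3, dtype=torch.long)  # class the trigger should force

opt = torch.optim.Adam([delta], lr=0.01)
for step in range(200):
    # Adversarial objective: triggered source images classified as y_target.
    adv_loss = F.cross_entropy(model(torch.clamp(x_source + trigger, 0, 1)), y_target)
    g_adv = flat_grad(adv_loss, params).detach()
    # Training loss the victim would see on the perturbed, clean-labelled poisons.
    train_loss = F.cross_entropy(model(torch.clamp(x_poison + delta, 0, 1)), y_poison)
    g_train = flat_grad(train_loss, params)
    # Gradient matching: make the poison gradient point like the adversarial one.
    loss = 1 - F.cosine_similarity(g_train, g_adv, dim=0)
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        delta.clamp_(-8/255, 8/255)          # keep the perturbation visually small
```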
- Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost on clean accuracy.
arXiv Detail & Related papers (2021-03-26T09:36:59Z)
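The anti-adversary idea from the entry above can be sketched as a test-time preprocessing step that nudges the input towards the model's own predicted class, i.e. down the loss rather than up it; the model and step sizes below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def anti_adversary(model, x, steps=2, alpha=0.01):
    """Prepended 'anti-adversary' step: move the input *towards* the model's
    own predicted class (the opposite direction of an adversarial attack)
    before the final classification."""
    with torch.no_grad():
        pseudo_label = model(x).argmax(dim=1)
    x_aa = x.clone()
    for _ in range(steps):
        x_aa = x_aa.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_aa), pseudo_label)
        grad, = torch.autograd.grad(loss, x_aa)
        x_aa = x_aa - alpha * grad.sign()    # descend, not ascend, the loss
    return x_aa.detach()

# Hypothetical classifier; at test time, classify the anti-adversarial input.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
x = torch.rand(16, 784)
logits = model(anti_adversary(model, x))
```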
- Backdoor Smoothing: Demystifying Backdoor Attacks on Deep Neural Networks [25.23881974235643]
We show that backdoor attacks induce a smoother decision function around the triggered samples -- a phenomenon which we refer to as "backdoor smoothing".
Our experiments show that smoothness increases when the trigger is added to the input samples, and that this phenomenon is more pronounced for more successful attacks.
arXiv Detail & Related papers (2020-06-11T18:28:54Z)
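A rough way to probe the smoothing effect described above is to compare how much the predicted-class probability fluctuates under small random perturbations with and without the trigger; the sketch below uses that simple proxy, with a hypothetical model and trigger.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def local_smoothness(model, x, sigma=0.1, n=64):
    """Rough smoothness proxy: average change of the predicted-class probability
    under small Gaussian perturbations of the input (lower = smoother)."""
    with torch.no_grad():
        p = F.softmax(model(x.unsqueeze(0)), dim=1)
        cls = p.argmax(dim=1)
        base = p[0, cls]
        noisy = x.unsqueeze(0) + sigma * torch.randn(n, *x.shape)
        p_noisy = F.softmax(model(noisy), dim=1)[:, cls].squeeze(1)
        return (p_noisy - base).abs().mean().item()

# Hypothetical (possibly backdoored) classifier and a candidate trigger patch.
model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
x = torch.rand(784)
trigger = torch.zeros(784); trigger[-16:] = 1.0
print("clean:    ", local_smoothness(model, x))
print("triggered:", local_smoothness(model, torch.clamp(x + trigger, 0, 1)))
```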
- Adversarial Feature Desensitization [12.401175943131268]
We propose a novel approach to adversarial robustness, which builds upon the insights from the domain adaptation field.
Our method, called Adversarial Feature Desensitization (AFD), aims at learning features that are invariant towards adversarial perturbations of the inputs.
arXiv Detail & Related papers (2020-06-08T14:20:02Z)
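For the last entry, the sketch below conveys the feature-desensitization idea in a simplified form: it combines adversarial training with a penalty that pulls adversarial features towards clean ones. The paper itself uses a domain-adversarial discriminator over features, which this stand-in replaces with a plain feature-matching term; the model and data are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical split of a classifier into a feature extractor and a head.
features = nn.Sequential(nn.Linear(784, 128), nn.ReLU())
head = nn.Linear(128, 10)
opt = torch.optim.Adam(list(features.parameters()) + list(head.parameters()), lr=1e-3)

def fgsm(x, y, eps=8/255):
    """Generate adversarial inputs against the current feature extractor + head."""
    x_adv = x.clone().requires_grad_(True)
    loss = F.cross_entropy(head(features(x_adv)), y)
    grad, = torch.autograd.grad(loss, x_adv)
    return torch.clamp(x_adv + eps * grad.sign(), 0, 1).detach()

X, Y = torch.rand(256, 784), torch.randint(0, 10, (256,))
for step in range(100):
    idx = torch.randint(0, len(X), (64,))
    xb, yb = X[idx], Y[idx]
    xb_adv = fgsm(xb, yb)
    f_clean, f_adv = features(xb), features(xb_adv)
    # Classification on clean and adversarial data, plus a penalty that pulls
    # adversarial features towards clean ones ("desensitized" representation).
    loss = (F.cross_entropy(head(f_clean), yb)
            + F.cross_entropy(head(f_adv), yb)
            + 0.1 * F.mse_loss(f_adv, f_clean.detach()))
    opt.zero_grad(); loss.backward(); opt.step()
```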