Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
- URL: http://arxiv.org/abs/2007.05084v1
- Date: Thu, 9 Jul 2020 21:50:54 GMT
- Title: Attack of the Tails: Yes, You Really Can Backdoor Federated Learning
- Authors: Hongyi Wang, Kartik Sreenivasan, Shashank Rajput, Harit Vishwakarma,
Saurabh Agarwal, Jy-yong Sohn, Kangwook Lee, Dimitris Papailiopoulos
- Abstract summary: Federated Learning (FL) lends itself to adversarial attacks in the form of backdoors during training.
An edge-case backdoor forces a model to misclassify seemingly easy inputs that are nevertheless unlikely to be part of the training or test data, i.e., they live on the tail of the input distribution.
We show how these edge-case backdoors can lead to unsavory failures and may have serious repercussions on fairness.
- Score: 21.06925263586183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to its decentralized nature, Federated Learning (FL) lends itself to
adversarial attacks in the form of backdoors during training. The goal of a
backdoor is to corrupt the performance of the trained model on specific
sub-tasks (e.g., by classifying green cars as frogs). A range of FL backdoor
attacks have been introduced in the literature, along with methods to defend
against them, and it is currently an open question whether FL systems can be
tailored to be robust against backdoors. In this work, we provide evidence to
the contrary. We first establish that, in the general case, robustness to
backdoors implies model robustness to adversarial examples, a major open
problem in itself. Furthermore, detecting the presence of a backdoor in an FL
model is unlikely to be feasible assuming only first-order oracles or
polynomial time. We couple
our theoretical results with a new family of backdoor attacks, which we refer
to as edge-case backdoors. An edge-case backdoor forces a model to misclassify
seemingly easy inputs that are nevertheless unlikely to be part of the training
or test data, i.e., they live on the tail of the input distribution. We explain
how these edge-case backdoors can lead to unsavory failures and may have
serious repercussions on fairness, and exhibit that, with careful tuning on the
adversary's side, one can insert them across a range of machine learning
tasks (e.g., image classification, OCR, text prediction, sentiment analysis).
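To make the attack surface concrete, the sketch below is a minimal illustration (not the authors' code) of how a single malicious FL client could mount an edge-case backdoor: it mixes a few tail-of-the-distribution inputs, relabeled to an attacker-chosen class, into its otherwise benign local data before computing the update it sends to the server. All names (poison_local_update, target_label, etc.) are hypothetical.

```python
# Minimal, illustrative sketch of edge-case data poisoning by one FL client.
# Assumptions: a PyTorch classifier, tensors of tail ("edge-case") inputs, and
# plain FedAvg-style local training; none of these names come from the paper.
import copy
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def poison_local_update(global_model: nn.Module,
                        clean_x: torch.Tensor, clean_y: torch.Tensor,
                        edge_x: torch.Tensor, target_label: int,
                        lr: float = 0.01, epochs: int = 2) -> dict:
    """Return the weights a malicious client would send back to the server."""
    # Relabel the edge-case (tail) examples to the attacker's target class.
    edge_y = torch.full((edge_x.shape[0],), target_label, dtype=torch.long)

    # Mix the poisoned tail examples with benign local data so the update
    # still looks useful on the main task.
    x = torch.cat([clean_x, edge_x])
    y = torch.cat([clean_y, edge_y])
    loader = DataLoader(TensorDataset(x, y), batch_size=32, shuffle=True)

    # Start from the current global model, exactly as an honest client would.
    model = copy.deepcopy(global_model)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()

    model.train()
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()

    # The server aggregates this state_dict like any other client update.
    return model.state_dict()
```

The intuition behind targeting tail inputs is that honest clients rarely hold such examples, so the poisoned update can look benign on the main task while still flipping the model's behavior on the edge cases.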
Related papers
- Flatness-aware Sequential Learning Generates Resilient Backdoors [7.969181278996343]
Backdoor attacks have recently emerged as a threat to the security of machine learning models.
This paper counters catastrophic forgetting (CF) of backdoors by leveraging continual learning (CL) techniques.
We propose a novel framework, named Sequential Backdoor Learning (SBL), that can generate resilient backdoors.
arXiv Detail & Related papers (2024-07-20T03:30:05Z)
- BaDExpert: Extracting Backdoor Functionality for Accurate Backdoor Input Detection [42.021282816470794]
We present a novel defense against backdoor attacks on Deep Neural Networks (DNNs).
Our defense falls within the category of post-development defenses that operate independently of how the model was generated.
We show the feasibility of devising highly accurate backdoor input detectors that filter out the backdoor inputs during model inference.
arXiv Detail & Related papers (2023-08-23T21:47:06Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- BackdoorBox: A Python Toolbox for Backdoor Learning [67.53987387581222]
This Python toolbox implements representative and advanced backdoor attacks and defenses.
It allows researchers and developers to easily implement and compare different methods on benchmark or their local datasets.
arXiv Detail & Related papers (2023-02-01T09:45:42Z)
- Neurotoxin: Durable Backdoors in Federated Learning [73.82725064553827]
Federated learning systems have an inherent vulnerability to adversarial backdoor attacks during training.
We propose Neurotoxin, a simple one-line modification to existing backdoor attacks that works by attacking parameters that change less in magnitude during training (a hedged sketch of this masking idea follows the related-papers list).
arXiv Detail & Related papers (2022-06-12T16:52:52Z)
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defenses that succeed against frequency-based backdoor attacks, as well as possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
- Turn the Combination Lock: Learnable Textual Backdoor Attacks via Word Substitution [57.51117978504175]
Recent studies show that neural natural language processing (NLP) models are vulnerable to backdoor attacks.
Injected with backdoors, models perform normally on benign examples but produce attacker-specified predictions when the backdoor is activated.
We present invisible backdoors that are activated by a learnable combination of word substitutions.
arXiv Detail & Related papers (2021-06-11T13:03:17Z)
- Backdoor Learning: A Survey [75.59571756777342]
Backdoor attacks aim to embed a hidden backdoor into deep neural networks (DNNs).
Backdoor learning is an emerging and rapidly growing research area.
This paper presents the first comprehensive survey of this realm.
arXiv Detail & Related papers (2020-07-17T04:09:20Z)
- Defending against Backdoors in Federated Learning with Robust Learning Rate [25.74681620689152]
Federated learning (FL) allows a set of agents to collaboratively train a model without sharing their potentially sensitive data.
In a backdoor attack, an adversary tries to embed a backdoor functionality into the model during training that can later be activated to cause a desired misclassification.
We propose a lightweight defense that requires minimal change to the FL protocol.
arXiv Detail & Related papers (2020-07-07T23:38:35Z)
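As referenced in the Neurotoxin entry above, the following is a minimal, hedged sketch of one way the described masking idea could look in code, assuming the attacker can observe a recent global update and hides its malicious update in the coordinates that benign training changes least. The function name neurotoxin_style_mask and the top_frac parameter are illustrative, not the authors' API.

```python
# Hedged sketch, not the Neurotoxin authors' code: restrict a malicious update
# to parameters that benign training changes least, approximated here by
# excluding the top-`top_frac` fraction of coordinates of an observed update.
import torch


def neurotoxin_style_mask(malicious_update: torch.Tensor,
                          observed_global_update: torch.Tensor,
                          top_frac: float = 0.1) -> torch.Tensor:
    """Zero the malicious update on the most heavily updated coordinates."""
    k = max(1, int(top_frac * observed_global_update.numel()))
    # Coordinates that honest training moves the most (by magnitude).
    top_idx = torch.topk(observed_global_update.abs().flatten(), k).indices
    mask = torch.ones(malicious_update.numel(),
                      dtype=malicious_update.dtype,
                      device=malicious_update.device)
    mask[top_idx] = 0.0  # avoid the frequently updated coordinates
    # Hiding in rarely updated parameters makes it less likely that subsequent
    # benign rounds overwrite (and thus erase) the backdoor.
    return malicious_update * mask.view_as(malicious_update)
```

A fuller implementation would apply this masking throughout the attacker's local training rather than only to the final update.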