Detection and Mitigation of Byzantine Attacks in Distributed Training
- URL: http://arxiv.org/abs/2208.08085v4
- Date: Sat, 13 May 2023 18:08:08 GMT
- Title: Detection and Mitigation of Byzantine Attacks in Distributed Training
- Authors: Konstantinos Konstantinidis, Namrata Vaswani, and Aditya Ramamoorthy
- Abstract summary: Abnormal Byzantine behavior of the worker nodes can derail the training and compromise the quality of the inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients.
In this work, we consider attack models ranging from strong ones ($q$ omniscient adversaries with full knowledge of the defense protocol, which can change from iteration to iteration) to weak ones ($q$ randomly chosen adversaries with limited collusion abilities).
- Score: 24.951227624475443
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A plethora of modern machine learning tasks require the utilization of
large-scale distributed clusters as a critical component of the training
pipeline. However, abnormal Byzantine behavior of the worker nodes can derail
the training and compromise the quality of the inference. Such behavior can be
attributed to unintentional system malfunctions or orchestrated attacks; as a
result, some nodes may return arbitrary results to the parameter server (PS)
that coordinates the training. Recent work considers a wide range of attack
models and has explored robust aggregation and/or computational redundancy to
correct the distorted gradients.
In this work, we consider attack models ranging from strong ones ($q$
omniscient adversaries with full knowledge of the defense protocol, which can
change from iteration to iteration) to weak ones ($q$ randomly chosen
adversaries with limited collusion abilities, which only change every few
iterations). Our algorithms rely on redundant task assignments coupled
with detection of adversarial behavior. We also show the convergence of our
method to the optimal point under common assumptions and settings considered in
the literature. For strong attacks, we demonstrate a reduction in the fraction
of distorted gradients ranging from 16% to 99% as compared to the prior
state-of-the-art. Our top-1 classification accuracy results on the CIFAR-10
data set demonstrate a 25% advantage in accuracy (averaged over strong and weak
scenarios) under the most sophisticated attacks compared to state-of-the-art
methods.
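To make the redundancy-plus-detection idea concrete, below is a minimal sketch of how a parameter server could replicate gradient tasks across workers and use majority voting over the replicas to filter distorted gradients and flag suspect workers. This is an illustration only, not the paper's actual algorithm or API: the function names (assign_tasks, worker_gradient, detect_and_aggregate), the round-robin placement, and the constants are assumptions made for the example.

```python
import numpy as np

# Minimal sketch (illustrative only): redundant task assignment plus
# majority-vote detection at a simulated parameter server (PS).
# REDUNDANCY, NUM_WORKERS, NUM_TASKS, Q_BYZANTINE are hypothetical constants.

REDUNDANCY = 3      # each gradient task is replicated on 3 workers
NUM_WORKERS = 9
NUM_TASKS = 3
Q_BYZANTINE = 1     # number of adversarial (Byzantine) workers

rng = np.random.default_rng(0)


def assign_tasks(num_tasks, num_workers, redundancy):
    """Replicate each task on `redundancy` distinct workers (round-robin)."""
    return {t: [(t * redundancy + r) % num_workers for r in range(redundancy)]
            for t in range(num_tasks)}


def worker_gradient(task, worker, byzantine_workers, dim=4):
    """Honest workers return the true gradient of their task; Byzantine
    workers return an arbitrary distorted vector."""
    true_grad = np.full(dim, float(task + 1))   # stand-in for a real gradient
    if worker in byzantine_workers:
        return -10.0 * true_grad                # distorted result
    return true_grad


def detect_and_aggregate(assignment, byzantine_workers):
    """Majority-vote over the replicas of each task; flag any worker whose
    submission disagrees with the majority value for that task."""
    suspects, clean_grads = set(), []
    for task, workers in assignment.items():
        replicas = {w: worker_gradient(task, w, byzantine_workers) for w in workers}
        values = list(replicas.values())
        # count, for each replica, how many replicas agree with it
        agreement = [sum(np.allclose(v, u) for u in values) for v in values]
        majority = values[int(np.argmax(agreement))]
        clean_grads.append(majority)
        suspects |= {w for w, v in replicas.items() if not np.allclose(v, majority)}
    return np.mean(clean_grads, axis=0), suspects


assignment = assign_tasks(NUM_TASKS, NUM_WORKERS, REDUNDANCY)
byzantine = set(rng.choice(NUM_WORKERS, size=Q_BYZANTINE, replace=False).tolist())
gradient, flagged = detect_and_aggregate(assignment, byzantine)
print("flagged workers:", flagged)
print("aggregated gradient:", gradient)
```

The related systems listed below (see the Aspis and ByzShield entries) use more structured placements, such as subset-based assignments and bipartite expander graphs, so that detection guarantees hold even against colluding omniscient adversaries; the naive round-robin placement above is only for readability.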
Related papers
- Purify Unlearnable Examples via Rate-Constrained Variational Autoencoders [101.42201747763178]
Unlearnable examples (UEs) seek to maximize testing error by making subtle modifications to training examples that are correctly labeled.
Our work provides a novel disentanglement mechanism to build an efficient pre-training purification method.
arXiv Detail & Related papers (2024-05-02T16:49:25Z)
- Wasserstein distributional robustness of neural networks [9.79503506460041]
Deep neural networks are known to be vulnerable to adversarial attacks (AA)
For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified.
We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions.
arXiv Detail & Related papers (2023-06-16T13:41:24Z)
- Distributed Adversarial Training to Robustify Deep Neural Networks at Scale [100.19539096465101]
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification.
To defend against such attacks, an effective approach known as adversarial training (AT) has been shown to improve model robustness.
We propose a large-batch adversarial training framework implemented over multiple machines.
arXiv Detail & Related papers (2022-06-13T15:39:43Z)
- Fast Adversarial Training with Adaptive Step Size [62.37203478589929]
We study the phenomenon from the perspective of training instances.
We propose a simple but effective method, Adversarial Training with Adaptive Step size (ATAS)
ATAS learns an instance-wise adaptive step size that is inversely proportional to the instance's gradient norm.
arXiv Detail & Related papers (2022-06-06T08:20:07Z)
- Aspis: A Robust Detection System for Distributed Learning [13.90938823562779]
Machine learning systems can be compromised when some of the computing devices exhibit abnormal (Byzantine) behavior.
Our proposed method Aspis assigns gradient computations to worker nodes using a subset-based assignment.
We prove the Byzantine resilience and detection guarantees of Aspis under weak and strong attacks and extensively evaluate the system on various large-scale training scenarios.
arXiv Detail & Related papers (2021-08-05T07:24:38Z)
- Adaptive Feature Alignment for Adversarial Training [56.17654691470554]
CNNs are typically vulnerable to adversarial attacks, which pose a threat to security-sensitive applications.
We propose the adaptive feature alignment (AFA) to generate features of arbitrary attacking strengths.
Our method is trained to automatically align features of arbitrary attacking strength.
arXiv Detail & Related papers (2021-05-31T17:01:05Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- ByzShield: An Efficient and Robust System for Distributed Training [12.741811850885309]
In this work we consider an omniscient attack model where the adversary has full knowledge about the gradient assignments of the workers.
Our redundancy-based method ByzShield leverages the properties of bipartite expander graphs for the assignment of tasks to workers.
Our experiments on training followed by image classification on the CIFAR-10 dataset show that ByzShield has on average a 20% advantage in accuracy under the most sophisticated attacks.
arXiv Detail & Related papers (2020-10-10T04:41:53Z)
- Adversarial Detection and Correction by Matching Prediction Distributions [0.0]
The detector almost completely neutralises powerful attacks like Carlini-Wagner or SLIDE on MNIST and Fashion-MNIST.
We show that our method is still able to detect the adversarial examples in the case of a white-box attack where the attacker has full knowledge of both the model and the defence.
arXiv Detail & Related papers (2020-02-21T15:45:42Z)