Federated Variance-Reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks
- URL: http://arxiv.org/abs/1912.12716v2
- Date: Wed, 3 Feb 2021 07:34:10 GMT
- Title: Federated Variance-Reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks
- Authors: Zhaoxian Wu, Qing Ling, Tianyi Chen, and Georgios B. Giannakis
- Abstract summary: This paper deals with distributed finite-sum optimization for learning over networks in the presence of malicious Byzantine attacks.
To cope with such attacks, most resilient approaches so far combine stochastic gradient descent (SGD) with different robust aggregation rules.
The present work puts forth a Byzantine attack resilient distributed (Byrd-) SAGA approach for learning tasks involving finite-sum optimization over networks.
- Score: 74.36161581953658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper deals with distributed finite-sum optimization for learning over
networks in the presence of malicious Byzantine attacks. To cope with such
attacks, most resilient approaches so far combine stochastic gradient descent
(SGD) with different robust aggregation rules. However, the sizeable
SGD-induced stochastic gradient noise makes it challenging to distinguish
malicious messages sent by the Byzantine attackers from noisy stochastic
gradients sent by the 'honest' workers. This motivates us to reduce the
variance of stochastic gradients as a means of robustifying SGD in the presence
of Byzantine attacks. To this end, the present work puts forth a Byzantine
attack resilient distributed (Byrd-) SAGA approach for learning tasks involving
finite-sum optimization over networks. Rather than the mean employed by
distributed SAGA, the novel Byrd-SAGA relies on the geometric median to
aggregate the corrected stochastic gradients sent by the workers. When less
than half of the workers are Byzantine attackers, the robustness of geometric
median to outliers enables Byrd-SAGA to attain provably linear convergence to a
neighborhood of the optimal solution, with the asymptotic learning error
determined by the number of Byzantine workers. Numerical tests corroborate the
robustness to various Byzantine attacks, as well as the merits of Byrd-SAGA
over Byzantine attack resilient distributed SGD.
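To make the two ingredients above concrete, here is a minimal sketch of one Byrd-SAGA-style round in Python: honest workers send SAGA-corrected stochastic gradients, and the server aggregates them with the geometric median (computed via Weiszfeld's algorithm) rather than the mean. The Worker class, the least-squares loss, and the attack model are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def geometric_median(points, n_iters=100, tol=1e-8):
        # Weiszfeld's algorithm: an iteratively reweighted average that
        # converges to the geometric median of the rows of `points`.
        z = points.mean(axis=0)  # start from the arithmetic mean
        for _ in range(n_iters):
            dists = np.maximum(np.linalg.norm(points - z, axis=1), tol)
            w = 1.0 / dists  # points far from z receive small weights
            z_new = (w[:, None] * points).sum(axis=0) / w.sum()
            if np.linalg.norm(z_new - z) < tol:
                break
            z = z_new
        return z

    class Worker:
        # Hypothetical honest worker holding local least-squares samples,
        # f_i(x) = 0.5 * (a_i . x - b_i)^2, plus a SAGA gradient table.
        def __init__(self, A, b):
            self.A, self.b = A, b
            self.table = np.zeros_like(A)  # last gradient seen per sample

        def corrected_gradient(self, x, rng):
            i = rng.integers(len(self.b))                      # pick a local index
            g = (self.A[i] @ x - self.b[i]) * self.A[i]        # fresh stochastic gradient
            msg = g - self.table[i] + self.table.mean(axis=0)  # SAGA variance correction
            self.table[i] = g                                  # refresh the table entry
            return msg

    def byrd_saga_round(x, workers, byzantine_ids, lr, rng):
        msgs = []
        for m, w in enumerate(workers):
            if m in byzantine_ids:
                # Arbitrary malicious message (one possible attack model)
                msgs.append(rng.normal(scale=100.0, size=x.shape))
            else:
                msgs.append(w.corrected_gradient(x, rng))
        # Geometric median instead of the mean: fewer-than-half attackers
        # cannot drag the aggregate arbitrarily far from the honest gradients.
        return x - lr * geometric_median(np.stack(msgs))

Iterating byrd_saga_round reproduces the qualitative behavior described in the abstract: with fewer than half of the workers Byzantine, the iterates approach a neighborhood of the optimum whose size grows with the number of attackers.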
Related papers
- Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient [0.0]
We propose a new method, Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient (BR-LSVRG).
We derive non-asymptotic convergence guarantees for the new method in the strongly convex case.
arXiv Detail & Related papers (2023-03-08T13:20:49Z)
- Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting [58.91947205027892]
Federated learning has exhibited vulnerabilities to Byzantine attacks.
Byzantine attackers can send arbitrary gradients to a central server to destroy the convergence and performance of the global model.
A wealth of robust AGgregation Rules (AGRs) has been proposed to defend against Byzantine attacks.
arXiv Detail & Related papers (2023-02-13T03:31:50Z)
- A Robust Classification Framework for Byzantine-Resilient Stochastic Gradient Descent [3.5450828190071655]
This paper proposes a Robust Gradient Classification Framework (RGCF) for Byzantine fault tolerance in distributed gradient descent.
RGCF is not dependent on the number of workers; it can scale up to training instances with a large number of workers without a loss in performance.
arXiv Detail & Related papers (2023-01-16T10:40:09Z)
- Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
An $\alpha$-fraction of the agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
arXiv Detail & Related papers (2022-06-01T00:44:53Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Stochastic Alternating Direction Method of Multipliers for Byzantine-Robust Distributed Learning [22.835940007753376]
We propose a Byzantine-robust alternating direction method of multipliers (ADMM) that fully utilizes the separable problem structure.
Theoretically, we prove that the proposed method converges to a bounded neighborhood of the optimal solution at a rate of O(1/k) under mild assumptions.
Numerical experiments on the MNIST and COVERTYPE datasets demonstrate the effectiveness of the proposed method against various Byzantine attacks.
arXiv Detail & Related papers (2021-06-13T01:17:31Z)
- BROADCAST: Reducing Both Stochastic and Compression Noise to Robustify Communication-Efficient Federated Learning [24.016538592246377]
Communication between the workers and the master node to collect local gradients is a key bottleneck in large-scale learning systems.
In this work, we investigate the problem of Byzantine-robust federated learning with compression, where the attacks from Byzantine workers can be arbitrarily malicious.
In light of this observation, we propose to jointly reduce the stochastic noise and the compression noise so as to improve Byzantine-robustness.
arXiv Detail & Related papers (2021-04-14T08:16:03Z)
- Learning from History for Byzantine Robust Optimization [52.68913869776858]
Byzantine robustness has received significant attention recently given its importance for distributed learning.
We show that most existing robust aggregation rules may not converge even in the absence of any Byzantine attackers.
arXiv Detail & Related papers (2020-12-18T16:22:32Z)
- Byzantine-Robust Decentralized Stochastic Optimization over Static and Time-Varying Networks [25.15075119957447]
We consider the Byzantine-robust optimization problem defined over decentralized static and time-varying networks.
Some of the agents are unreliable due to data corruptions, equipment failures or cyber-attacks.
Our key idea to handle the Byzantine attacks is to formulate a total variation (TV) norm-penalized approximation of the Byzantine-free problem; a hedged sketch of this formulation appears after this list.
We prove that the proposed method reaches a neighborhood of the Byzantine-free optimal solution, and the size of the neighborhood is determined by the number of Byzantine agents and the network topology.
arXiv Detail & Related papers (2020-05-12T04:18:39Z)
- BERT-ATTACK: Adversarial Attack Against BERT Using BERT [77.82947768158132]
Adversarial attacks on discrete data (such as text) are more challenging than those on continuous data (such as images).
We propose BERT-Attack, a high-quality and effective method to generate adversarial samples.
Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage.
arXiv Detail & Related papers (2020-04-21T13:30:02Z)
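As a worked illustration of the TV-penalized idea from the entry on Byzantine-robust decentralized optimization above, the Byzantine-free consensus problem is replaced by an approximation of the following form; the exact weighting and variable coupling are assumptions here, reconstructed from the entry's description rather than taken from that paper:

    \min_{\{x_m\}} \; \sum_{m \in \mathcal{R}} f_m(x_m)
    \; + \; \lambda \sum_{m \in \mathcal{R}} \sum_{n \in \mathcal{N}_m} \left\| x_m - x_n \right\|_1

where \mathcal{R} denotes the set of regular (non-Byzantine) agents, \mathcal{N}_m the neighbors of agent m, and \lambda > 0 the TV-penalty weight. The \ell_1 coupling pulls regular neighbors toward consensus while bounding the influence any single Byzantine neighbor can exert through its variable.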
This list is automatically generated from the titles and abstracts of the papers on this site.