Federated Variance-Reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks
- URL: http://arxiv.org/abs/1912.12716v2
- Date: Wed, 3 Feb 2021 07:34:10 GMT
- Title: Federated Variance-Reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks
- Authors: Zhaoxian Wu, Qing Ling, Tianyi Chen, and Georgios B. Giannakis
- Abstract summary: This paper deals with distributed finite-sum optimization for learning over networks in the presence of malicious Byzantine attacks.
To cope with such attacks, most resilient approaches so far combine stochastic gradient descent (SGD) with different robust aggregation rules.
The present work puts forth a Byzantine attack resilient distributed (Byrd-) SAGA approach for learning tasks involving finite-sum optimization over networks.
- Score: 74.36161581953658
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper deals with distributed finite-sum optimization for learning over
networks in the presence of malicious Byzantine attacks. To cope with such
attacks, most resilient approaches so far combine stochastic gradient descent
(SGD) with different robust aggregation rules. However, the sizeable
SGD-induced stochastic gradient noise makes it challenging to distinguish
malicious messages sent by the Byzantine attackers from noisy stochastic
gradients sent by the 'honest' workers. This motivates us to reduce the
variance of stochastic gradients as a means of robustifying SGD in the presence
of Byzantine attacks. To this end, the present work puts forth a Byzantine
attack resilient distributed (Byrd-) SAGA approach for learning tasks involving
finite-sum optimization over networks. Rather than the mean employed by
distributed SAGA, the novel Byrd-SAGA relies on the geometric median to
aggregate the corrected stochastic gradients sent by the workers. When less
than half of the workers are Byzantine attackers, the robustness of geometric
median to outliers enables Byrd-SAGA to attain provably linear convergence to a
neighborhood of the optimal solution, with the asymptotic learning error
determined by the number of Byzantine workers. Numerical tests corroborate the
robustness to various Byzantine attacks, as well as the merits of Byrd-SAGA
over Byzantine attack resilient distributed SGD.
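To make the two ingredients above concrete, here is a minimal sketch of one Byrd-SAGA-style round in Python: honest workers send SAGA-corrected stochastic gradients, and the server aggregates them with the geometric median (computed via Weiszfeld's algorithm) rather than the mean. The Worker class, the least-squares loss, and the attack model are illustrative assumptions, not the authors' implementation.

    import numpy as np

    def geometric_median(points, n_iters=100, tol=1e-8):
        # Weiszfeld's algorithm: an iteratively reweighted average that
        # converges to the geometric median of the rows of `points`.
        z = points.mean(axis=0)  # start from the arithmetic mean
        for _ in range(n_iters):
            dists = np.maximum(np.linalg.norm(points - z, axis=1), tol)
            w = 1.0 / dists  # points far from z receive small weights
            z_new = (w[:, None] * points).sum(axis=0) / w.sum()
            if np.linalg.norm(z_new - z) < tol:
                break
            z = z_new
        return z

    class Worker:
        # Hypothetical honest worker holding local least-squares samples,
        # f_i(x) = 0.5 * (a_i . x - b_i)^2, plus a SAGA gradient table.
        def __init__(self, A, b):
            self.A, self.b = A, b
            self.table = np.zeros_like(A)  # last gradient seen per sample

        def corrected_gradient(self, x, rng):
            i = rng.integers(len(self.b))                      # pick a local index
            g = (self.A[i] @ x - self.b[i]) * self.A[i]        # fresh stochastic gradient
            msg = g - self.table[i] + self.table.mean(axis=0)  # SAGA variance correction
            self.table[i] = g                                  # refresh the table entry
            return msg

    def byrd_saga_round(x, workers, byzantine_ids, lr, rng):
        msgs = []
        for m, w in enumerate(workers):
            if m in byzantine_ids:
                # Arbitrary malicious message (one possible attack model)
                msgs.append(rng.normal(scale=100.0, size=x.shape))
            else:
                msgs.append(w.corrected_gradient(x, rng))
        # Geometric median instead of the mean: fewer-than-half attackers
        # cannot drag the aggregate arbitrarily far from the honest gradients.
        return x - lr * geometric_median(np.stack(msgs))

Iterating byrd_saga_round reproduces the qualitative behavior described in the abstract: with fewer than half of the workers Byzantine, the iterates approach a neighborhood of the optimum whose size grows with the number of attackers.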
Related papers
- Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient [0.0]
We propose a new method, Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient (BR-LSVRG).
We derive non-asymptotic convergence guarantees for the new method in the strongly convex case.
arXiv Detail & Related papers (2023-03-08T13:20:49Z)
- Byzantine-Robust Learning on Heterogeneous Data via Gradient Splitting [58.91947205027892]
Federated learning has exhibited vulnerabilities to Byzantine attacks.
Byzantine attackers can send arbitrary gradients to a central server to destroy the convergence and performance of the global model.
A wealth of robust AGgregation Rules (AGRs) has been proposed to defend against Byzantine attacks.
arXiv Detail & Related papers (2023-02-13T03:31:50Z)
- A Robust Classification Framework for Byzantine-Resilient Stochastic Gradient Descent [3.5450828190071655]
This paper proposes a Robust Gradient Classification Framework (RGCF) for Byzantine fault tolerance in distributed gradient descent.
RGCF is not dependent on the number of workers; it can scale up to training instances with a large number of workers without a loss in performance.
arXiv Detail & Related papers (2023-01-16T10:40:09Z)
- Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
An $\alpha$-fraction of the agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
arXiv Detail & Related papers (2022-06-01T00:44:53Z)
- Discriminator-Free Generative Adversarial Attack [87.71852388383242]
Generative-based adversarial attacks can get rid of this limitation.
A Symmetric Saliency-based Auto-Encoder (SSAE) generates the perturbations.
The adversarial examples generated by SSAE not only make the widely-used models collapse, but also achieve good visual quality.
arXiv Detail & Related papers (2021-07-20T01:55:21Z)
- Stochastic Alternating Direction Method of Multipliers for Byzantine-Robust Distributed Learning [22.835940007753376]
We propose a Byzantine-robust alternating direction method of multipliers (ADMM) that fully utilizes the separable problem structure.
Theoretically, we prove that the proposed method converges to a bounded neighborhood of the optimal solution at a rate of O(1/k) under mild assumptions.
Numerical experiments on the MNIST and COVERTYPE datasets demonstrate the effectiveness of the proposed method against various Byzantine attacks.
arXiv Detail & Related papers (2021-06-13T01:17:31Z)
- BROADCAST: Reducing Both Stochastic and Compression Noise to Robustify Communication-Efficient Federated Learning [24.016538592246377]
Communication between the workers and the master node to collect local gradients is a key bottleneck in large-scale learning systems.
In this work, we investigate the problem of Byzantine-robust federated learning with compression, where the attacks from Byzantine workers can be arbitrarily malicious.
In light of this observation, we propose to jointly reduce the stochastic noise and the compression noise so as to improve Byzantine-robustness.
arXiv Detail & Related papers (2021-04-14T08:16:03Z)
- Learning from History for Byzantine Robust Optimization [52.68913869776858]
Byzantine robustness has received significant attention recently given its importance for distributed learning.
We show that most existing robust aggregation rules may not converge even in the absence of any Byzantine attackers.
arXiv Detail & Related papers (2020-12-18T16:22:32Z)
- Byzantine-Robust Decentralized Stochastic Optimization over Static and Time-Varying Networks [25.15075119957447]
We consider the Byzantine-robust optimization problem defined over decentralized static and time-varying networks.
Some of the agents are unreliable due to data corruptions, equipment failures or cyber-attacks.
Our key idea to handle the Byzantine attacks is to formulate a total variation (TV) norm-penalized approximation of the Byzantine-free problem; a hedged sketch of this formulation appears after this list.
We prove that the proposed method reaches a neighborhood of the Byzantine-free optimal solution, and the size of the neighborhood is determined by the number of Byzantine agents and the network topology.
arXiv Detail & Related papers (2020-05-12T04:18:39Z)
- BERT-ATTACK: Adversarial Attack Against BERT Using BERT [77.82947768158132]
Adversarial attacks on discrete data (such as text) are more challenging than those on continuous data (such as images).
We propose BERT-Attack, a high-quality and effective method to generate adversarial samples.
Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage.
arXiv Detail & Related papers (2020-04-21T13:30:02Z)
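As a worked illustration of the TV-penalized idea from the entry on Byzantine-robust decentralized optimization above, the Byzantine-free consensus problem is replaced by an approximation of the following form; the exact weighting and variable coupling are assumptions here, reconstructed from the entry's description rather than taken from that paper:

    \min_{\{x_m\}} \; \sum_{m \in \mathcal{R}} f_m(x_m)
    \; + \; \lambda \sum_{m \in \mathcal{R}} \sum_{n \in \mathcal{N}_m} \left\| x_m - x_n \right\|_1

where \mathcal{R} denotes the set of regular (non-Byzantine) agents, \mathcal{N}_m the neighbors of agent m, and \lambda > 0 the TV-penalty weight. The \ell_1 coupling pulls regular neighbors toward consensus while bounding the influence any single Byzantine neighbor can exert through its variable.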
This list is automatically generated from the titles and abstracts of the papers on this site.