Byzantine-Robust Decentralized Stochastic Optimization over Static and
Time-Varying Networks
- URL: http://arxiv.org/abs/2005.06276v2
- Date: Fri, 18 Dec 2020 04:41:36 GMT
- Title: Byzantine-Robust Decentralized Stochastic Optimization over Static and
Time-Varying Networks
- Authors: Jie Peng, Weiyu Li, Qing Ling
- Abstract summary: We consider the Byzantine-robust optimization problem defined over decentralized static and time-varying networks.
Some of the agents are unreliable due to data corruptions, equipment failures or cyber-attacks.
Our key idea to handle the Byzantine attacks is to formulate a total variation (TV) norm-penalized approximation of the Byzantine-free problem.
We prove that the proposed method reaches a neighborhood of the Byzantine-free optimal solution, and the size of the neighborhood is determined by the number of Byzantine agents and the network topology.
- Score: 25.15075119957447
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we consider the Byzantine-robust stochastic optimization
problem defined over decentralized static and time-varying networks, where the
agents collaboratively minimize the summation of expectations of stochastic
local cost functions, but some of the agents are unreliable due to data
corruptions, equipment failures or cyber-attacks. The unreliable agents, which
are hereafter called Byzantine agents, can send faulty values to their
neighbors and bias the optimization process. Our key idea to handle the
Byzantine attacks is to formulate a total variation (TV) norm-penalized
approximation of the Byzantine-free problem, where the penalty term forces the
local models of regular agents to be close, but also allows the existence of
outliers from the Byzantine agents. A stochastic subgradient method is applied
to solve the penalized problem. We prove that the proposed method reaches a
neighborhood of the Byzantine-free optimal solution, and the size of the
neighborhood is determined by the number of Byzantine agents and the network
topology. Numerical experiments corroborate the theoretical analysis, as well
as demonstrate the robustness of the proposed method to Byzantine attacks and
its superior performance compared to existing methods.
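As a rough, self-contained illustration of the idea described in the abstract, the sketch below runs a TV-penalized decentralized stochastic subgradient update on a toy ring network, assuming the per-agent update x_i <- x_i - alpha * (g_i(x_i) + lambda * sum_{j in N(i)} sign(x_i - x_j)), where g_i is a stochastic gradient of agent i's local loss and sign() is an elementwise subgradient of the l1 (TV) penalty. The quadratic losses, topology, attack model, and hyperparameters are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch (not the authors' code) of a TV-norm-penalized decentralized
# stochastic subgradient method with one Byzantine agent on a ring graph.
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim = 6, 5
# undirected ring: each agent communicates with its two neighbors
neighbors = {i: [(i - 1) % n_agents, (i + 1) % n_agents] for i in range(n_agents)}
byzantine = {2}                                  # index of the Byzantine agent (assumption)
targets = rng.normal(size=(n_agents, dim))       # each agent's toy local optimum
x = np.zeros((n_agents, dim))                    # local models, one row per agent
alpha, lam = 0.05, 0.1                           # step size and TV-penalty weight (toy values)

def stochastic_grad(i, xi):
    """Noisy gradient of the toy local loss 0.5 * ||xi - targets[i]||^2."""
    return (xi - targets[i]) + 0.1 * rng.normal(size=dim)

for k in range(200):
    messages = x.copy()
    for b in byzantine:                          # Byzantine agents transmit faulty values
        messages[b] = 10.0 * rng.normal(size=dim)
    new_x = x.copy()
    for i in range(n_agents):
        if i in byzantine:
            continue                             # only regular agents follow the protocol
        # subgradient of lam * sum_j ||x_i - x_j||_1 is lam * sum_j sign(x_i - x_j)
        tv_subgrad = sum(np.sign(x[i] - messages[j]) for j in neighbors[i])
        new_x[i] = x[i] - alpha * (stochastic_grad(i, x[i]) + lam * tv_subgrad)
    x = new_x

regular = [i for i in range(n_agents) if i not in byzantine]
print("spread of regular agents' models:", np.max(np.std(x[regular], axis=0)))
```

In this toy run the regular agents' models remain close to one another despite the faulty messages, which is consistent in spirit with the neighborhood result stated in the abstract; the penalty weight lambda trades off consensus among regular agents against sensitivity to outlying neighbors.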
Related papers
- Byzantine Robustness and Partial Participation Can Be Achieved at Once: Just Clip Gradient Differences [61.74021364776313]
Distributed learning has emerged as a leading paradigm for training large machine learning models.
In real-world scenarios, participants may be unreliable or malicious, posing a significant challenge to the integrity and accuracy of the trained models.
We propose the first distributed method with client sampling and provable tolerance to Byzantine workers.
arXiv Detail & Related papers (2023-11-23T17:50:30Z) - Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient [0.0]
We propose a new method, Byzantine-Robust Loopless Stochastic Variance-Reduced Gradient (BR-LSVRG).
We derive non-asymptotic convergence guarantees for the new method in the strongly convex case.
arXiv Detail & Related papers (2023-03-08T13:20:49Z) - Byzantine-Robust Online and Offline Distributed Reinforcement Learning [60.970950468309056]
We consider a distributed reinforcement learning setting where multiple agents explore the environment and communicate their experiences through a central server.
An $\alpha$-fraction of agents are adversarial and can report arbitrary fake information.
We seek to identify a near-optimal policy for the underlying Markov decision process in the presence of these adversarial agents.
arXiv Detail & Related papers (2022-06-01T00:44:53Z) - Robust Distributed Optimization With Randomly Corrupted Gradients [24.253191879453784]
We propose a first-order distributed optimization algorithm that is provably robust to Byzantine failures, that is, arbitrary and potentially adversarial behavior.
Our algorithm achieves order-optimal statistical error and convergence rates.
arXiv Detail & Related papers (2021-06-28T19:45:25Z) - Stochastic Alternating Direction Method of Multipliers for
Byzantine-Robust Distributed Learning [22.835940007753376]
We propose a Byzantine-robust alternating direction method of multipliers (ADMM) that fully utilizes the separable problem structure.
Theoretically, we prove that the proposed method converges to a bounded neighborhood of the optimal solution at a rate of O(1/k) under mild assumptions.
Numerical experiments on the MNIST and COVERTYPE datasets demonstrate the effectiveness of the proposed method against various Byzantine attacks.
arXiv Detail & Related papers (2021-06-13T01:17:31Z) - Escaping Saddle Points in Distributed Newton's Method with Communication
efficiency and Byzantine Resilience [49.379254729853756]
We consider the problem of optimizing a non-convex loss function (with saddle points) in a distributed framework in the presence of Byzantine machines.
We robustify the cubic-regularized Newton algorithm such that it avoids the saddle points and the fake local minima efficiently.
We obtain theoretical guarantees for our proposed scheme under several approximate settings, including sub-sampled gradients and Hessians.
arXiv Detail & Related papers (2021-03-17T03:53:58Z) - Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient distributed optimization, in which machines can independently compute gradients and cooperate to jointly optimize over their local objective functions.
Our algorithm is based on a new concentration technique, and its sample complexity matches the best known bounds in the setting where no Byzantine machines are present.
It is very practical: it improves upon the performance of all prior methods.
arXiv Detail & Related papers (2020-12-28T17:19:32Z) - Learning from History for Byzantine Robust Optimization [52.68913869776858]
Byzantine robustness has received significant attention recently given its importance for distributed learning.
We show that most existing robust aggregation rules may not converge even in the absence of any Byzantine attackers.
arXiv Detail & Related papers (2020-12-18T16:22:32Z) - Federated Variance-Reduced Stochastic Gradient Descent with Robustness
to Byzantine Attacks [74.36161581953658]
This paper deals with distributed finite-sum optimization for learning over networks in the presence of malicious Byzantine attacks.
To cope with such attacks, most resilient approaches so far combine stochastic gradient descent (SGD) with different robust aggregation rules (see the aggregation sketch after this list).
The present work puts forth a Byzantine attack resilient distributed (Byrd-) SAGA approach for learning tasks involving finite-sum optimization over networks.
arXiv Detail & Related papers (2019-12-29T19:46:03Z)
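Several entries above (e.g., the Byrd-SAGA and Learning-from-History papers) refer to combining SGD with robust aggregation rules. The sketch below shows two generic such rules, coordinate-wise median and coordinate-wise trimmed mean, applied to a batch of worker gradients; these are standard examples chosen for illustration only and are not the specific aggregators analyzed in any of the listed papers.

```python
# Minimal sketch of robust gradient aggregation at a server, with a few
# Byzantine workers submitting faulty gradients.
import numpy as np

def coordinate_median(grads):
    """Aggregate worker gradients by taking the median of each coordinate."""
    return np.median(np.stack(grads), axis=0)

def trimmed_mean(grads, trim=1):
    """Drop the `trim` largest and smallest values per coordinate, then average."""
    g = np.sort(np.stack(grads), axis=0)
    return g[trim:len(grads) - trim].mean(axis=0)

rng = np.random.default_rng(1)
honest = [np.ones(4) + 0.05 * rng.normal(size=4) for _ in range(8)]
byzantine = [100.0 * np.ones(4) for _ in range(2)]   # faulty gradients (toy attack)
grads = honest + byzantine

print("mean   :", np.stack(grads).mean(axis=0))      # badly biased by the attackers
print("median :", coordinate_median(grads))          # stays near the honest value
print("trimmed:", trimmed_mean(grads, trim=2))       # also robust in this toy case
```

With two out of ten gradients corrupted, the plain mean is pulled far from the honest value, while both robust rules remain close to it; this is the kind of behavior the aggregation-based methods above build on.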