Learning from History for Byzantine Robust Optimization
- URL: http://arxiv.org/abs/2012.10333v1
- Date: Fri, 18 Dec 2020 16:22:32 GMT
- Title: Learning from History for Byzantine Robust Optimization
- Authors: Sai Praneeth Karimireddy, Lie He, Martin Jaggi
- Abstract summary: Byzantine robustness has received significant attention recently given its importance for distributed learning.
We show that most existing robust aggregation rules may not converge even in the absence of any Byzantine attackers.
- Score: 52.68913869776858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Byzantine robustness has received significant attention recently given its
importance for distributed and federated learning. In spite of this, we
identify severe flaws in existing algorithms even when the data across the
participants is assumed to be identical. First, we show that most existing
robust aggregation rules may not converge even in the absence of any Byzantine
attackers, because they are overly sensitive to the distribution of the noise
in the stochastic gradients. Secondly, we show that even if the aggregation
rules may succeed in limiting the influence of the attackers in a single round,
the attackers can couple their attacks across time eventually leading to
divergence. To address these issues, we present two surprisingly simple
strategies: a new iterative clipping procedure, and incorporating worker
momentum to overcome time-coupled attacks. This is the first provably robust
method for the standard stochastic non-convex optimization setting.
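The abstract names the two fixes but not their exact form. As a hedged illustration, the sketch below shows one minimal reading: an iterative (centered) clipping aggregator that re-centers on the previous round's estimate, plus an exponential-moving-average worker momentum. The clipping radius `tau`, momentum `beta`, and iteration count are illustrative choices, not values from the paper.

```python
import numpy as np

def clip(v, tau):
    """Scale v so its norm is at most tau (identity if already smaller)."""
    norm = np.linalg.norm(v)
    return v if norm <= tau else v * (tau / norm)

def iterative_clipping(updates, v_prev, tau=1.0, iters=3):
    """Aggregate worker updates by repeatedly averaging residuals clipped
    around the current estimate. Starting from the previous round's
    aggregate keeps honest updates nearly unclipped while bounding each
    attacker's pull to at most tau per iteration."""
    v = v_prev
    for _ in range(iters):
        v = v + np.mean([clip(u - v, tau) for u in updates], axis=0)
    return v

def worker_momentum(m_prev, grad, beta=0.9):
    """Each worker sends a moving average of its gradients rather than the
    raw gradient, damping time-coupled attacks: any single round's
    contribution is down-weighted by (1 - beta)."""
    return beta * m_prev + (1.0 - beta) * grad
```

In this reading, `updates` passed to the aggregator would be the workers' momentum vectors rather than raw gradients, which is how the two strategies compose.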
Related papers
- FedRISE: Rating Induced Sign Election of Gradients for Byzantine Tolerant Federated Aggregation [5.011091042850546]
We develop a robust aggregator called FedRISE for cross-silo FL that is consistent and less susceptible to poisoning updates by an omniscient attacker.
We compare our method against 8 robust aggregators under 6 poisoning attacks on 3 datasets and architectures.
Our results show that existing robust aggregators collapse for at least some attacks under severe settings, while FedRISE demonstrates better robustness because of a stringent gradient inclusion formulation.
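The summary names only the mechanism "sign election of gradients". As a hypothetical stand-in (not FedRISE's actual rule; its rating-induced weighting is not described here), a coordinate-wise majority sign vote looks like this:

```python
import numpy as np

def sign_election(gradients):
    """Keep, per coordinate, only the update direction a majority of
    workers agrees on; scale by a robust (median) magnitude. A generic
    sketch of 'sign election', not FedRISE's rating-weighted rule."""
    G = np.stack(gradients)                     # shape: (n_workers, dim)
    elected = np.sign(np.sign(G).sum(axis=0))   # majority sign per coordinate
    magnitude = np.median(np.abs(G), axis=0)    # robust per-coordinate scale
    return elected * magnitude
```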
arXiv Detail & Related papers (2024-11-06T12:14:11Z)
- Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering [17.446431849022346]
Distributed learning has become the standard approach for training large-scale machine learning models across private data silos.
It faces critical challenges related to Byzantine robustness and communication efficiency.
We propose a novel Byzantine-robust and communication-efficient distributed learning method.
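Only the title-level ingredients (momentum, compression, filtering) are given here. Purely as an assumed composition of those ingredients, and not the paper's algorithm, one server round might look like:

```python
import numpy as np

def topk_compress(v, k):
    """Keep the k largest-magnitude coordinates, zero the rest (a common
    communication-saving compressor; the paper's choice may differ)."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def filtered_mean(updates, keep_frac=0.8):
    """Drop the updates farthest from the coordinate-wise median and
    average the rest; a generic stand-in for the paper's filter."""
    U = np.stack(updates)
    dists = np.linalg.norm(U - np.median(U, axis=0), axis=1)
    keep = np.argsort(dists)[: max(1, int(keep_frac * len(updates)))]
    return U[keep].mean(axis=0)
```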
arXiv Detail & Related papers (2024-09-13T08:53:10Z)
- Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
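The entry frames the attack as a sequential decision-making process over multi-granular perturbations. The paper learns an attack policy; the deliberately generic greedy loop below (with hypothetical user-supplied `score_fn` and `perturb_fns`) only illustrates that sequential structure:

```python
def greedy_multigranular_attack(doc, score_fn, perturb_fns, steps=5):
    """At each step, propose one candidate edit per granularity
    (e.g. word-, phrase-, sentence-level) and keep the edit that most
    raises the black-box ranking score; stop when nothing helps."""
    best, best_score = doc, score_fn(doc)
    for _ in range(steps):
        cands = [fn(best) for fn in perturb_fns.values()]
        top = max(cands, key=score_fn)
        if score_fn(top) <= best_score:
            break  # no granularity improves the score
        best, best_score = top, score_fn(top)
    return best
```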
arXiv Detail & Related papers (2024-04-02T02:08:29Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly robust, instance-reweighted adversarial training framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
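One reason a KL-regularized objective is attractive here: maximizing the weighted loss minus a KL penalty to the uniform distribution, over the probability simplex, has a closed-form softmax solution, so the weights are cheap to compute. Whether the paper uses exactly this objective is an assumption read off the summary; the sketch computes those closed-form weights.

```python
import numpy as np

def kl_reweight(losses, lam=1.0):
    """Solve max_w sum_i w_i * l_i - lam * KL(w || uniform) over the simplex.
    The optimum is a softmax with temperature lam: harder (higher-loss)
    instances get larger weights; lam -> infinity recovers uniform weights."""
    z = np.asarray(losses, dtype=float) / lam
    z -= z.max()               # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()
```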
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- Robust Distributed Learning Against Both Distributional Shifts and Byzantine Attacks [29.34471516011148]
In distributed learning systems, issues may arise from two sources.
On one hand, due to distributional shifts between training data and test data, the model could exhibit poor out-of-sample performance.
On the other hand, a portion of the worker nodes might be subject to Byzantine attacks, which could invalidate the model.
arXiv Detail & Related papers (2022-10-29T20:08:07Z)
- Robust Distributed Optimization With Randomly Corrupted Gradients [24.253191879453784]
We propose a first-order distributed optimization algorithm that is provably robust to Byzantine failures, i.e., arbitrary and potentially adversarial behavior.
Our algorithm achieves order-optimal statistical error and convergence rates.
arXiv Detail & Related papers (2021-06-28T19:45:25Z)
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient distributed stochastic optimization, in which machines can independently compute gradients and cooperate to jointly optimize.
Our algorithm is based on a new concentration technique, and its sample complexity matches the best known bounds for the setting in which no Byzantine machines are present.
It is also very practical: it improves upon the performance of all prior methods.
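In robust estimation, a "concentration technique" typically means filtering samples until their empirical covariance concentrates. The sketch below shows the classic spectral filter from robust mean estimation as one plausible reading, not the paper's exact algorithm; `sigma2` (an assumed per-direction variance bound) and the factor 4 are illustrative.

```python
import numpy as np

def concentration_filter(grads, sigma2, max_rounds=10):
    """While the empirical covariance of the gradients has an abnormally
    large top eigenvalue, remove the point projecting farthest along the
    offending direction, then return the mean of the survivors."""
    G = np.stack(grads)
    for _ in range(max_rounds):
        if len(G) <= 1:
            break
        mu = G.mean(axis=0)
        cov = (G - mu).T @ (G - mu) / len(G)
        eigvals, eigvecs = np.linalg.eigh(cov)
        if eigvals[-1] <= 4 * sigma2:    # already concentrated: stop
            break
        v = eigvecs[:, -1]               # worst (top-variance) direction
        scores = ((G - mu) @ v) ** 2
        G = np.delete(G, np.argmax(scores), axis=0)
    return G.mean(axis=0)
```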
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
- Hidden Cost of Randomized Smoothing [72.93630656906599]
In this paper, we point out the side effects of current randomized smoothing.
Specifically, we articulate and prove two major points: 1) the decision boundaries of smoothed classifiers will shrink, resulting in disparity in class-wise accuracy; 2) applying noise augmentation in the training process does not necessarily resolve the shrinking issue due to the inconsistent learning objectives.
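For context on the object being analyzed: randomized smoothing replaces a base classifier f with the majority vote g(x) = argmax_c P[f(x + eps) = c], eps ~ N(0, sigma^2 I). A minimal Monte-Carlo sketch follows; the base classifier, sigma, and sample count are placeholders.

```python
import numpy as np

def smoothed_predict(base_classify, x, sigma=0.25, n_samples=1000,
                     n_classes=10, seed=0):
    """Estimate g(x) by voting over Gaussian perturbations of x. The
    paper's first point is that the decision regions of this voted
    classifier shrink relative to the base classifier's, which hurts
    some classes more than others."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(n_classes, dtype=int)
    for _ in range(n_samples):
        votes[base_classify(x + rng.normal(0.0, sigma, size=x.shape))] += 1
    return int(votes.argmax())
```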
arXiv Detail & Related papers (2020-03-02T23:37:42Z)
- Temporal Sparse Adversarial Attack on Sequence-based Gait Recognition [56.844587127848854]
We demonstrate that the state-of-the-art gait recognition model is vulnerable to such attacks.
We employ a generative adversarial network based architecture to semantically generate adversarial high-quality gait silhouettes or video frames.
The experimental results show that if only one-fortieth of the frames are attacked, the accuracy of the target model drops dramatically.
arXiv Detail & Related papers (2020-02-22T10:08:42Z)