Dynamic Byzantine-Robust Learning: Adapting to Switching Byzantine Workers
- URL: http://arxiv.org/abs/2402.02951v2
- Date: Sun, 16 Jun 2024 05:08:35 GMT
- Title: Dynamic Byzantine-Robust Learning: Adapting to Switching Byzantine Workers
- Authors: Ron Dorfman, Naseem Yehya, Kfir Y. Levy
- Abstract summary: Byzantine-robust learning has emerged as a prominent fault-tolerant distributed machine learning framework.
We propose DynaBRO -- a new method capable of withstanding any sub-linear number of identity changes across rounds.
Our method applies a multi-level Monte Carlo (MLMC) gradient estimation technique at the server to the robustly aggregated worker updates.
- Score: 10.632248569865236
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Byzantine-robust learning has emerged as a prominent fault-tolerant distributed machine learning framework. However, most techniques focus on the static setting, wherein the identity of Byzantine workers remains unchanged throughout the learning process. This assumption fails to capture real-world dynamic Byzantine behaviors, which may include intermittent malfunctions or targeted, time-limited attacks. Addressing this limitation, we propose DynaBRO -- a new method capable of withstanding any sub-linear number of identity changes across rounds. Specifically, when the number of such changes is $\mathcal{O}(\sqrt{T})$ (where $T$ is the total number of training rounds), DynaBRO nearly matches the state-of-the-art asymptotic convergence rate of the static setting. Our method utilizes a multi-level Monte Carlo (MLMC) gradient estimation technique applied at the server to the robustly aggregated worker updates. By additionally leveraging an adaptive learning rate, we circumvent the need for prior knowledge of the fraction of Byzantine workers.
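To make the server-side recipe concrete, below is a minimal, hedged sketch of an MLMC gradient estimator built on top of a robust aggregation rule (coordinate-wise median here). The callback name, the geometric level distribution, and the choice of aggregator are illustrative assumptions, not the paper's exact construction; DynaBRO's actual estimator and adaptive step size follow the derivations in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def coordinate_wise_median(updates):
    """Robust aggregation: coordinate-wise median of the workers' gradient vectors.

    A standard Byzantine-robust rule used here for illustration; the paper's
    aggregator may differ.
    """
    return np.median(np.stack(updates), axis=0)


def mlmc_gradient(sample_aggregated_gradient, max_level=6):
    """Generic multi-level Monte Carlo (MLMC) gradient estimator.

    `sample_aggregated_gradient()` is a hypothetical callback returning one
    robustly aggregated gradient (e.g. the median over fresh worker gradients).
    A level J is drawn with P(J = j) ~ 2^{-j} (truncated at `max_level`), and
    the estimator returns g_0 + 2^J * (mean of 2^J samples - mean of the first
    2^{J-1} of them), the standard MLMC correction structure.
    """
    j = int(min(rng.geometric(p=0.5), max_level))
    samples = [sample_aggregated_gradient() for _ in range(2 ** j)]
    g_fine = np.mean(samples, axis=0)                     # average of 2^j aggregations
    g_coarse = np.mean(samples[: 2 ** (j - 1)], axis=0)   # first half only
    return samples[0] + (2 ** j) * (g_fine - g_coarse)


# Toy round: 10 honest workers near the all-ones true gradient, 2 Byzantine workers.
def one_aggregated_gradient(dim=5, n_honest=10, n_byz=2):
    honest = [np.ones(dim) + 0.1 * rng.standard_normal(dim) for _ in range(n_honest)]
    byzantine = [100.0 * rng.standard_normal(dim) for _ in range(n_byz)]
    return coordinate_wise_median(honest + byzantine)


print(mlmc_gradient(one_aggregated_gradient))  # close to the all-ones true gradient
```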
Related papers
- Byzantine-Robust and Communication-Efficient Distributed Learning via Compressed Momentum Filtering [17.446431849022346]
Distributed learning has become the standard approach for training large-scale machine learning models across private data silos.
It faces critical challenges related to Byzantine robustness and communication efficiency.
We propose a novel Byzantine-robust and communication-efficient distributed learning method.
arXiv Detail & Related papers (2024-09-13T08:53:10Z) - Byzantine Robustness and Partial Participation Can Be Achieved at Once: Just Clip Gradient Differences [61.74021364776313]
Distributed learning has emerged as a leading paradigm for training large machine learning models.
In real-world scenarios, participants may be unreliable or malicious, posing a significant challenge to the integrity and accuracy of the trained models.
We propose the first distributed method with client sampling and provable tolerance to Byzantine workers.
arXiv Detail & Related papers (2023-11-23T17:50:30Z) - Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe to convert static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
arXiv Detail & Related papers (2023-10-18T06:07:10Z) - Byzantine-Resilient Learning Beyond Gradients: Distributing Evolutionary Search [6.461473289206789]
We show that gradient-free ML algorithms can be combined with classical distributed consensus algorithms to generate gradient-free Byzantine-resilient distributed learning algorithms.
We provide proofs and pseudo-code for two specific cases - the Total Order Broadcast and proof-of-work leader election.
arXiv Detail & Related papers (2023-04-20T17:13:29Z) - A Robust Classification Framework for Byzantine-Resilient Stochastic Gradient Descent [3.5450828190071655]
This paper proposes a Robust Gradient Classification Framework (RGCF) for Byzantine fault tolerance in distributed gradient descent.
RGCF is not dependent on the number of workers; it can scale up to training instances with a large number of workers without a loss in performance.
arXiv Detail & Related papers (2023-01-16T10:40:09Z) - Few-Shot Class-Incremental Learning by Sampling Multi-Phase Tasks [59.12108527904171]
A model should recognize new classes and maintain discriminability over old classes.
The task of recognizing new classes from few examples without forgetting old classes is called few-shot class-incremental learning (FSCIL).
We propose a new paradigm for FSCIL based on meta-learning by LearnIng Multi-phase Incremental Tasks (LIMIT).
arXiv Detail & Related papers (2022-03-31T13:46:41Z) - Stochastic Alternating Direction Method of Multipliers for Byzantine-Robust Distributed Learning [22.835940007753376]
We propose a Byzantine-robust alternating direction method of multipliers (ADMM) that fully utilizes the separable problem structure.
Theoretically, we prove that the proposed method converges to a bounded neighborhood of the optimal solution at a rate of O(1/k) under mild assumptions, where k is the number of iterations.
Numerical experiments on the MNIST and COVERTYPE datasets demonstrate the effectiveness of the proposed method against various Byzantine attacks.
arXiv Detail & Related papers (2021-06-13T01:17:31Z) - Byzantine-Robust Variance-Reduced Federated Learning over Distributed Non-i.i.d. Data [36.99547890386817]
We consider the federated learning problem where data on workers are not independent and identically distributed (i.i.d.).
An unknown number of Byzantine workers may send malicious messages to the central node, leading to significant learning error.
Most Byzantine-robust methods address this issue by applying robust aggregation rules to the received messages; a minimal example of such a rule is sketched after this list.
arXiv Detail & Related papers (2020-09-17T09:09:23Z) - Byzantine-resilient Decentralized Stochastic Gradient Descent [85.15773446094576]
We present an in-depth study towards the Byzantine resilience of decentralized learning systems.
We propose UBAR, a novel algorithm to enhance decentralized learning with Byzantine Fault Tolerance.
arXiv Detail & Related papers (2020-02-20T05:11:04Z) - A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning [48.87397222244402]
We propose an expansion-based approach for task-free continual learning.
Our model successfully performs task-free continual learning for both discriminative and generative tasks.
arXiv Detail & Related papers (2020-01-03T02:07:31Z) - Federated Variance-Reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks [74.36161581953658]
This paper deals with distributed finite-sum optimization for learning over networks in the presence of malicious Byzantine attacks.
To cope with such attacks, most resilient approaches so far combine stochastic gradient descent (SGD) with different robust aggregation rules.
The present work puts forth a Byzantine attack resilient distributed (Byrd-) SAGA approach for learning tasks involving finite-sum optimization over networks.
arXiv Detail & Related papers (2019-12-29T19:46:03Z)
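Several of the methods listed above (e.g., the variance-reduced SGD and ADMM approaches) rely on a server-side robust aggregation rule. As a point of reference, here is a minimal sketch of one classic rule, the coordinate-wise trimmed mean; the trimming fraction and the toy worker setup are illustrative assumptions rather than parameters taken from any of the listed papers.

```python
import numpy as np


def trimmed_mean(updates, trim_fraction=0.2):
    """Coordinate-wise trimmed mean of worker updates.

    For each coordinate, drop the `trim_fraction` largest and smallest values
    across workers and average the rest, limiting the influence of outliers
    sent by Byzantine workers.
    """
    stacked = np.sort(np.stack(updates), axis=0)      # sort every coordinate across workers
    num_workers = stacked.shape[0]
    k = int(np.floor(trim_fraction * num_workers))    # values trimmed per side
    return stacked[k: num_workers - k].mean(axis=0)


# Toy example: 8 honest workers near the all-ones gradient, 2 Byzantine outliers.
rng = np.random.default_rng(1)
honest = [np.ones(4) + 0.05 * rng.standard_normal(4) for _ in range(8)]
byzantine = [1000.0 * np.ones(4), -1000.0 * np.ones(4)]
print(trimmed_mean(honest + byzantine, trim_fraction=0.2))  # close to [1, 1, 1, 1]
```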