Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing
- URL: http://arxiv.org/abs/2006.09365v6
- Date: Wed, 22 Nov 2023 09:08:15 GMT
- Title: Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing
- Authors: Sai Praneeth Karimireddy, Lie He, Martin Jaggi
- Abstract summary: In Byzantine-robust distributed learning, a central server wants to train a machine learning model over data distributed across multiple workers.
A fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages.
We propose a simple bucketing scheme that adapts existing robust algorithms to heterogeneous datasets at a negligible computational cost.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Byzantine-robust distributed or federated learning, a central server wants
to train a machine learning model over data distributed across multiple
workers. However, a fraction of these workers may deviate from the prescribed
algorithm and send arbitrary messages. While this problem has received
significant attention recently, most current defenses assume that the workers
hold identical data. For realistic cases where the data across workers are
heterogeneous (non-iid), we design new attacks that circumvent current
defenses, leading to significant loss of performance. We then propose a simple
bucketing scheme that adapts existing robust algorithms to heterogeneous
datasets at negligible computational cost. We validate our approach both
theoretically and experimentally, showing that combining bucketing with
existing robust algorithms is effective against challenging attacks. Our work
is the first to establish guaranteed convergence for the non-iid
Byzantine-robust problem under realistic assumptions.
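The bucketing step itself is only a few lines. Below is a minimal sketch in Python/NumPy of the scheme the abstract describes: randomly permute the n worker updates, average them within buckets of size s, and pass the resulting bucket means to an existing robust aggregator. Coordinate-wise median is used here as a stand-in aggregator, and all function names are illustrative, not taken from the authors' code.
```python
import numpy as np

def bucketing(updates, s, rng=None):
    """Randomly group the n worker updates into buckets of size s and
    return the per-bucket averages (the bucketing step)."""
    if rng is None:
        rng = np.random.default_rng()
    perm = rng.permutation(len(updates))
    buckets = [perm[i:i + s] for i in range(0, len(updates), s)]
    return np.stack([updates[b].mean(axis=0) for b in buckets])

def coordinate_wise_median(vectors):
    """A standard robust aggregator, used here as a stand-in; the paper
    also pairs bucketing with other aggregators such as Krum."""
    return np.median(vectors, axis=0)

def robust_aggregate(updates, s=2, rng=None):
    """Bucketing followed by robust aggregation: the bucket means are
    more homogeneous than the raw non-iid updates, which is what lets
    existing (iid-assuming) aggregators work on heterogeneous data."""
    return coordinate_wise_median(bucketing(np.asarray(updates), s, rng))

# Toy usage: 8 honest workers with heterogeneous gradients, 2 Byzantine.
rng = np.random.default_rng(0)
honest = np.stack([rng.normal(loc=i % 4, scale=1.0, size=5) for i in range(8)])
byzantine = np.full((2, 5), 100.0)  # arbitrary malicious vectors
print(robust_aggregate(np.vstack([honest, byzantine]), s=2, rng=rng))
```
With 2 Byzantine workers among 10 and s=2, at most 2 of the 5 bucket means are contaminated, so the median still lands among honest values; the overhead beyond plain aggregation is one permutation and n/s averages, consistent with the abstract's "negligible computational cost" claim.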
Related papers
- A Linearly Convergent GAN Inversion-based Algorithm for Reverse Engineering of Deceptions (arXiv, 2023-06-07)
  We propose a novel framework for reverse engineering of deceptions that assumes the clean data lies in the range of a GAN. For the first time in the literature, we provide deterministic linear convergence guarantees for this problem.
- Towards Robust Dataset Learning (arXiv, 2022-11-19)
  We propose a principled tri-level optimization to formulate the robust dataset learning problem. Under an abstraction model that characterizes robust vs. non-robust features, the proposed method provably learns a robust dataset.
- Robust Distributed Learning Against Both Distributional Shifts and Byzantine Attacks (arXiv, 2022-10-29)
  In distributed learning systems, issues may arise from two sources. On one hand, distributional shifts between training and test data can lead to poor out-of-sample performance. On the other hand, a fraction of the worker nodes may be subject to Byzantine attacks that corrupt the model.
- Securing Federated Learning against Overwhelming Collusive Attackers (arXiv, 2022-09-28)
  We propose two graph-theoretic algorithms, based on the Minimum Spanning Tree and the k-densest graph, that leverage correlations between local models (a toy sketch of the MST idea appears after this list). Our FL model can nullify the influence of attackers even when they constitute up to 70% of all clients. We establish the superiority of our algorithms over existing ones in terms of accuracy, attack success rate, and early detection round.
- Secure Distributed Training at Scale (arXiv, 2021-06-21)
  Training in the presence of potentially malicious peers requires specialized distributed training algorithms with Byzantine tolerance. We propose a novel protocol for secure (Byzantine-tolerant) decentralized training that emphasizes communication efficiency.
- Online Adversarial Attacks (arXiv, 2021-03-02)
  We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use cases. We first rigorously analyze a deterministic variant of the online threat model, then propose a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ than the current best single-threshold algorithm.
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data (arXiv, 2020-07-07)
  The distributionally robust optimization framework is considered for training a parametric model. The objective is to endow the trained model with robustness against adversarially manipulated input data. The proposed algorithms offer robustness with little overhead.
- Faster Secure Data Mining via Distributed Homomorphic Encryption (arXiv, 2020-06-17)
  Homomorphic Encryption (HE) has recently received increasing attention for its ability to perform computations over encrypted data. We propose a novel general distributed HE-based data mining framework as one step toward solving the scaling problem. We verify its efficiency and effectiveness on various data mining algorithms and benchmark datasets.
This list is automatically generated from the titles and abstracts of the papers on this site.