Byzantine-Resilient Learning Beyond Gradients: Distributing Evolutionary
Search
- URL: http://arxiv.org/abs/2304.13540v1
- Date: Thu, 20 Apr 2023 17:13:29 GMT
- Title: Byzantine-Resilient Learning Beyond Gradients: Distributing Evolutionary
Search
- Authors: Andrei Kucharavy, Matteo Monti, Rachid Guerraoui and Ljiljana Dolamic
- Abstract summary: We show that gradient-free ML algorithms can be combined with classical distributed consensus algorithms to generate gradient-free byzantine-resilient distributed learning algorithms.
We provide proofs and pseudo-code for two specific cases - the Total Order Broadcast and proof-of-work leader election.
- Score: 6.461473289206789
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern machine learning (ML) models are capable of impressive performance.
However, their prowess is due not only to improvements in their
architecture and training algorithms but also to a drastic increase in
computational power used to train them.
This increase led to a growing interest in distributed ML, which in
turn made worker failures and adversarial attacks an increasingly pressing
concern. While distributed byzantine-resilient algorithms have been proposed in
a differentiable setting, none exist in a gradient-free setting.
The goal of this work is to address this shortcoming. To that end, we introduce
a more general definition of byzantine-resilience in ML - model-consensus - which
extends the classical definition of distributed consensus. We then leverage this
definition to show that a general
class of gradient-free ML algorithms - (1,λ)-Evolutionary Search - can
be combined with classical distributed consensus algorithms to generate
gradient-free byzantine-resilient distributed learning algorithms. We provide
proofs and pseudo-code for two specific cases - the Total Order Broadcast and
proof-of-work leader election.
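To make the idea concrete, here is a minimal single-process sketch of (1,λ)-Evolutionary Search in which the next parent model is fixed by a simulated total order broadcast, so every honest worker would adopt the same model (the model-consensus property). The toy fitness function, the mutation scale, and the total_order_broadcast stub are illustrative assumptions, not the paper's actual protocol or pseudo-code.

```python
# Minimal sketch: (1, lambda)-Evolutionary Search whose selection step is routed
# through a simulated total order broadcast, so all honest workers would pick
# the same next parent (model-consensus). Purely illustrative, not the paper's
# pseudo-code: the fitness, mutation scale, and broadcast stub are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def fitness(theta):
    # Toy objective: maximize -||theta - 1||^2 (optimum at the all-ones vector).
    return -np.sum((theta - 1.0) ** 2)

def total_order_broadcast(candidates):
    # Stand-in for a real total order broadcast: every honest worker is assumed
    # to receive the same candidates in the same order, so a deterministic rule
    # (best fitness, ties broken by delivery index) yields the same choice for all.
    scored = [(fitness(c), -i, c) for i, c in enumerate(candidates)]
    return max(scored)[2]

def one_lambda_es(dim=5, lam=8, sigma=0.1, generations=200):
    theta = np.zeros(dim)  # current parent model
    for _ in range(generations):
        # Each of the lam offspring is a Gaussian mutation of the current parent.
        offspring = [theta + sigma * rng.standard_normal(dim) for _ in range(lam)]
        theta = total_order_broadcast(offspring)  # agreed-upon next parent
    return theta

print(fitness(one_lambda_es()))  # climbs toward 0, the optimum
```

In the distributed setting the abstract describes, the candidate models would come from different workers and an actual consensus primitive (total order broadcast or proof-of-work leader election) would supply the common ordering; the single-process loop above only mimics that agreement.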
Related papers
- A Mirror Descent-Based Algorithm for Corruption-Tolerant Distributed Gradient Descent [57.64826450787237]
We show how to analyze the behavior of distributed gradient descent algorithms in the presence of adversarial corruptions.
We show how to use ideas from (lazy) mirror descent to design a corruption-tolerant distributed optimization algorithm.
Experiments based on linear regression, support vector classification, and softmax classification on the MNIST dataset corroborate our theoretical findings. (A generic mirror-descent update is sketched after this list.)
arXiv Detail & Related papers (2024-07-19T08:29:12Z)
- A Hard-to-Beat Baseline for Training-free CLIP-based Adaptation [121.0693322732454]
Contrastive Language-Image Pretraining (CLIP) has gained popularity for its remarkable zero-shot capacity.
Recent research has focused on developing efficient fine-tuning methods to enhance CLIP's performance in downstream tasks.
We revisit a classical algorithm, Gaussian Discriminant Analysis (GDA), and apply it to the downstream classification of CLIP. (A toy GDA-on-features sketch appears after this list.)
arXiv Detail & Related papers (2024-02-06T15:45:27Z)
- Learning a Diffusion Model Policy from Rewards via Q-Score Matching [93.0191910132874]
We present a theoretical framework linking the structure of diffusion model policies to a learned Q-function.
We propose a new policy update method from this theory, which we denote Q-score matching.
arXiv Detail & Related papers (2023-12-18T23:31:01Z)
- Lattice-Based Methods Surpass Sum-of-Squares in Clustering [98.46302040220395]
Clustering is a fundamental primitive in unsupervised learning.
Recent work has established lower bounds against the class of low-degree methods.
We show that, perhaps surprisingly, this particular clustering model does not exhibit a statistical-to-computational gap.
arXiv Detail & Related papers (2021-12-07T18:50:17Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated with neural networks seamlessly.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Self-Supervised Learning by Estimating Twin Class Distributions [26.7828253129684]
We present TWIST, a novel self-supervised representation learning method by classifying large-scale unlabeled datasets in an end-to-end way.
We employ a siamese network terminated by a softmax operation to produce twin class distributions of two augmented images.
Specifically, we minimize the entropy of the distribution for each sample to make the class prediction for each sample confident, and maximize the entropy of the mean distribution to make the predictions of different samples diverse. (These entropy terms are sketched after this list.)
arXiv Detail & Related papers (2021-10-14T14:39:39Z)
- Screening for Sparse Online Learning [11.523471275501855]
Sparsity-promoting regularizers are widely used to impose low-complexity structure (e.g. the l1-norm for sparsity) on the regression coefficients of supervised learning.
Most online algorithms do not have this property, owing to the vanishing step size and non-vanishing variance.
We show how to eliminate useless features of the iterates generated by online algorithms, and thereby enforce finite activity identification.
arXiv Detail & Related papers (2021-01-18T10:40:47Z)
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient distributed optimization, in which machines can
independently compute gradients and cooperate.
Our algorithm is based on a new concentration technique; we analyze its sample complexity.
It is very practical: it improves upon the performance of all prior methods when no
Byzantine machines are present. (A generic robust-aggregation sketch appears after this list.)
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
- ByGARS: Byzantine SGD with Arbitrary Number of Attackers [8.987350240289757]
We propose two novel computation algorithms for distributed machine learning in the presence of Byzantine adversaries.
In these algorithms, reputation scores of the workers are computed using an auxiliary dataset at the server.
We show that the proposed algorithms are robust to multiple different types of attacks at the same time. (A toy reputation-weighted aggregation sketch appears after this list.)
arXiv Detail & Related papers (2020-06-24T01:47:13Z)
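For the corruption-tolerant mirror-descent entry above, the distributed algorithm itself is not reproduced; as background only, the following is a plain exponentiated-gradient update, i.e., mirror descent with the negative-entropy mirror map over the probability simplex. The toy linear cost is an assumption for illustration.

```python
# Background sketch for the corruption-tolerant mirror-descent entry: a textbook
# exponentiated-gradient step (mirror descent with the negative-entropy mirror
# map over the probability simplex). Not that paper's distributed algorithm.
import numpy as np

def mirror_descent_step(w, grad, lr=0.1):
    # Multiplicative update followed by renormalization keeps w on the simplex.
    w_new = w * np.exp(-lr * grad)
    return w_new / w_new.sum()

# Toy usage: minimize the linear loss <c, w> over the simplex.
c = np.array([3.0, 1.0, 2.0])
w = np.ones(3) / 3
for _ in range(100):
    w = mirror_descent_step(w, c)  # gradient of <c, w> is just c
print(w)  # mass concentrates on the cheapest coordinate (index 1)
```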
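For the training-free CLIP adaptation entry above, here is a toy sketch of Gaussian Discriminant Analysis with a shared covariance fitted on frozen features; the random vectors stand in for real CLIP embeddings, and nothing here reproduces that paper's pipeline.

```python
# Toy GDA (shared covariance, i.e., a linear discriminant) on frozen features.
# Random vectors stand in for CLIP embeddings; illustrative only.
import numpy as np

def fit_gda(features, labels, n_classes, reg=1e-3):
    d = features.shape[1]
    means = np.stack([features[labels == c].mean(axis=0) for c in range(n_classes)])
    centered = features - means[labels]
    cov = centered.T @ centered / len(features) + reg * np.eye(d)  # shared covariance
    precision = np.linalg.inv(cov)
    log_priors = np.log(np.bincount(labels, minlength=n_classes) / len(labels))
    return means, precision, log_priors

def gda_predict(x, means, precision, log_priors):
    # Linear discriminant score: x^T P mu_c - 0.5 mu_c^T P mu_c + log prior_c.
    quad = 0.5 * np.einsum('cd,de,ce->c', means, precision, means)
    return np.argmax(x @ precision @ means.T - quad + log_priors, axis=-1)

# Toy usage with two well-separated synthetic classes.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16)) + np.repeat(3.0 * np.eye(16)[:2], 100, axis=0)
y = np.repeat([0, 1], 100)
params = fit_gda(X, y, n_classes=2)
print((gda_predict(X, *params) == y).mean())  # training accuracy on the toy data
```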
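For the TWIST entry above, a rough sketch of the two entropy terms it describes: per-sample entropy is pushed down so each prediction is confident, while the entropy of the batch-mean distribution is pushed up so predictions spread across classes. The siamese network and the alignment between the two augmented views are omitted here.

```python
# Sketch of the TWIST-style entropy terms on one view's softmax outputs.
# The real method also aligns the twin distributions of two augmented views
# through a siamese network; that part is omitted in this illustration.
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    return -np.sum(p * np.log(p + eps), axis=axis)

def twist_entropy_terms(probs):
    # probs: (batch, classes) softmax outputs.
    sharpen = entropy(probs).mean()          # minimized: confident per-sample predictions
    diversify = entropy(probs.mean(axis=0))  # maximized: diverse predictions across the batch
    return sharpen - diversify               # lower is better

# Toy usage with random softmax outputs.
rng = np.random.default_rng(0)
logits = rng.standard_normal((8, 5))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(twist_entropy_terms(probs))
```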
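For the Byzantine-resilient non-convex SGD entry above, a generic illustration of distributed SGD with a robust aggregation rule: the server takes a coordinate-wise median of the workers' gradients instead of a mean. This is a standard baseline defence used purely for illustration; it is not the concentration-based technique of that paper.

```python
# Distributed SGD with coordinate-wise-median aggregation as a generic
# Byzantine-robust baseline. Toy per-worker least-squares objectives; not the
# concentration-filtering algorithm of the paper referenced above.
import numpy as np

rng = np.random.default_rng(0)

def local_gradient(theta, data):
    # Gradient of the toy objective 0.5 * ||data @ theta - 1||^2.
    return data.T @ (data @ theta - 1.0)

def robust_sgd(worker_data, n_byzantine=2, lr=0.01, steps=300):
    dim = worker_data[0].shape[1]
    theta = np.zeros(dim)
    for _ in range(steps):
        grads = [local_gradient(theta, d) for d in worker_data]
        for b in range(n_byzantine):
            grads[b] = 100.0 * rng.standard_normal(dim)  # Byzantine workers send garbage
        theta -= lr * np.median(np.stack(grads), axis=0)  # robust aggregation
    return theta

workers = [rng.standard_normal((20, 4)) for _ in range(9)]
print(robust_sgd(workers))  # stays near the honest workers' solutions despite attackers
```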
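For the ByGARS entry above, a toy sketch of reputation-weighted gradient aggregation: each worker's score depends on how well its gradient agrees with a gradient computed on a small auxiliary dataset held by the server, and low-scoring workers are down-weighted. The cosine-similarity scoring rule below is a simple stand-in, not the paper's actual reputation update.

```python
# Toy reputation-weighted aggregation: the server scores each worker by the
# agreement (cosine similarity) of its gradient with an auxiliary-data gradient,
# then averages gradients with those scores. A stand-in, not ByGARS's update rule.
import numpy as np

def cosine(a, b, eps=1e-12):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def reputation_weighted_aggregate(worker_grads, server_grad, scores, momentum=0.9):
    # Reputation is an exponential moving average of agreement with the server gradient.
    for i, g in enumerate(worker_grads):
        scores[i] = momentum * scores[i] + (1 - momentum) * cosine(g, server_grad)
    weights = np.clip(scores, 0.0, None)          # distrust workers with negative scores
    weights = weights / (weights.sum() + 1e-12)
    return sum(w * g for w, g in zip(weights, worker_grads)), scores

# Toy usage: three honest workers plus one attacker sending the reversed gradient.
rng = np.random.default_rng(0)
true_grad = rng.standard_normal(4)               # gradient on the server's auxiliary data
worker_grads = [true_grad + 0.1 * rng.standard_normal(4) for _ in range(3)]
worker_grads.append(-true_grad)                  # Byzantine worker
scores = np.zeros(len(worker_grads))
agg, scores = reputation_weighted_aggregate(worker_grads, true_grad, scores)
print(scores)  # the attacker's score turns negative, so its weight is clipped to 0
```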