LEGATO: A LayerwisE Gradient AggregaTiOn Algorithm for Mitigating
Byzantine Attacks in Federated Learning
- URL: http://arxiv.org/abs/2107.12490v1
- Date: Mon, 26 Jul 2021 21:34:45 GMT
- Title: LEGATO: A LayerwisE Gradient AggregaTiOn Algorithm for Mitigating
Byzantine Attacks in Federated Learning
- Authors: Kamala Varma, Yi Zhou, Nathalie Baracaldo, Ali Anwar
- Abstract summary: Federated learning has arisen as a mechanism to allow multiple participants to collaboratively train a model without sharing their data.
We introduce LayerwisE Gradient AggregaTiOn (LEGATO), an aggregation algorithm that is, by contrast, scalable and generalizable.
We show that LEGATO is more computationally efficient than multiple state-of-the-art techniques and more generally robust across a variety of attack settings in practice.
- Score: 10.667821026727573
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning has arisen as a mechanism to allow multiple participants
to collaboratively train a model without sharing their data. In these settings,
participants (workers) may not trust each other fully; for instance, a set of
competitors may collaboratively train a machine learning model to detect fraud.
The workers provide local gradients that a central server uses to update a
global model. This global model can be corrupted when Byzantine workers send
malicious gradients, which necessitates robust methods for aggregating
gradients that mitigate the adverse effects of Byzantine inputs. Existing
robust aggregation algorithms are often computationally expensive and only
effective under strict assumptions. In this paper, we introduce LayerwisE
Gradient AggregaTiOn (LEGATO), an aggregation algorithm that is, by contrast,
scalable and generalizable. Informed by a study of layer-specific responses of
gradients to Byzantine attacks, LEGATO employs a dynamic gradient reweighing
scheme that is novel in its treatment of gradients based on layer-specific
robustness. We show that LEGATO is more computationally efficient than multiple
state-of-the-art techniques and more generally robust across a variety of
attack settings in practice. We also demonstrate LEGATO's benefits for gradient
descent convergence in the absence of an attack.
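
The abstract describes LEGATO's layerwise reweighing only at a high level. As a point of reference, here is a minimal NumPy sketch of a layerwise aggregation rule driven by a dispersion-based robustness signal; the function name, the use of per-layer gradient-norm variance as that signal, and the scaling choices are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

def layerwise_reweighted_mean(worker_grads, eps=1e-12):
    """Illustrative layerwise aggregation with a dispersion-based weight
    per layer (a sketch of the general idea, not LEGATO's exact rule).

    worker_grads: list over workers; each element is a list of per-layer
                  numpy arrays (one gradient tensor per layer).
    Returns (aggregated per-layer gradients, layer weights).
    """
    num_workers = len(worker_grads)
    num_layers = len(worker_grads[0])
    layer_means, raw_weights = [], []

    for l in range(num_layers):
        grads_l = np.stack([w[l] for w in worker_grads])   # (workers, ...)
        layer_means.append(grads_l.mean(axis=0))
        # Stand-in "layer robustness" signal: layers whose per-worker
        # gradient norms disagree strongly receive a low weight.
        norms = np.linalg.norm(grads_l.reshape(num_workers, -1), axis=1)
        raw_weights.append(1.0 / (np.var(norms) + eps))

    layer_weights = np.array(raw_weights)
    layer_weights /= layer_weights.max()                    # scale into (0, 1]

    # Damp the averaged update of the least robust layers; robust layers
    # pass through almost unchanged.
    aggregated = [w * g for w, g in zip(layer_weights, layer_means)]
    return aggregated, layer_weights

# Toy usage: 4 workers, a 2-layer model.
rng = np.random.default_rng(0)
worker_grads = [[rng.normal(size=(3, 3)), rng.normal(size=3)] for _ in range(4)]
agg, weights = layerwise_reweighted_mean(worker_grads)
```

The intended effect is that layers whose gradients scatter widely across workers, as they tend to under Byzantine input, contribute a damped update, while stable layers are aggregated by a plain mean.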
Related papers
- Language Models as Zero-shot Lossless Gradient Compressors: Towards
General Neural Parameter Prior Models [66.1595537904019]
Large language models (LLMs) can act as gradient priors in a zero-shot setting.
We introduce LM-GC, a novel method that integrates LLMs with arithmetic coding.
arXiv Detail & Related papers (2024-09-26T13:38:33Z)
- Client-side Gradient Inversion Against Federated Learning from Poisoning [59.74484221875662]
Federated Learning (FL) enables distributed participants to train a global model without sharing data directly with a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), which is a novel attack method that can be launched from clients.
arXiv Detail & Related papers (2023-09-14T03:48:27Z)
- Detection and Mitigation of Byzantine Attacks in Distributed Training [24.951227624475443]
Abnormal Byzantine behavior of worker nodes can derail training and compromise the quality of inference.
Recent work considers a wide range of attack models and has explored robust aggregation and/or computational redundancy to correct the distorted gradients.
In this work, we consider attack models ranging from strong ones ($q$ omniscient adversaries with full knowledge of the defense protocol, which can change from iteration to iteration) to weak ones ($q$ randomly chosen adversaries with limited collusion abilities).
arXiv Detail & Related papers (2022-08-17T05:49:52Z)
- Backdoor Attacks in Federated Learning by Rare Embeddings and Gradient
Ensembling [36.30908735595904]
This paper investigates the feasibility of model poisoning for backdoor attacks through rare word embeddings of NLP models in text classification and sequence-to-sequence tasks.
For a less complex dataset, a mere 0.1% of adversary clients is enough to poison the global model effectively.
We also propose a technique specialized in the federated learning scheme called gradient ensemble, which enhances the backdoor performance in all experimental settings.
arXiv Detail & Related papers (2022-04-29T11:17:05Z)
- Byzantine-robust Federated Learning through Collaborative Malicious
Gradient Filtering [32.904425716385575]
We show that the element-wise sign of the gradient vector can provide valuable insight for detecting model poisoning attacks.
We propose a novel approach called SignGuard to enable Byzantine-robust federated learning through collaborative malicious gradient filtering; a generic sketch of sign-based filtering appears after this list.
arXiv Detail & Related papers (2021-09-13T11:15:15Z)
- Aspis: A Robust Detection System for Distributed Learning [13.90938823562779]
Machine learning systems can be compromised when some of the computing devices exhibit abnormal (Byzantine) behavior.
Our proposed method Aspis assigns gradient computations to worker nodes using a subset-based assignment.
We prove the Byzantine resilience and detection guarantees of Aspis under weak and strong attacks and extensively evaluate the system on various large-scale training scenarios.
arXiv Detail & Related papers (2021-08-05T07:24:38Z)
- Style Curriculum Learning for Robust Medical Image Segmentation [62.02435329931057]
Deep segmentation models often degrade due to distribution shifts in image intensities between the training and test data sets.
We propose a novel framework to ensure robust segmentation in the presence of such distribution shifts.
arXiv Detail & Related papers (2021-08-01T08:56:24Z)
- Staircase Sign Method for Boosting Adversarial Attacks [123.19227129979943]
Crafting adversarial examples for the transfer-based attack is challenging and remains a research hot spot.
We propose a novel Staircase Sign Method (S$^2$M) to alleviate this issue, thus boosting transfer-based attacks.
Our method can be generally integrated into any transfer-based attacks, and the computational overhead is negligible.
arXiv Detail & Related papers (2021-04-20T02:31:55Z)
- Simeon -- Secure Federated Machine Learning Through Iterative Filtering [74.99517537968161]
Federated learning enables a global machine learning model to be trained collaboratively by distributed, mutually non-trusting learning agents.
A global model is distributed to clients, who perform training and submit their newly trained models to be aggregated into a superior model.
A class of Byzantine-tolerant aggregation algorithms has emerged, offering varying degrees of robustness against these attacks.
This paper presents Simeon: a novel approach to aggregation that applies a reputation-based iterative filtering technique.
arXiv Detail & Related papers (2021-03-13T12:17:47Z)
- Accumulated Decoupled Learning: Mitigating Gradient Staleness in
Inter-Layer Model Parallelization [16.02377434191239]
We propose accumulated decoupled learning (ADL), which incorporates the gradient accumulation technique to mitigate the stale gradient effect.
We prove that the proposed method can converge to critical points, i.e., the gradients converge to 0, in spite of its asynchronous nature.
ADL is shown to outperform several state-of-the-art methods on classification tasks and is the fastest among the compared methods.
arXiv Detail & Related papers (2020-12-03T11:52:55Z)
- Boosting Gradient for White-Box Adversarial Attacks [60.422511092730026]
We propose a universal adversarial example generation method, called ADV-ReLU, to enhance the performance of gradient based white-box attack algorithms.
Our approach calculates the gradient of the loss function with respect to the network input, maps the values to scores, and selects a part of them to update the misleading gradients.
arXiv Detail & Related papers (2020-10-21T02:13:26Z)
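
Several entries above, in particular the SignGuard summary, describe filtering malicious client updates using simple per-client statistics such as the element-wise sign of the gradient. The following is a minimal sketch of that general idea only; the function name, the positive-sign-ratio statistic, and the tolerance are assumptions and do not reproduce SignGuard's actual filtering pipeline.

```python
import numpy as np

def filter_by_sign_statistics(client_grads, tol=0.3):
    """Generic sign-statistic filtering sketch (not SignGuard's algorithm).

    client_grads: list of 1-D numpy arrays (flattened client gradients).
    tol:          assumed tolerance on deviation from the median sign ratio.
    Returns (aggregated gradient, indices of kept clients).
    """
    # Per-client statistic: fraction of positive entries in the gradient.
    pos_ratios = np.array([(g > 0).mean() for g in client_grads])
    median_ratio = np.median(pos_ratios)

    # Clients whose sign statistic deviates strongly from the median are
    # treated as suspect and excluded from the aggregate.
    keep = [i for i, r in enumerate(pos_ratios) if abs(r - median_ratio) <= tol]

    aggregated = np.mean([client_grads[i] for i in keep], axis=0)
    return aggregated, keep
```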