Identifying Backdoor Attacks in Federated Learning via Anomaly Detection
- URL: http://arxiv.org/abs/2202.04311v2
- Date: Wed, 23 Aug 2023 16:17:40 GMT
- Title: Identifying Backdoor Attacks in Federated Learning via Anomaly Detection
- Authors: Yuxi Mi, Yiheng Sun, Jihong Guan, Shuigeng Zhou
- Abstract summary: Federated learning is vulnerable to backdoor attacks.
This paper proposes an effective defense against the attack by examining shared model updates.
We demonstrate through extensive analyses that our proposed methods effectively mitigate state-of-the-art backdoor attacks.
- Score: 31.197488921578984
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning has seen increased adoption in recent years in response to
the growing regulatory demand for data privacy. However, the opaque local
training process of federated learning also sparks rising concerns about model
faithfulness. For instance, studies have revealed that federated learning is
vulnerable to backdoor attacks, whereby a compromised participant can
stealthily modify the model's behavior in the presence of backdoor triggers.
This paper proposes an effective defense against the attack by examining shared
model updates. We begin with the observation that the embedding of backdoors
influences the participants' local model weights in terms of the magnitude and
orientation of their model gradients, which can manifest as distinguishable
disparities. We enable a robust identification of backdoors by studying the
statistical distribution of the models' subsets of gradients. Concretely, we
first segment the model gradients into fragment vectors that represent small
portions of model parameters. We then employ anomaly detection to locate the
distributionally skewed fragments and prune the participants with the most
outliers. We embody the findings in a novel defense method, ARIBA. We
demonstrate through extensive analyses that our proposed methods effectively
mitigate state-of-the-art backdoor attacks with minimal impact on task utility.
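To make the described pipeline concrete, the sketch below illustrates the general idea under stated assumptions: each client's update is flattened and cut into fixed-size fragment vectors, an off-the-shelf anomaly detector (scikit-learn's IsolationForest, standing in here for whatever detector the paper actually uses) flags distributionally skewed fragments, and the clients owning the most flagged fragments are pruned before aggregation. The function names, fragment size, and contamination rate are illustrative choices, not the paper's exact settings.
```python
# Minimal sketch of fragment-level anomaly detection over client updates.
# IsolationForest stands in for the paper's detector; names and defaults are assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest

def fragment(update, frag_size=64):
    """Flatten one client's gradient update and split it into fixed-size fragment vectors."""
    flat = np.concatenate([np.asarray(p).ravel() for p in update])
    usable = (len(flat) // frag_size) * frag_size   # drop the ragged tail
    return flat[:usable].reshape(-1, frag_size)

def count_outlier_fragments(client_updates, frag_size=64, contamination=0.1):
    """Flag distributionally skewed fragments and count how many each client owns."""
    frags, owners = [], []
    for cid, update in client_updates.items():
        f = fragment(update, frag_size)
        frags.append(f)
        owners.extend([cid] * len(f))
    labels = IsolationForest(contamination=contamination,
                             random_state=0).fit_predict(np.vstack(frags))
    counts = {cid: 0 for cid in client_updates}
    for cid, label in zip(owners, labels):
        if label == -1:                             # IsolationForest marks outliers with -1
            counts[cid] += 1
    return counts

def prune_suspects(client_updates, n_prune=1, **detector_kwargs):
    """Exclude the n_prune clients with the most outlier fragments before aggregation."""
    counts = count_outlier_fragments(client_updates, **detector_kwargs)
    suspects = set(sorted(counts, key=counts.get, reverse=True)[:n_prune])
    return {cid: u for cid, u in client_updates.items() if cid not in suspects}
```
Aggregation (e.g., FedAvg) would then proceed over the surviving updates; in practice the fragment size and the number of pruned clients are tuning knobs set per round.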
Related papers
- Backdoor Defense through Self-Supervised and Generative Learning [0.0]
Training on poisoned data injects a backdoor that causes malicious inference on selected test samples.
This paper explores an approach based on generative modelling of per-class distributions in a self-supervised representation space.
In both cases, we find that per-class generative models enable detection of poisoned data and cleansing of the dataset.
arXiv Detail & Related papers (2024-09-02T11:40:01Z)
- Mitigating Backdoor Attacks using Activation-Guided Model Editing [8.00994004466919]
Backdoor attacks compromise the integrity and reliability of machine learning models.
We propose a novel backdoor mitigation approach via machine unlearning to counter such backdoor attacks.
arXiv Detail & Related papers (2024-07-10T13:43:47Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
However, backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z)
- Model Pairing Using Embedding Translation for Backdoor Attack Detection on Open-Set Classification Tasks [63.269788236474234]
We propose to use model pairs on open-set classification tasks for detecting backdoors.
We show that this score can indicate the presence of a backdoor even when the paired models have different architectures.
This technique allows for the detection of backdoors in models designed for open-set classification tasks, a setting that has received little attention in the literature.
arXiv Detail & Related papers (2024-02-28T21:29:16Z)
- Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples [67.66153875643964]
Backdoor attacks are serious security threats to machine learning models.
In this paper, we explore the task of purifying a backdoored model using a small clean dataset.
By establishing the connection between backdoor risk and adversarial risk, we derive a novel upper bound for backdoor risk.
arXiv Detail & Related papers (2023-07-20T03:56:04Z)
- STDLens: Model Hijacking-Resilient Federated Learning for Object Detection [13.895922908738507]
Federated Learning (FL) has been gaining popularity as a collaborative learning framework to train deep learning-based object detection models over a distributed population of clients.
Despite its advantages, FL is vulnerable to model hijacking.
This paper introduces STDLens, a principled approach to safeguarding FL against such attacks.
arXiv Detail & Related papers (2023-03-21T00:15:53Z)
- Mitigating Backdoors in Federated Learning with FLD [7.908496863030483]
Federated learning allows clients to collaboratively train a global model without uploading raw data for privacy preservation.
However, this feature has recently been shown to make federated learning vulnerable to backdoor attacks.
We propose Federated Layer Detection (FLD), a novel model filtering approach for effectively defending against backdoor attacks.
arXiv Detail & Related papers (2023-03-01T07:54:54Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
We introduce the sampling attack, a novel membership inference technique that, unlike standard membership adversaries, works under the severe restriction of having no access to the victim model's scores.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
arXiv Detail & Related papers (2020-09-01T12:54:54Z)
- Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering-based approach whose computational complexity does not scale with the number of labels, and which is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, an improvement over the current state-of-the-art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.