Related papers: Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection

Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection

URL: http://arxiv.org/abs/2503.07978v2
Date: Tue, 18 Mar 2025 22:09:36 GMT
Title: Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection
Authors: Jiahao Xu, Zikai Zhang, Rui Hu,
Abstract summary: Federated Learning (FL) systems are vulnerable to malicious model updates.<n>We introduce AlignIns, a novel defense method designed to safeguard FL systems against backdoor attacks.<n>We show that AlignIns achieves higher robustness compared to the state-of-the-art defense methods.
Score: 7.200910949076064
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The distributed nature of training makes Federated Learning (FL) vulnerable to backdoor attacks, where malicious model updates aim to compromise the global model's performance on specific tasks. Existing defense methods show limited efficacy as they overlook the inconsistency between benign and malicious model updates regarding both general and fine-grained directions. To fill this gap, we introduce AlignIns, a novel defense method designed to safeguard FL systems against backdoor attacks. AlignIns looks into the direction of each model update through a direction alignment inspection process. Specifically, it examines the alignment of model updates with the overall update direction and analyzes the distribution of the signs of their significant parameters, comparing them with the principle sign across all model updates. Model updates that exhibit an unusual degree of alignment are considered malicious and thus be filtered out. We provide the theoretical analysis of the robustness of AlignIns and its propagation error in FL. Our empirical results on both independent and identically distributed (IID) and non-IID datasets demonstrate that AlignIns achieves higher robustness compared to the state-of-the-art defense methods. The code is available at https://github.com/JiiahaoXU/AlignIns.

Related papers

TrojanDam: Detection-Free Backdoor Defense in Federated Learning through Proactive Model Robustification utilizing OOD Data [9.078056680217465]
decentralization allows adversaries to craft carefully designed backdoor updates to misclassify a global model. Existing defense mechanisms mainly rely on post-training detection after receiving updates. We propose a backdoor defense paradigm, which focuses on proactive robustification on the global model.
arXiv Detail & Related papers (2025-04-22T07:56:51Z)
Identify Backdoored Model in Federated Learning via Individual Unlearning [7.200910949076064]
Backdoor attacks present a significant threat to the robustness of Federated Learning (FL) We propose MASA, a method that utilizes individual unlearning on local models to identify malicious models in FL. To the best of our knowledge, this is the first work to leverage machine unlearning to identify malicious models in FL.
arXiv Detail & Related papers (2024-11-01T21:19:47Z)
Mitigating Malicious Attacks in Federated Learning via Confidence-aware Defense [3.685395311534351]
Federated Learning (FL) is a distributed machine learning diagram that enables multiple clients to collaboratively train a global model without sharing their private local data. FL systems are vulnerable to attacks that are happening in malicious clients through data poisoning and model poisoning. Existing defense methods typically focus on mitigating specific types of poisoning and are often ineffective against unseen types of attack.
arXiv Detail & Related papers (2024-08-05T20:27:45Z)
Concealing Backdoor Model Updates in Federated Learning by Trigger-Optimized Data Poisoning [20.69655306650485]
Federated Learning (FL) is a decentralized machine learning method that enables participants to collaboratively train a model without sharing their private data. Despite its privacy and scalability benefits, FL is susceptible to backdoor attacks. We propose DPOT, a backdoor attack strategy in FL that dynamically constructs backdoor objectives by optimizing a backdoor trigger.
arXiv Detail & Related papers (2024-05-10T02:44:25Z)
FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks. FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain. We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
Data-Agnostic Model Poisoning against Federated Learning: A Graph Autoencoder Approach [65.2993866461477]
This paper proposes a data-agnostic, model poisoning attack on Federated Learning (FL) The attack requires no knowledge of FL training data and achieves both effectiveness and undetectability. Experiments show that the FL accuracy drops gradually under the proposed attack and existing defense mechanisms fail to detect it.
arXiv Detail & Related papers (2023-11-30T12:19:10Z)
Learn from the Past: A Proxy Guided Adversarial Defense Framework with Self Distillation Regularization [53.04697800214848]
Adversarial Training (AT) is pivotal in fortifying the robustness of deep learning models. AT methods, relying on direct iterative updates for target model's defense, frequently encounter obstacles such as unstable training and catastrophic overfitting. We present a general proxy guided defense framework, LAST' (bf Learn from the Pbf ast)
arXiv Detail & Related papers (2023-10-19T13:13:41Z)
Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources. FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks. We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously. MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z)
Mitigating Backdoors in Federated Learning with FLD [7.908496863030483]
Federated learning allows clients to collaboratively train a global model without uploading raw data for privacy preservation. This feature has recently been found responsible for federated learning's vulnerability in the face of backdoor attacks. We propose Federated Layer Detection (FLD), a novel model filtering approach for effectively defending against backdoor attacks.
arXiv Detail & Related papers (2023-03-01T07:54:54Z)
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients) We propose a trigger reverse engineering based defense and show that our method can achieve improvement with guarantee robustness. Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z)
Federated Zero-Shot Learning for Visual Recognition [55.65879596326147]
We propose a novel Federated Zero-Shot Learning FedZSL framework. FedZSL learns a central model from the decentralized data residing on edge devices. The effectiveness and robustness of FedZSL are demonstrated by extensive experiments conducted on three zero-shot benchmark datasets.
arXiv Detail & Related papers (2022-09-05T14:49:34Z)
Backdoor Defense in Federated Learning Using Differential Testing and Outlier Detection [24.562359531692504]
We propose DifFense, an automated defense framework to protect an FL system from backdoor attacks. Our detection method reduces the average backdoor accuracy of the global model to below 4% and achieves a false negative rate of zero.
arXiv Detail & Related papers (2022-02-21T17:13:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.