Related papers: DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning

DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning

URL: http://arxiv.org/abs/2411.12220v1
Date: Tue, 19 Nov 2024 04:12:14 GMT
Title: DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning
Authors: Kichang Lee, Yujin Shin, Jonghyuk Yun, Jun Han, JeongGil Ko,
Abstract summary: Federated Learning (FL) enables collaborative model training across distributed devices while preserving local data privacy, making it ideal for mobile and embedded systems. However, the decentralized nature of FL also opens vulnerabilities to model poisoning attacks, particularly backdoor attacks. We propose DeTrigger, a scalable and efficient backdoor-robust federated learning framework.
Score: 4.932796168357307
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Federated Learning (FL) enables collaborative model training across distributed devices while preserving local data privacy, making it ideal for mobile and embedded systems. However, the decentralized nature of FL also opens vulnerabilities to model poisoning attacks, particularly backdoor attacks, where adversaries implant trigger patterns to manipulate model predictions. In this paper, we propose DeTrigger, a scalable and efficient backdoor-robust federated learning framework that leverages insights from adversarial attack methodologies. By employing gradient analysis with temperature scaling, DeTrigger detects and isolates backdoor triggers, allowing for precise model weight pruning of backdoor activations without sacrificing benign model knowledge. Extensive evaluations across four widely used datasets demonstrate that DeTrigger achieves up to 251x faster detection than traditional methods and mitigates backdoor attacks by up to 98.9%, with minimal impact on global model accuracy. Our findings establish DeTrigger as a robust and scalable solution to protect federated learning environments against sophisticated backdoor threats.

Related papers

Sealing The Backdoor: Unlearning Adversarial Text Triggers In Diffusion Models Using Knowledge Distillation [3.54387829918311]
Adversaries can inject imperceptible textual triggers into training data, causing models to generate manipulated outputs.<n>We propose Self-Knowledge Distillation with Cross-Attention Guidance (SKD-CAG) to erase associations between adversarial text triggers and poisoned outputs.<n>Our method achieves removal accuracy 100% for pixel backdoors and 93% for style-based attacks, without sacrificing robustness or image fidelity.
arXiv Detail & Related papers (2025-08-20T00:57:21Z)
Lie Detector: Unified Backdoor Detection via Cross-Examination Framework [68.45399098884364]
We propose a unified backdoor detection framework in the semi-honest setting. Our method achieves superior detection performance, improving accuracy by 5.4%, 1.6%, and 11.9% over SoTA baselines. Notably, it is the first to effectively detect backdoors in multimodal large language models.
arXiv Detail & Related papers (2025-03-21T06:12:06Z)
Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning. This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities. In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
BadGD: A unified data-centric framework to identify gradient descent vulnerabilities [10.996626204702189]
BadGD sets a new standard for understanding and mitigating adversarial manipulations. This research underscores the severe threats posed by such data-centric attacks and highlights the urgent need for robust defenses in machine learning.
arXiv Detail & Related papers (2024-05-24T23:39:45Z)
Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features. backdoor attacks subtly embed malicious behaviors within the model during training. We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z)
FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks. FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain. We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
Data-Agnostic Model Poisoning against Federated Learning: A Graph Autoencoder Approach [65.2993866461477]
This paper proposes a data-agnostic, model poisoning attack on Federated Learning (FL) The attack requires no knowledge of FL training data and achieves both effectiveness and undetectability. Experiments show that the FL accuracy drops gradually under the proposed attack and existing defense mechanisms fail to detect it.
arXiv Detail & Related papers (2023-11-30T12:19:10Z)
IMBERT: Making BERT Immune to Insertion-based Backdoor Attacks [45.81957796169348]
Backdoor attacks are an insidious security threat against machine learning models. We introduce IMBERT, which uses either gradients or self-attention scores derived from victim models to self-defend against backdoor attacks. Our empirical studies demonstrate that IMBERT can effectively identify up to 98.5% of inserted triggers.
arXiv Detail & Related papers (2023-05-25T22:08:57Z)
Mitigating Backdoors in Federated Learning with FLD [7.908496863030483]
Federated learning allows clients to collaboratively train a global model without uploading raw data for privacy preservation. This feature has recently been found responsible for federated learning's vulnerability in the face of backdoor attacks. We propose Federated Layer Detection (FLD), a novel model filtering approach for effectively defending against backdoor attacks.
arXiv Detail & Related papers (2023-03-01T07:54:54Z)
Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure. We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients) We propose a trigger reverse engineering based defense and show that our method can achieve improvement with guarantee robustness. Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z)
Backdoor Defense in Federated Learning Using Differential Testing and Outlier Detection [24.562359531692504]
We propose DifFense, an automated defense framework to protect an FL system from backdoor attacks. Our detection method reduces the average backdoor accuracy of the global model to below 4% and achieves a false negative rate of zero.
arXiv Detail & Related papers (2022-02-21T17:13:03Z)
Identifying Backdoor Attacks in Federated Learning via Anomaly Detection [31.197488921578984]
Federated learning is vulnerable to backdoor attacks. This paper proposes an effective defense against the attack by examining shared model updates. We demonstrate through extensive analyses that our proposed methods effectively mitigate state-of-the-art backdoor attacks.
arXiv Detail & Related papers (2022-02-09T07:07:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.