SABRE-FL: Selective and Accurate Backdoor Rejection for Federated Prompt Learning
- URL: http://arxiv.org/abs/2506.22506v1
- Date: Wed, 25 Jun 2025 23:15:20 GMT
- Title: SABRE-FL: Selective and Accurate Backdoor Rejection for Federated Prompt Learning
- Authors: Momin Ahmad Khan, Yasra Chandio, Fatima Muhammad Anwar
- Abstract summary: We present the first study of backdoor attacks in Federated Prompt Learning. We show that when malicious clients inject visually imperceptible, learnable noise triggers into input images, the global prompt learner becomes vulnerable to targeted misclassification. Motivated by this vulnerability, we propose SABRE-FL, a lightweight, modular defense that filters poisoned prompt updates using an embedding-space anomaly detector trained offline on out-of-distribution data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated Prompt Learning has emerged as a communication-efficient and privacy-preserving paradigm for adapting large vision-language models like CLIP across decentralized clients. However, the security implications of this setup remain underexplored. In this work, we present the first study of backdoor attacks in Federated Prompt Learning. We show that when malicious clients inject visually imperceptible, learnable noise triggers into input images, the global prompt learner becomes vulnerable to targeted misclassification while still maintaining high accuracy on clean inputs. Motivated by this vulnerability, we propose SABRE-FL, a lightweight, modular defense that filters poisoned prompt updates using an embedding-space anomaly detector trained offline on out-of-distribution data. SABRE-FL requires no access to raw client data or labels and generalizes across diverse datasets. We show, both theoretically and empirically, that malicious clients can be reliably identified and filtered using an embedding-based detector. Across five diverse datasets and four baseline defenses, SABRE-FL outperforms all baselines by significantly reducing backdoor accuracy while preserving clean accuracy, demonstrating strong empirical performance and underscoring the need for robust prompt learning in future federated systems.
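A minimal sketch of the rejection step the abstract describes: each client's prompt-learner update is scored by an embedding-space anomaly detector trained offline, and flagged updates are excluded before FedAvg-style aggregation. The `detector` callable, the threshold, and the fallback behavior are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of SABRE-FL-style filtering (assumed interface, not the paper's code).
import numpy as np

def filter_and_aggregate(prompt_updates, detector, threshold=0.5):
    """Score each client's flattened prompt update with an offline-trained
    embedding-space anomaly detector; average only the updates kept."""
    kept = [u for u in prompt_updates if detector(u) < threshold]  # low score -> benign
    if not kept:  # if everything is flagged, fall back to all updates
        kept = prompt_updates
    return np.mean(np.stack(kept), axis=0)  # FedAvg over the surviving updates
```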
Related papers
- Hear No Evil: Detecting Gradient Leakage by Malicious Servers in Federated Learning
Gradient updates in federated learning can unintentionally reveal sensitive information about a client's local data. This paper provides the first comprehensive analysis of malicious gradient leakage attacks and the model manipulation techniques that enable them. We propose a simple, lightweight, and broadly applicable client-side detection mechanism that flags suspicious model updates before local training begins.
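The summary mentions a client-side mechanism that flags suspicious model updates before local training; a generic sketch of that idea follows, using relative parameter drift as the suspicion signal. The heuristic and threshold are assumptions, not the paper's actual detector.

```python
import torch

def looks_suspicious(received_model, previous_model, max_rel_drift=5.0):
    """Flag a received global model whose parameters moved abnormally far
    since the last round, before spending any local training on it."""
    drift_sq, scale_sq = 0.0, 1e-12
    for p_new, p_old in zip(received_model.parameters(), previous_model.parameters()):
        drift_sq += (p_new.detach() - p_old.detach()).norm().item() ** 2
        scale_sq += p_old.detach().norm().item() ** 2
    return (drift_sq / scale_sq) ** 0.5 > max_rel_drift  # large relative drift -> suspicious
```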
arXiv Detail & Related papers (2025-06-25T17:49:26Z)
- Robust Anti-Backdoor Instruction Tuning in LVLMs
We introduce a lightweight, certification-agnostic defense framework for large vision-language models (LVLMs). Our framework finetunes only adapter modules and text embedding layers under instruction tuning. Experiments against seven attacks on Flickr30k and MSCOCO demonstrate that our framework reduces the attack success rate to nearly zero.
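A short PyTorch sketch of the selective finetuning the summary describes: freeze the backbone and leave only adapter modules and the text embedding layer trainable. The module-name substrings below are hypothetical and depend on the actual LVLM architecture.

```python
import torch

def mark_trainable(model: torch.nn.Module):
    """Keep only adapter modules and text embeddings trainable; freeze the rest."""
    for name, param in model.named_parameters():
        # "adapter" / "embed_tokens" are assumed name patterns for illustration
        param.requires_grad = ("adapter" in name) or ("embed_tokens" in name)

# The optimizer then only updates the unfrozen subset:
# optim = torch.optim.AdamW((p for p in model.parameters() if p.requires_grad), lr=1e-4)
```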
arXiv Detail & Related papers (2025-06-04T01:23:35Z) - Defending the Edge: Representative-Attention for Mitigating Backdoor Attacks in Federated Learning [7.808916974942399]
Heterogeneous edge devices produce diverse, non-independent and identically distributed (non-IID) data. We propose a novel representative-attention-based defense mechanism, named FeRA, to distinguish benign from malicious clients. Our evaluation demonstrates FeRA's robustness across various FL scenarios, including the challenging non-IID data distributions typical of edge devices.
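The summary names a representative-attention mechanism without spelling it out; below is a generic attention-style reweighting in that spirit, where updates that align poorly with a representative (here, the cohort mean) are downweighted. This is an illustrative stand-in, not FeRA itself.

```python
import numpy as np

def attention_weighted_aggregate(updates, temperature=1.0):
    """Softmax-weight client updates by cosine similarity to the cohort mean,
    so outlying (potentially malicious) updates contribute little."""
    U = np.stack(updates)                                  # (clients, dim)
    rep = U.mean(axis=0)                                   # representative update
    sims = U @ rep / (np.linalg.norm(U, axis=1) * np.linalg.norm(rep) + 1e-12)
    w = np.exp(sims / temperature)
    w /= w.sum()
    return w @ U                                           # weighted aggregate
```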
arXiv Detail & Related papers (2025-05-15T13:44:32Z)
- Lie Detector: Unified Backdoor Detection via Cross-Examination Framework
We propose a unified backdoor detection framework in the semi-honest setting. Our method achieves superior detection performance, improving accuracy by 5.4%, 1.6%, and 11.9% over SoTA baselines. Notably, it is the first to effectively detect backdoors in multimodal large language models.
arXiv Detail & Related papers (2025-03-21T06:12:06Z)
- URVFL: Undetectable Data Reconstruction Attack on Vertical Federated Learning
Existing malicious attacks alter the underlying VFL training task and are easily detected by comparing the received gradients with those produced in honest training. We develop URVFL, a novel attack strategy that evades current detection mechanisms. Our comprehensive experiments demonstrate that URVFL significantly outperforms existing attacks and successfully circumvents SOTA detection methods for malicious attacks.
arXiv Detail & Related papers (2024-04-30T14:19:06Z)
- Client-side Gradient Inversion Against Federated Learning from Poisoning
Federated Learning (FL) enables distributed participants to train a global model without sharing data directly to a central server.
Recent studies have revealed that FL is vulnerable to gradient inversion attack (GIA), which aims to reconstruct the original training samples.
We propose Client-side poisoning Gradient Inversion (CGI), a novel attack method that can be launched from clients.
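For context, a textbook gradient-matching loop of the kind GIA builds on is sketched below: optimize a dummy input until its gradients match the observed ones. This is the generic attack primitive, not CGI's client-side poisoning variant.

```python
import torch
import torch.nn.functional as F

def invert_gradients(model, target_grads, input_shape, label, steps=200, lr=0.1):
    """Reconstruct an input whose gradients match those shared by a victim."""
    dummy = torch.randn(1, *input_shape, requires_grad=True)
    opt = torch.optim.Adam([dummy], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(dummy), label)
        # differentiate through the gradient computation itself
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        mismatch = sum(((g - t) ** 2).sum() for g, t in zip(grads, target_grads))
        mismatch.backward()
        opt.step()
    return dummy.detach()  # reconstructed sample
```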
arXiv Detail & Related papers (2023-09-14T03:48:27Z)
- Attribute Inference Attack of Speech Emotion Recognition in Federated Learning Settings
Federated learning (FL) is a distributed machine learning paradigm that coordinates clients to train a model collaboratively without sharing local data.
We propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters.
We show that the attribute inference attack is achievable for speech emotion recognition (SER) systems trained using FL.
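The generic recipe behind such an attack can be as simple as fitting a classifier from flattened gradient vectors to the sensitive attribute; the logistic-regression attacker below is an illustrative assumption, not the paper's framework.

```python
from sklearn.linear_model import LogisticRegression

def fit_attribute_inferrer(gradient_vectors, attribute_labels):
    """gradient_vectors: (n_updates, dim) flattened client gradients;
    attribute_labels: the sensitive attribute per update (e.g., speaker gender)."""
    attacker = LogisticRegression(max_iter=1000)
    attacker.fit(gradient_vectors, attribute_labels)
    return attacker  # attacker.predict(new_grads) then infers the attribute
```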
arXiv Detail & Related papers (2021-12-26T16:50:42Z)
- RoFL: Attestable Robustness for Secure Federated Learning
Federated Learning allows a large number of clients to train a joint model without the need to share their private data.
To ensure the confidentiality of the client updates, Federated Learning systems employ secure aggregation.
We present RoFL, a secure Federated Learning system that improves robustness against malicious clients.
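Robustness here rests on constraining what clients can submit even though updates stay confidential; the plaintext sketch below shows only a norm-bound rule of the kind such a system enforces. The cryptographic machinery that makes this checkable under secure aggregation is omitted, and the bound is an assumed value.

```python
import numpy as np

def aggregate_norm_bounded(updates, l2_bound=1.0):
    """Accept only client updates whose L2 norm respects the agreed bound,
    then average the survivors."""
    kept = [u for u in updates if np.linalg.norm(u) <= l2_bound]
    return np.mean(np.stack(kept), axis=0) if kept else None
```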
arXiv Detail & Related papers (2021-07-07T15:42:49Z)
- CRFL: Certifiably Robust Federated Learning against Backdoor Attacks
This paper provides the first general framework, Certifiably Robust Federated Learning (CRFL), to train certifiably robust FL models against backdoors.
Our method exploits clipping and smoothing on model parameters to control the global model smoothness, which yields a sample-wise robustness certification on backdoors with limited magnitude.
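A minimal sketch of the clip-then-smooth step the summary describes: project each aggregated parameter tensor into a norm ball, then add Gaussian noise. The clip bound and noise scale are illustrative assumptions, not the paper's certified values.

```python
import torch

def clip_and_smooth(global_params, clip_norm=1.0, sigma=0.01):
    """Clip each parameter tensor to an L2 ball, then add Gaussian noise."""
    smoothed = []
    for p in global_params:
        p = p * min(1.0, clip_norm / (p.norm().item() + 1e-12))  # norm clipping
        smoothed.append(p + sigma * torch.randn_like(p))         # smoothing noise
    return smoothed
```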
arXiv Detail & Related papers (2021-06-15T16:50:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.